Research talk:Anonymous editor acquisition/Signup CTA experiment/Work log/2014-06-18

From Meta, a Wikimedia project coordination wiki

Wednesday, June 18th[edit]

Today I get to work on the funnel. Woo. I think I'll start by looking at this figure:

The event flow for the two experimental conditions and the control is presented.
Event flow. The event flow for the two experimental conditions and the control is presented.

To help me think about this, I'm going to talk through the diagram. All flows start with clicking edit. In the pre-edit case, we expect an impression and potentially a button click. Next we might see either an account creation impression/creation or a revision saved. Now this is where the post-edit case does it's interesting stuff. We expect that an impression will happen shortly after the saved edit and potentially a button click. This button click might lead to an account creation impression/creation and that's the end.

OK. So I think that my plan is to identify a set of "flows" that start with edit link clicks and show the proportion of those flows that travel the various paths through the diagram above. One potential problem with assuming that every edit lick click is a flow start is that an edit link click is an acceptable action when presented with the CTA. I'm going to have to set some threshold for when I'll consider an edit link click to be the start of a flow. --Halfak (WMF) (talk) 15:01, 18 June 2014 (UTC)[reply]


CREATE TEMPORARY TABLE staging.token_edit_clicks
SELECT
    wiki,
    event_token AS token,
    timestamp
FROM log.SignupExpPageLinkClick_8101692
WHERE
    event_link LIKE "edit%" AND
    timestamp BETWEEN "20140519180800" AND "20140526180800";
ALTER TABLE staging.token_edit_clicks MODIFY wiki VARCHAR(50);
CREATE INDEX wiki_token ON staging.token_edit_clicks (wiki, token);

CREATE TEMPORARY TABLE staging.token_edit_clicks_copy
SELECT * FROM staging.token_edit_clicks;

CREATE TEMPORARY TABLE staging.token_flow_start_pre
SELECT
    tec.wiki,
    tec.token,
    tec.timestamp
FROM staging.token_edit_clicks_copy tec
LEFT JOIN staging.token_edit_clicks recent_clicks ON
    tec.wiki = recent_clicks.wiki AND
    tec.token = recent_clicks.token AND
    recent_clicks.timestamp BETWEEN 
        DATE_FORMAT(DATE_SUB(tec.timestamp, INTERVAL 5 MINUTE), "%Y%m%d%H%i%S") AND
        DATE_FORMAT(DATE_SUB(tec.timestamp, INTERVAL 1 SECOND), "%Y%m%d%H%i%S")
WHERE recent_clicks.token IS NULL;
CREATE INDEX wiki_token_pre ON staging.token_flow_start_pre (wiki, token);


CREATE TEMPORARY TABLE staging.token_flow_start_pre_copy
SELECT * FROM staging.token_flow_start_pre;

CREATE TABLE staging.token_flow_start
SELECT
    tfs.wiki,
    tfs.token,
    tfs.timestamp,
    MIN(next_flow.timestamp) AS next_flow_start
FROM staging.token_flow_start_pre_copy tfs
LEFT JOIN staging.token_flow_start_pre next_flow ON 
    tfs.wiki = next_flow.wiki AND
    tfs.token = next_flow.token AND
    next_flow.timestamp > tfs.timestamp
GROUP BY 1,2,3;
CREATE INDEX wiki_token ON staging.token_flow_start (wiki, token);
> select count(*) from staging.token_flow_start;
+----------+
| count(*) |
+----------+
|   791866 |
+----------+
1 row in set (0.00 sec)

There. Now I have a table with 791,866 clicks on edit that started a flow. Now, all I need to do is get a count of unique events for each token flow and I can start plotting. --Halfak (WMF) (talk) 15:56, 18 June 2014 (UTC)[reply]


Now the query to gather all of the first timestamps of each event type within a flow.

CREATE TEMPORARY TABLE staging.experimental_wiki_token_events
    SELECT
      wiki,
      event_token AS token,
      timestamp,
      "revision" AS event
    FROM log.TrackedPageContentSaveComplete_8535426
    WHERE wiki IN ("enwiki", "dewiki", "itwiki", "frwiki")
    AND event_token IS NOT NULL
  UNION ALL
    SELECT
      wiki,
      event_token AS token,
      timestamp,
      "revision" AS event
    FROM log.TrackedPageContentSaveComplete_7872558
    WHERE wiki IN  ("enwiki", "dewiki", "itwiki", "frwiki")
    AND event_token IS NOT NULL
  UNION ALL
    SELECT
      wiki,
      event_token AS token,
      timestamp,
      "creation complete" AS event
    FROM log.SignupExpAccountCreationComplete_8539421
    WHERE wiki IN ("enwiki", "dewiki", "itwiki", "frwiki")
    AND event_token IS NOT NULL
  UNION ALL
    SELECT
      wiki,
      event_token AS token,
      timestamp,
      "creation impression" AS event
    FROM log.SignupExpAccountCreationImpression_8539445
    WHERE wiki IN ("enwiki", "dewiki", "itwiki", "frwiki")
    AND event_token IS NOT NULL
  UNION ALL
    SELECT
      wiki,
      event_token AS token,
      timestamp,
      CONCAT("button click ", event_button) AS event
    FROM log.SignupExpCTAButtonClick_8102619
    WHERE wiki IN ("enwiki", "dewiki", "itwiki", "frwiki")
    AND event_token IS NOT NULL
  UNION ALL
    SELECT
      wiki,
      event_token AS token,
      timestamp,
      CONCAT("CTA impression ", event_cta) AS event
    FROM log.SignupExpCTAImpression_8101716
    WHERE wiki IN ("enwiki", "dewiki", "itwiki", "frwiki")
    AND event_token IS NOT NULL
  UNION ALL
    SELECT
      wiki,
      event_token AS token,
      timestamp,
      CONCAT("link click ", event_link) AS event
    FROM log.SignupExpPageLinkClick_8101692
    WHERE wiki IN ("enwiki", "dewiki", "itwiki", "frwiki")
    AND event_token IS NOT NULL;
ALTER TABLE staging.experimental_wiki_token_events MODIFY wiki VARCHAR(50);
CREATE INDEX wiki_token_time ON staging.experimental_wiki_token_events (wiki, token, timestamp);

SELECT
    tfs.wiki,
    tfs.token,
    tfs.timestamp AS flow_start,
    MAX(ewte.timestamp) AS last_event,
    MIN(IF(ewte.event = "link click edit section", ewte.timestamp, NULL)) AS first_edit_section_click,
    SUM(ewte.event = "link click edit section") AS edit_section_clicks,
    MIN(IF(ewte.event = "link click edit page", ewte.timestamp, NULL)) AS first_edit_page_click,
    SUM(ewte.event = "link click edit page") AS edit_page_clicks,
    MIN(IF(ewte.event = "link click create account", ewte.timestamp, NULL)) AS first_create_account_click,
    SUM(ewte.event = "link click create account") AS create_account_clicks,
    MIN(IF(ewte.event = "CTA impression pre-edit", ewte.timestamp, NULL)) AS first_pre_edit_cta_impression,
    SUM(ewte.event = "CTA impression pre-edit") AS pre_edit_cta_impressions,
    MIN(IF(ewte.event = "CTA impression post-edit", ewte.timestamp, NULL)) AS first_post_edit_cta_impression,
    SUM(ewte.event = "CTA impression post-edit") AS post_edit_cta_impressions,
    MIN(IF(ewte.event = "button click edit", ewte.timestamp, NULL)) AS first_edit_button_click,
    SUM(ewte.event = "button click edit") AS edit_button_clicks,
    MIN(IF(ewte.event = "button click signup", ewte.timestamp, NULL)) AS first_signup_button_click,
    SUM(ewte.event = "button click signup") AS signup_button_clicks,
    MIN(IF(ewte.event = "button click dismiss", ewte.timestamp, NULL)) AS first_dismiss_button_click,
    SUM(ewte.event = "button click dismiss") AS dismiss_button_clicks,
    MIN(IF(ewte.event = "creation impression", ewte.timestamp, NULL)) AS first_account_creation_impression,
    SUM(ewte.event = "creation impression") AS account_creation_impressions,
    MIN(IF(ewte.event = "creation complete", ewte.timestamp, NULL)) AS first_account_creation_complete,
    SUM(ewte.event = "creation complete") AS account_creation_completes,
    MIN(IF(ewte.event = "revision", ewte.timestamp, NULL)) AS first_revision_saved,
    SUM(ewte.event = "revision") AS revisions_saved
FROM staging.token_flow_start tfs
LEFT JOIN staging.experimental_wiki_token_events ewte ON 
    tfs.wiki = ewte.wiki AND
    tfs.token = ewte.token AND
    ewte.timestamp BETWEEN 
        tfs.timestamp AND 
        IFNULL(next_flow_start, DATE_FORMAT(DATE_ADD(ewte.timestamp, INTERVAL 10 MINUTE), "%Y%m%d%H%i%S"))
GROUP BY 1,2,3;

--Halfak (WMF) (talk) 18:52, 18 June 2014 (UTC)[reply]


Got it. I'll be picking this up tomorrow:

     wiki                                     token          flow_start       
 dewiki:107935   cDhV7L1uEBDZkN8lzisckNgGopYwm6KI:   194   Min.   :2.014e+13  
 enwiki:539982   cnkoRoKZHR1xutv11c4jkE1ozNMy9QwU:   155   1st Qu.:2.014e+13  
 frwiki: 85553   b9hJUP4de7uGJvDJxveDgx456ruaSolo:   142   Median :2.014e+13  
 itwiki: 58392   m0sxhkCF1X4MXdH4aS2kKYyAj8SFG3qs:   129   Mean   :2.014e+13  
 wiki  :     4   8iXmZJIUV0BdGMlPro4b6h4e8qLgLGQ5:   127   3rd Qu.:2.014e+13  
                 SqkExHP8DGdGfC6xq8EvEC29liI54ttv:   108   Max.   :2.014e+13  
                 (Other)                         :791011                      
   last_event        first_edit_section_click edit_section_clicks
 Min.   :2.014e+13   Min.   :2.014e+13        Min.   :   0.0000  
 1st Qu.:2.014e+13   1st Qu.:2.014e+13        1st Qu.:   0.0000  
 Median :2.014e+13   Median :2.014e+13        Median :   1.0000  
 Mean   :2.014e+13   Mean   :2.014e+13        Mean   :   0.8334  
 3rd Qu.:2.014e+13   3rd Qu.:2.014e+13        3rd Qu.:   1.0000  
 Max.   :2.014e+13   Max.   :2.014e+13        Max.   :2061.0000  
 NA's   :4           NA's   :263461           NA's   :4          
 first_edit_page_click edit_page_clicks   first_create_account_click
 Min.   :2.014e+13     Min.   :  0.0000   Min.   :2.014e+13         
 1st Qu.:2.014e+13     1st Qu.:  0.0000   1st Qu.:2.014e+13         
 Median :2.014e+13     Median :  0.0000   Median :2.014e+13         
 Mean   :2.014e+13     Mean   :  0.5213   Mean   :2.014e+13         
 3rd Qu.:2.014e+13     3rd Qu.:  1.0000   3rd Qu.:2.014e+13         
 Max.   :2.014e+13     Max.   :351.0000   Max.   :2.014e+13         
 NA's   :495800        NA's   :4          NA's   :784576            
 create_account_clicks first_pre_edit_cta_impression pre_edit_cta_impressions
 Min.   : 0.00000      Min.   :2.014e+13             Min.   :  0.0000        
 1st Qu.: 0.00000      1st Qu.:2.014e+13             1st Qu.:  0.0000        
 Median : 0.00000      Median :2.014e+13             Median :  0.0000        
 Mean   : 0.01129      Mean   :2.014e+13             Mean   :  0.3871        
 3rd Qu.: 0.00000      3rd Qu.:2.014e+13             3rd Qu.:  1.0000        
 Max.   :70.00000      Max.   :2.014e+13             Max.   :112.0000        
 NA's   :4             NA's   :539853                NA's   :4               
 first_post_edit_cta_impression post_edit_cta_impressions
 Min.   :2.014e+13              Min.   :  0.00000        
 1st Qu.:2.014e+13              1st Qu.:  0.00000        
 Median :2.014e+13              Median :  0.00000        
 Mean   :2.014e+13              Mean   :  0.08129        
 3rd Qu.:2.014e+13              3rd Qu.:  0.00000        
 Max.   :2.014e+13              Max.   :123.00000        
 NA's   :751294                 NA's   :4                
 first_edit_button_click edit_button_clicks first_signup_button_click
 Min.   :2.014e+13       Min.   : 0.00000   Min.   :2.014e+13        
 1st Qu.:2.014e+13       1st Qu.: 0.00000   1st Qu.:2.014e+13        
 Median :2.014e+13       Median : 0.00000   Median :2.014e+13        
 Mean   :2.014e+13       Mean   : 0.07747   Mean   :2.014e+13        
 3rd Qu.:2.014e+13       3rd Qu.: 0.00000   3rd Qu.:2.014e+13        
 Max.   :2.014e+13       Max.   :75.00000   Max.   :2.014e+13        
 NA's   :746664          NA's   :4          NA's   :770233           
 signup_button_clicks first_dismiss_button_click dismiss_button_clicks
 Min.   : 0.00000     Min.   :2.014e+13          Min.   :  0.0000     
 1st Qu.: 0.00000     1st Qu.:2.014e+13          1st Qu.:  0.0000     
 Median : 0.00000     Median :2.014e+13          Median :  0.0000     
 Mean   : 0.02885     Mean   :2.014e+13          Mean   :  0.2064     
 3rd Qu.: 0.00000     3rd Qu.:2.014e+13          3rd Qu.:  0.0000     
 Max.   :14.00000     Max.   :2.014e+13          Max.   :728.0000     
 NA's   :4            NA's   :644257             NA's   :4            
 first_account_creation_impression account_creation_impressions
 Min.   :2.014e+13                 Min.   :  0.0000            
 1st Qu.:2.014e+13                 1st Qu.:  0.0000            
 Median :2.014e+13                 Median :  0.0000            
 Mean   :2.014e+13                 Mean   :  0.0763            
 3rd Qu.:2.014e+13                 3rd Qu.:  0.0000            
 Max.   :2.014e+13                 Max.   :345.0000            
 NA's   :758648                    NA's   :4                   
 first_account_creation_complete account_creation_completes
 Min.   :2.014e+13               Min.   : 0.00000          
 1st Qu.:2.014e+13               1st Qu.: 0.00000          
 Median :2.014e+13               Median : 0.00000          
 Mean   :2.014e+13               Mean   : 0.01617          
 3rd Qu.:2.014e+13               3rd Qu.: 0.00000          
 Max.   :2.014e+13               Max.   :73.00000          
 NA's   :779901                  NA's   :4                 
 first_revision_saved revisions_saved   
 Min.   :2.014e+13    Min.   :   0.000  
 1st Qu.:2.014e+13    1st Qu.:   0.000  
 Median :2.014e+13    Median :   0.000  
 Mean   :2.014e+13    Mean   :   0.956  
 3rd Qu.:2.014e+13    3rd Qu.:   0.000  
 Max.   :2.014e+13    Max.   :4122.000  
 NA's   :641494       NA's   :4