Jump to content

Research talk:Onboarding new Wikipedians/Rollout/Work log/2014-03-26

Add topic
From Meta, a Wikimedia project coordination wiki

Wednesday, March 26th

[edit]

So, I made a mistake in one of my queries last time. This will address the anomaly that I was chasing yesterday. The problem was that, when I looked for gettingstarted tagged edits, I forgot to specify which tag, so I was matching all tagged edits. Here's the updated query:

SELECT
    DATABASE() as wiki,
    user_id,
    SUM(rev_id IS NOT NULL) AS day_revisions,
    SUM(rev_id IS NOT NULL AND page_namespace = 0) AS day_main_revisions,
    SUM(ct_rev_id IS NOT NULL) AS day_gs_revisions
FROM user
LEFT JOIN revision ON
    rev_user = user_id AND
    rev_timestamp BETWEEN 
        user_registration AND 
        DATE_FORMAT(DATE_ADD(user_registration, INTERVAL 1 DAY), "%Y%m%d%H%i%S")
LEFT JOIN page ON
    rev_page = page_id
LEFT JOIN change_tag ON
    ct_rev_id = rev_id AND
    ct_tag = "gettingstarted edit" /* Here is the new line */
WHERE
    user_registration BETWEEN "20140211181300" AND "20140313181300"
GROUP BY 1,2;

I just kicked off the query cross the GS wiki DBs and will update the funnel figure next. --Halfak (WMF) (talk) 17:06, 26 March 2014 (UTC)Reply


> SELECT
    ->     cta_type,
    ->     COUNT(*) AS users,
    ->     SUM(day_gs_revisions > 0) AS gs_editor,
    ->     SUM(day_gs_revisions > 0)/COUNT(*) AS proportion
    -> FROM month_gs_user
    -> INNER JOIN gs_user_day_edits USING (wiki, user_id)
    -> WHERE day_revisions > 0
    -> GROUP BY 1;
+---------------------------+-------+-----------+------------+
| cta_type                  | users | gs_editor | proportion |
+---------------------------+-------+-----------+------------+
| NULL                      | 22514 |       171 |     0.0076 |
| edit current              |  6062 |      2922 |     0.4820 |
| edit current or suggested | 19236 |     11641 |     0.6052 |
| suggested                 | 18338 |      1585 |     0.0864 |
+---------------------------+-------+-----------+------------+
4 rows in set (5.47 sec)

That makes way more sense! --Halfak (WMF) (talk) 19:51, 26 March 2014 (UTC)Reply