Research talk:Anonymous editor acquisition/Signup CTA experiment/Work log/2014-06-05

From Meta, a Wikimedia project coordination wiki

Thursday, June 5th

I've taken a couple of days off to catch up on some academic work. I'm far more competent when talking about the history of Theory in Computer-Supported Cooperative Work, and I finished building and testing a hierarchical difference strategy for another project[1].

Anyway, while I was away, I generated some stats for the experimental users. This is the standard set of stats I generate to measure new user performance.

The table:

mysql:research@s1-analytics-slave.eqiad.wmnet [staging]> explain experimental_user_stats;
+------------------------------+---------------+------+-----+---------+-------+
| Field                        | Type          | Null | Key | Default | Extra |
+------------------------------+---------------+------+-----+---------+-------+
| wiki                         | varchar(50)   | NO   | PRI |         |       |
| bucket                       | varchar(15)   | YES  |     | NULL    |       |
| first_event                  | varbinary(14) | YES  |     | NULL    |       |
| user_id                      | int(11)       | NO   | PRI | 0       |       |
| user_registration            | varbinary(14) | YES  |     | NULL    |       |
| day_revisions                | int(11)       | YES  |     | NULL    |       |
| day_main_revisions           | int(11)       | YES  |     | NULL    |       |
| day_reverted_main_revisions  | int(11)       | YES  |     | NULL    |       |
| day_productive_edits         | int(11)       | YES  |     | NULL    |       |
| week_revisions               | int(11)       | YES  |     | NULL    |       |
| week_main_revisions          | int(11)       | YES  |     | NULL    |       |
| week_reverted_main_revisions | int(11)       | YES  |     | NULL    |       |
| week_sessions                | int(11)       | YES  |     | NULL    |       |
| week_session_seconds         | int(11)       | YES  |     | NULL    |       |
| week_productive_edits        | int(11)       | YES  |     | NULL    |       |
+------------------------------+---------------+------+-----+---------+-------+
15 rows in set (0.00 sec)

Some example data.

mysql:research@s1-analytics-slave.eqiad.wmnet [staging]> select wiki, bucket, user_id, day_main_revisions from experimental_user_stats limit 3;
+--------+-----------+---------+--------------------+
| wiki   | bucket    | user_id | day_main_revisions |
+--------+-----------+---------+--------------------+
| dewiki | control   | 1870785 |                  0 |
| dewiki | post-edit | 1872620 |                  7 |
| dewiki | post-edit | 1872288 |                  2 |
+--------+-----------+---------+--------------------+
3 rows in set (0.00 sec)

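Now, let's look at activation rates. The query counts, per wiki and bucket, the users with at least one and at least five day_main_revisions.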

mysql:research@s1-analytics-slave.eqiad.wmnet [staging]> select wiki, bucket, sum(day_main_revisions > 0) AS one_plus_editors, sum(day_main_revisions >= 5) as five_plus_editors, count(*) as total_users from experimental_user_stats group by 1,2;
+--------+-----------+------------------+-------------------+---------------+
| wiki   | bucket    | one_plus_editors | five_plus_editors | total_users   |
+--------+-----------+------------------+-------------------+---------------+
| dewiki | control   |              166 |                11 |           415 |
| dewiki | post-edit |              183 |                27 |           413 |
| dewiki | pre-edit  |              296 |                31 |           660 |
| enwiki | control   |             1840 |               283 |          6171 |
| enwiki | post-edit |             1808 |               319 |          5922 |
| enwiki | pre-edit  |             3637 |               451 |          8768 |
| frwiki | control   |              201 |                33 |           619 |
| frwiki | post-edit |              235 |                32 |           702 |
| frwiki | pre-edit  |              492 |                43 |          1037 |
| itwiki | control   |               89 |                15 |           232 |
| itwiki | post-edit |              102 |                15 |           285 |
| itwiki | pre-edit  |              266 |                40 |           465 |
+--------+-----------+------------------+-------------------+---------------+
12 rows in set (0.08 sec)
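
The query returns raw counts, so the rates themselves still have to be derived by dividing by total_users. A minimal sketch of that arithmetic in Python, with the counts transcribed by hand from the output above; the names COUNTS and activation_rates are hypothetical and not part of the original analysis:

```python
# Counts transcribed from the query output above:
# (wiki, bucket) -> (one_plus_editors, five_plus_editors, total_users)
COUNTS = {
    ("dewiki", "control"): (166, 11, 415),
    ("dewiki", "post-edit"): (183, 27, 413),
    ("dewiki", "pre-edit"): (296, 31, 660),
    ("enwiki", "control"): (1840, 283, 6171),
    ("enwiki", "post-edit"): (1808, 319, 5922),
    ("enwiki", "pre-edit"): (3637, 451, 8768),
    ("frwiki", "control"): (201, 33, 619),
    ("frwiki", "post-edit"): (235, 32, 702),
    ("frwiki", "pre-edit"): (492, 43, 1037),
    ("itwiki", "control"): (89, 15, 232),
    ("itwiki", "post-edit"): (102, 15, 285),
    ("itwiki", "pre-edit"): (266, 40, 465),
}


def activation_rates(counts):
    """Return {(wiki, bucket): (one_plus_rate, five_plus_rate)} as fractions."""
    return {
        key: (one_plus / total, five_plus / total)
        for key, (one_plus, five_plus, total) in counts.items()
    }


rates = activation_rates(COUNTS)
for (wiki, bucket), (one_rate, five_rate) in sorted(rates.items()):
    # Print each rate as a percentage with one decimal place.
    print(f"{wiki:7} {bucket:10} 1+: {one_rate:6.1%}  5+: {five_rate:5.1%}")
```

Keeping the rates as plain fractions makes it easy to compare buckets within a wiki before moving on to any significance test.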

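Now for checking significance. Each line below is an R prop.test call comparing control (first element) against the experimental bucket (second element) for the 1+ or 5+ editor counts, followed by the resulting chi-squared statistic and p-value.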

dewiki
  • Pre-edit vs. control
    • 1+ prop.test(c(166, 296), c(415, 660)) (x^2=2.25, p=0.134)
    • 5+ prop.test(c(11, 31), c(415, 660)) (x^2=2.32, p=0.128)
  • Post-edit vs. control
    • 1+ prop.test(c(166, 183), c(415, 413)) (x^2=1.41, p=0.240)
    • 5+ prop.test(c(11, 27), c(415, 413)) (x^2=6.28, p=0.012)
enwiki
  • Pre-edit vs. control
    • 1+ prop.test(c(1840, 3637), c(6171, 8768)) (x^2=211.68, p<0.001)
    • 5+ prop.test(c(283, 451), c(6171, 8768)) (x^2=2.29, p=0.130)
  • Post-edit vs. control
    • 1+ prop.test(c(1840, 1808), c(6171, 5922)) (x^2=0.70, p=0.404)
    • 5+ prop.test(c(283, 319), c(6171, 5922)) (x^2=3.93, p=0.047)
frwiki
  • Pre-edit vs. control
    • 1+ prop.test(c(201, 492), c(619, 1037)) (x^2=35.10, p<0.001)
    • 5+ prop.test(c(33, 43), c(619, 1037)) (x^2=0.99, p=0.32)
  • Post-edit vs. control
    • 1+ prop.test(c(201, 235), c(619, 702)) (x^2=0.11, p=0.742)
    • 5+ prop.test(c(33, 32), c(619, 702)) (x^2=0.27, p=0.603)
itwiki
  • Pre-edit vs. control
    • 1+ prop.test(c(89, 266), c(232, 465)) (x^2=21.24, p<0.001)
    • 5+ prop.test(c(15, 40), c(232, 465)) (x^2=0.70, p=0.402)
  • Post-edit vs. control
    • 1+ prop.test(c(89, 102), c(232, 285)) (x^2=0.26, p=0.609)
    • 5+ prop.test(c(15, 15), c(232, 285)) (x^2=0.15, p=0.695)
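
For readers without R at hand, the statistic reported in the list above can be approximated in a few lines of Python. This is a sketch of a two-sample test of equal proportions with Yates' continuity correction, which is what R's prop.test applies by default for two groups; the function name prop_test and its signature are my own and only loosely mirror the R call:

```python
import math


def prop_test(successes, totals, correct=True):
    """Two-sample test of equal proportions (chi-squared, 1 degree of freedom).

    successes and totals are pairs such as (166, 296) and (415, 660).
    Returns (chi_squared, p_value).
    """
    x1, x2 = successes
    n1, n2 = totals
    pooled = (x1 + x2) / (n1 + n2)  # proportion under the null hypothesis
    delta = abs(x1 / n1 - x2 / n2)
    # Yates' correction, capped so it never exceeds the observed deviation.
    yates = min(0.5, delta / (1 / n1 + 1 / n2)) if correct else 0.0

    chi_squared = 0.0
    for x, n in zip(successes, totals):
        # Each group contributes a "success" cell and a "failure" cell.
        cells = ((x, n * pooled), (n - x, n * (1 - pooled)))
        for observed, expected in cells:
            chi_squared += (abs(observed - expected) - yates) ** 2 / expected

    # Upper tail of a chi-squared distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi_squared / 2))
    return chi_squared, p_value


# dewiki, post-edit vs. control, 5+ editors
chi_squared, p_value = prop_test((11, 27), (415, 413))
print(f"x^2={chi_squared:.2f}, p={p_value:.3f}")
```

Run on the dewiki post-edit 5+ counts, this should land close to the x^2=6.28, p=0.012 reported above. Small differences in the last digit are possible, since this is a re-implementation and not the R code itself.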

--Halfak (WMF) (talk) 14:22, 5 June 2014 (UTC)


How about overall?

mysql:halfak@db1047.eqiad.wmnet [staging]> select bucket, sum(day_main_revisions > 0) AS one_plus_editors, sum(day_main_revisions >= 5) as five_plus_editors, count(*) as total_users from experimental_user_stats group by 1;
+-----------+------------------+-------------------+-------------+
| bucket    | one_plus_editors | five_plus_editors | total_users |
+-----------+------------------+-------------------+-------------+
| control   |             2296 |               342 |        7437 |
| pre-edit  |             4691 |               565 |       10930 |
| post-edit |             2328 |               393 |        7322 |
+-----------+------------------+-------------------+-------------+
3 rows in set (0.08 sec)
  • Pre-edit vs. control
    • 1+ prop.test(c(2296, 4691), c(7437, 10930)) (x^2=271.95, p<0.001)
    • 5+ prop.test(c(342, 565), c(7437, 10930)) (x^2=2.95, p=0.086)
  • Post-edit vs. control
    • 1+ prop.test(c(2296, 2328), c(7437, 7322)) (x^2=1.42, p=0.234)
    • 5+ prop.test(c(342, 393), c(7437, 7322)) (x^2=4.45, p=0.035)
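
The overall figures are the per-wiki counts pooled by bucket. A small Python sketch of that pooling, useful as a sanity check that the two query outputs agree; the counts are transcribed by hand and the names PER_WIKI, OVERALL and pool_by_bucket are hypothetical:

```python
from collections import defaultdict

# (wiki, bucket) -> (one_plus_editors, five_plus_editors, total_users)
PER_WIKI = {
    ("dewiki", "control"): (166, 11, 415),
    ("dewiki", "post-edit"): (183, 27, 413),
    ("dewiki", "pre-edit"): (296, 31, 660),
    ("enwiki", "control"): (1840, 283, 6171),
    ("enwiki", "post-edit"): (1808, 319, 5922),
    ("enwiki", "pre-edit"): (3637, 451, 8768),
    ("frwiki", "control"): (201, 33, 619),
    ("frwiki", "post-edit"): (235, 32, 702),
    ("frwiki", "pre-edit"): (492, 43, 1037),
    ("itwiki", "control"): (89, 15, 232),
    ("itwiki", "post-edit"): (102, 15, 285),
    ("itwiki", "pre-edit"): (266, 40, 465),
}

# bucket -> (one_plus_editors, five_plus_editors, total_users), from the overall query
OVERALL = {
    "control": (2296, 342, 7437),
    "pre-edit": (4691, 565, 10930),
    "post-edit": (2328, 393, 7322),
}


def pool_by_bucket(per_wiki):
    """Sum the three count columns across wikis for each bucket."""
    pooled = defaultdict(lambda: [0, 0, 0])
    for (_wiki, bucket), counts in per_wiki.items():
        for index, value in enumerate(counts):
            pooled[bucket][index] += value
    return {bucket: tuple(values) for bucket, values in pooled.items()}


pooled = pool_by_bucket(PER_WIKI)
print("pooled counts match overall query:", pooled == OVERALL)
for bucket, (one_plus, five_plus, total) in pooled.items():
    print(f"{bucket:10} 1+: {one_plus / total:.1%}  5+: {five_plus / total:.1%}")
```

With the counts as transcribed here, the pooled sums agree with the overall table, so the per-wiki and overall tests describe the same set of users.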

--Halfak (WMF) (talk) 14:38, 5 June 2014 (UTC)