Research:Anonymous editor acquisition/Volume and impact

From Meta, a Wikimedia project coordination wiki
This page in a nutshell: There are about 100,000 anonymous (IP) editors a month on English Wikipedia, and they make roughly one third of the more than three million edits per month. The average IP makes just one or two edits a month, within a 15-minute period.

This page documents quantitative research on the volume and impact of anonymous editing on English Wikipedia.

Unique IPs editing

The number of revisions saved per month is plotted by the type of agent.
The monthly edit session hours are plotted by the type of agent.
The number of monthly edit sessions is plotted by the type of agent.
The mean number of hours per session is plotted by the type of agent.
The mean number of revisions per session is plotted by the type of agent.

Main namespace activity

Unique editors. The number of unique editors (by user_text) is plotted for two agent types (anon = IP editors, user = registered account) over time.
data table
    year month  anons  users
1   2002     1    846    179
2   2002     2   1297    261
3   2002     3   1301    245
4   2002     4   1225    216
5   2002     5   1194    205
6   2002     6   1159    256
7   2002     7   1346    287
8   2002     8   1974    382
9   2002     9   2639    478
10  2002    10   2744    553
11  2002    11   2906    537
12  2002    12   3115    584
13  2003     1   3783    892
14  2003     2   2981    894
15  2003     3   3476    859
16  2003     4   3762    836
17  2003     5   4678   1131
18  2003     6   4998   1289
19  2003     7   6184   1662
20  2003     8   7171   1792
21  2003     9   7895   1832
22  2003    10   8408   2127
23  2003    11   8629   2268
24  2003    12   9679   2441
25  2004     1  10812   2855
26  2004     2  14622   3819
27  2004     3  19947   5000
28  2004     4  19458   5185
29  2004     5  20741   5680
30  2004     6  22244   5955
31  2004     7  27033   7057
32  2004     8  27312   7709
33  2004     9  34514   8808
34  2004    10  41571  10125
35  2004    11  51141  11618
36  2004    12  55204  12564
37  2005     1  59764  13143
38  2005     2  55688  13315
39  2005     3  68073  16204
40  2005     4  84867  18918
41  2005     5  95172  20666
42  2005     6  99485  22048
43  2005     7 123420  27278
44  2005     8 139681  30826
45  2005     9 147847  30055
46  2005    10 195068  35029
47  2005    11 209846  37040
48  2005    12 246033  53429
49  2006     1 269237  60558
50  2006     2 254134  67589
51  2006     3 294420  80962
52  2006     4 309672  83824
53  2006     5 353698  93007
54  2006     6 325163  95554
55  2006     7 330718 102532
56  2006     8 364406 116167
57  2006     9 392551 117084
58  2006    10 453260 125561
59  2006    11 484370 136733
60  2006    12 467665 139049
61  2007     1 521937 152841
62  2007     2 522122 156115
63  2007     3 587593 166786
64  2007     4 580644 163302
65  2007     5 583441 157434
66  2007     6 490714 137363
67  2007     7 465612 139333
68  2007     8 466858 138500
69  2007     9 515266 140561
70  2007    10 567971 149586
71  2007    11 529857 142469
72  2007    12 488696 130853
73  2008     1 546358 143924
74  2008     2 539645 143657
75  2008     3 577579 150839
76  2008     4 564249 148672
77  2008     5 560972 144963
78  2008     6 489359 129946
79  2008     7 465218 128503
80  2008     8 468386 128224
81  2008     9 521777 130752
82  2008    10 567766 138341
83  2008    11 517581 132717
84  2008    12 479973 129029
85  2009     1 541440 138752
86  2009     2 529409 134380
87  2009     3 559096 140920
88  2009     4 501052 133901
89  2009     5 499108 136814
90  2009     6 448920 130297
91  2009     7 424177 127757
92  2009     8 423714 127569
93  2009     9 450131 127735
94  2009    10 489478 133193
95  2009    11 474614 132618
96  2009    12 445566 126985
97  2010     1 493342 134999
98  2010     2 474142 130772
99  2010     3 504201 136700
100 2010     4 482889 131225
101 2010     5 482409 131343
102 2010     6 424694 119112
103 2010     7 410266 116362
104 2010     8 416148 117332
105 2010     9 432894 114825
106 2010    10 459869 118246
107 2010    11 443758 116147
108 2010    12 421323 111943
109 2011     1 473608 123933
110 2011     2 400635 124776
111 2011     3 395382 134702
112 2011     4 375495 127779
113 2011     5 381674 126780
114 2011     6 358992 121164
115 2011     7 349445 118365
116 2011     8 360907 120690
117 2011     9 354622 119771
118 2011    10 384521 120891
119 2011    11 379829 117540
120 2011    12 358109 112960
121 2012     1 378354 119845
122 2012     2 366673 116249
123 2012     3 373006 114207
124 2012     4 363345 112349
125 2012     5 365659 114365
126 2012     6 327139 107653
127 2012     7 325462 110068
128 2012     8 325390 110329
129 2012     9 322210 105873
130 2012    10 363654 111257
131 2012    11 354102 110854
132 2012    12 336997 106331
133 2013     1 365335 113771
134 2013     2 334921 107978
Revisions. The sum total revisions are plotted for two agent types (anon = IP editors, user = registered account) over time.
data table
    year month    anon    user
1   2002     1    6582    7033
2   2002     2   34744   11842
3   2002     3    6252   13406
4   2002     4    4987   11410
5   2002     5    6523   10541
6   2002     6    7753   17557
7   2002     7    8908   19840
8   2002     8   13418   37382
9   2002     9   14882   57341
10  2002    10   16122   79889
11  2002    11   14767   37298
12  2002    12   16069   70989
13  2003     1   19052   55645
14  2003     2   13375   51102
15  2003     3   16450   55211
16  2003     4   17405   56673
17  2003     5   18212   73821
18  2003     6   18809   77259
19  2003     7   24081   85021
20  2003     8   30914   95437
21  2003     9   28823   86056
22  2003    10   29654   94137
23  2003    11   31083  130847
24  2003    12   33782  146604
25  2004     1   34719  140528
26  2004     2   51281  190625
27  2004     3   72768  273204
28  2004     4   67572  251235
29  2004     5   71903  261346
30  2004     6  116225  339558
31  2004     7   91111  397102
32  2004     8   92695  415525
33  2004     9  119360  406570
34  2004    10  134469  447909
35  2004    11  174049  575132
36  2004    12  184944  536989
37  2005     1  181220  436042
38  2005     2  175463  427592
39  2005     3  221243  563762
40  2005     4  301930  707551
41  2005     5  333840  730857
42  2005     6  366635  801560
43  2005     7  447675  952863
44  2005     8  486059 1010211
45  2005     9  477582  950145
46  2005    10  637511 1089737
47  2005    11  686073 1161247
48  2005    12  814495 1619254
49  2006     1  888323 1847550
50  2006     2  818141 1795400
51  2006     3  928449 2144413
52  2006     4  962075 2007965
53  2006     5 1131488 2294020
54  2006     6 1039264 2315509
55  2006     7 1048810 2457158
56  2006     8 1136734 2718024
57  2006     9 1177671 2478246
58  2006    10 1356396 2646149
59  2006    11 1426375 2701018
60  2006    12 1337316 2774374
61  2007     1 1506578 3016875
62  2007     2 1487806 2936861
63  2007     3 1658946 3158764
64  2007     4 1615933 3112278
65  2007     5 1625036 3080062
66  2007     6 1349696 2809410
67  2007     7 1318353 2876191
68  2007     8 1290206 2819663
69  2007     9 1381824 2786521
70  2007    10 1517163 2965697
71  2007    11 1378865 2793668
72  2007    12 1252918 2819557
73  2008     1 1425178 2885444
74  2008     2 1406256 2813322
75  2008     3 1500228 3286876
76  2008     4 1471049 2888746
77  2008     5 1452490 2908546
78  2008     6 1276511 2813279
79  2008     7 1248956 2859283
80  2008     8 1263954 2907032
81  2008     9 1358346 3050065
82  2008    10 1471438 3105049
83  2008    11 1311301 2792878
84  2008    12 1218611 2798503
85  2009     1 1375384 2948529
86  2009     2 1343122 2819580
87  2009     3 1430186 2956990
88  2009     4 1290662 2736693
89  2009     5 1296284 2986443
90  2009     6 1193472 2738490
91  2009     7 1136902 2665222
92  2009     8 1134699 2754510
93  2009     9 1174423 2802971
94  2009    10 1263653 2705464
95  2009    11 1212100 2692243
96  2009    12 1122390 3255307
97  2010     1 1261053 2681675
98  2010     2 1209966 2664554
99  2010     3 1296406 2818772
100 2010     4 1262881 2761881
101 2010     5 1237142 2833181
102 2010     6 1069870 2868806
103 2010     7 1076090 2566090
104 2010     8 1090406 2764606
105 2010     9 1107899 2839909
106 2010    10 1172684 2947512
107 2010    11 1104083 2790598
108 2010    12 1055231 2623904
109 2011     1 1200216 2859028
110 2011     2 1004973 2505141
111 2011     3  999807 2573958
112 2011     4  948664 2610759
113 2011     5  956222 2545010
114 2011     6  930435 2624105
115 2011     7  914425 2694038
116 2011     8  937978 2523124
117 2011     9  899383 2553156
118 2011    10  961145 2433413
119 2011    11  950371 2377974
120 2011    12  894670 2448356
121 2012     1  960778 2701747
122 2012     2  932722 2492479
123 2012     3  955241 2567452
124 2012     4  927534 2447083
125 2012     5  937883 2691577
126 2012     6  848042 2401049
127 2012     7  871331 2533698
128 2012     8  889048 2578225
129 2012     9  848759 2313936
130 2012    10  927398 2548884
131 2012    11  876837 2581585
132 2012    12  836142 2535035
133 2013     1  915532 2681223
134 2013     2  828706 3080439
Bytes added. The monthly sum total bytes added (by comparing rev_len) is plotted by agent type. (anon = IP editor, user = registered account)
data table
    year month      anon       user
1   2002     1  0.012385   0.093651
2   2002     2  0.020683   0.049978
3   2002     3  0.005050   0.092637
4   2002     4  0.046084   0.057417
5   2002     5  0.018208   0.039760
6   2002     6  0.026074   0.071759
7   2002     7  0.007572   0.218202
8   2002     8  0.048021   0.411880
9   2002     9  0.027414   0.486984
10  2002    10  0.147202   0.860387
11  2002    11  0.077913   0.845604
12  2002    12  0.394396   1.320455
13  2003     1  0.369062   1.358169
14  2003     2  0.054143   1.400373
15  2003     3  0.213904   0.985592
16  2003     4  0.132437   1.396273
17  2003     5  0.119801   2.886725
18  2003     6  0.106568   1.834031
19  2003     7  0.189788   1.698788
20  2003     8  0.187244   2.204399
21  2003     9  0.357202   2.643709
22  2003    10  0.156790   3.236849
23  2003    11  0.223395   3.860001
24  2003    12  0.248561   4.150468
25  2004     1  0.160151   4.934074
26  2004     2  0.490282   5.966804
27  2004     3  1.397600  12.991416
28  2004     4  0.758825  11.043725
29  2004     5  1.159358   9.826299
30  2004     6  0.542536   6.455750
31  2004     7  0.596292   9.717470
32  2004     8  0.849873  10.910973
33  2004     9  1.509876  20.425140
34  2004    10  2.698999  26.181648
35  2004    11  4.746463  41.595587
36  2004    12  4.476163  43.062163
37  2005     1  4.169102  42.687316
38  2005     2  3.361946  38.749382
39  2005     3  7.567360  51.428240
40  2005     4  9.512605  87.204960
41  2005     5 13.318327  90.570889
42  2005     6  6.191758  82.137981
43  2005     7  8.831113 100.230005
44  2005     8  5.864484 117.568409
45  2005     9  8.806108 156.629851
46  2005    10 10.286618 238.720597
47  2005    11 20.317337 282.320409
48  2005    12 34.105375 355.498817
49  2006     1 16.488371 331.422568
50  2006     2 19.090705 373.008854
51  2006     3 17.316852 459.107116
52  2006     4 16.254520 395.704831
53  2006     5 18.205109 511.785653
54  2006     6 13.053420 374.937615
55  2006     7 11.892284 303.389970
56  2006     8 14.059616 330.142228
57  2006     9 19.489344 506.451462
58  2006    10 23.076140 651.938649
59  2006    11 28.498922 703.127383
60  2006    12 15.829207 522.990732
61  2007     1 23.852155 635.076213
62  2007     2 28.587102 779.584900
63  2007     3 29.347109 880.605501
64  2007     4 27.202114 817.465592
65  2007     5 27.096942 848.425136
66  2007     6 15.840607 494.051245
67  2007     7 19.496036 394.084894
68  2007     8 14.889447 414.784902
69  2007     9 17.370748 671.761197
70  2007    10 24.814531 748.339506
71  2007    11 22.226349 626.129294
72  2007    12 14.996261 504.416459
73  2008     1 16.084718 696.383322
74  2008     2 13.980315 771.939198
75  2008     3 22.934456 763.856775
76  2008     4 23.672930 802.030691
77  2008     5 18.176154 751.843843
78  2008     6 20.582286 485.225110
79  2008     7 19.503239 380.738188
80  2008     8 23.071680 470.297873
81  2008     9 26.271215 766.216586
82  2008    10 24.024057 833.390652
83  2008    11 27.029343 647.087602
84  2008    12 16.861358 532.434540
85  2009     1 18.261037 663.527909
86  2009     2 23.135783 719.732351
87  2009     3 17.717117 577.416147
88  2009     4 10.459358 300.011900
89  2009     5 10.910196 302.408575
90  2009     6 11.401898 217.979580
91  2009     7 10.812957 178.307883
92  2009     8 10.254370 184.748383
93  2009     9 12.036462 270.813471
94  2009    10 11.018157 331.437680
95  2009    11 11.346309 293.060275
96  2009    12 10.055338 220.580848
97  2010     1 10.658743 293.207155
98  2010     2 12.836150 304.313778
99  2010     3 12.913615 346.488763
100 2010     4 11.537811 312.793582
101 2010     5  9.877441 277.463836
102 2010     6  9.138270 201.640865
103 2010     7  7.050153 153.493675
104 2010     8  8.868266 151.021926
105 2010     9  7.703931 234.894474
106 2010    10  8.815313 214.744755
107 2010    11  6.868494 186.537888
108 2010    12  7.817498 153.815932
109 2011     1  7.765005 173.002349
110 2011     2  8.079007 182.464324
111 2011     3  7.418945 160.227254
112 2011     4  8.883535 155.924546
113 2011     5  7.433619 155.547375
114 2011     6  7.161748 120.272709
115 2011     7  7.048998 110.943525
116 2011     8  6.874750 119.627500
117 2011     9  6.703717 143.835701
118 2011    10  7.582507 169.258820
119 2011    11  6.635208 164.122285
120 2011    12  6.296329 143.030151
121 2012     1  6.881303 158.431960
122 2012     2  6.768044 168.105131
123 2012     3  7.392149 171.223545
124 2012     4  7.736228 156.202519
125 2012     5  6.301344 173.848224
126 2012     6  6.819558 123.970909
127 2012     7  7.876956 115.279176
128 2012     8  7.924810 117.215783
129 2012     9  6.992721 148.853215
130 2012    10  9.745043 203.088100
131 2012    11  8.321949 165.472022
132 2012    12  6.580694 149.146924
133 2013     1  5.957408 157.779940
134 2013     2  6.538645 153.345886

Conclusions

  • The number of anonymous article contributors per month has fallen since 2007, but seems to have stabilized at a bit over 100,000 unique IPs editing in the main namespace per month. Compare this cohort to the 10-12% of English Wikipedia signups that stop mid-edit to create an account (full data).
  • The number of revisions to articles by anonymous editors has been just below one million per month. This makes up roughly 1/3 of the total revisions to English Wikipedia.
  • Since 2006, the average number of revisions per session has hovered just below two for anons. There has been much more movement in the average for registered accounts, in part due to bot activity, but it has generally been 5-6 edits per session. For anonymous editor acquisition, this suggests that calls to register will not reach most anonymous editors if we wait for a string of more than two edits before showing them.
  • The average time spent per editing session follows a pattern similar to that of average revisions per session. For anonymous editors, the average has been below 15 minutes per edit session. For registered accounts, including bots, the average is double that, at just below 30 minutes.
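
The anonymous share of revisions can be recomputed directly from the Revisions data table. The following is a minimal sketch in Python, using three rows copied from that table; the function name `anon_share` and the choice of months are illustrative assumptions, not part of the original analysis.

```python
# Rows copied from the "Revisions" data table: (year, month, anon, user).
ROWS = [
    (2007, 3, 1658946, 3158764),
    (2012, 1, 960778, 2701747),
    (2013, 1, 915532, 2681223),
]


def anon_share(anon: int, user: int) -> float:
    """Fraction of a month's revisions that were made by IP editors."""
    return anon / (anon + user)


# Map (year, month) -> anonymous share of revisions for that month.
shares = {(y, m): anon_share(a, u) for y, m, a, u in ROWS}

for (y, m), share in shares.items():
    print(f"{y}-{m:02d}: {share:.1%} of revisions were anonymous")
```

For the rows shown, the share is about 34% in March 2007 and about 25% in January 2013, so the figure depends on which months are considered.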

Pre-registration anonymous activity

Using the cu_changes table, we generated a dataset that contains newcomers' pre-registration and first-session activity. The following diagram gives a sense of the kind of activity we're interested in:

Anon editor analysis method.svg

To generate this dataset, we drew a 10% sample of users who registered in the last 30 days and joined them to the cu_changes table to recover each user's registration IP, using this query:

SQL source code
-- Sampling window: the 30 days ending at the most recent recorded change.
SET @month_end = (SELECT MAX(rc_timestamp) FROM recentchanges);
SET @month_start = DATE_FORMAT(@month_end - INTERVAL 30 DAY, "%Y%m%d%H%i%S");

SELECT
    user_id AS id,
    user_name AS name,
    user_registration AS registration,
    user_editcount AS editcount,
    cuc_ip AS registration_ip
FROM user
-- Match each user to the account-creation entry, which carries the IP (cuc_ip).
INNER JOIN cu_changes ON
    cuc_user = user_id AND
    cuc_type = 3 AND
    cuc_actiontext LIKE "User account % was created"
WHERE user_registration BETWEEN @month_start AND @month_end
AND user_editcount > 0   -- keep only accounts with at least one edit
AND user_id % 10 = 0;    -- 10% sample: every tenth user ID

Using the registration_ip from this dataset, we scan the revision and archive tables for revisions whose user_text field matches the registration_ip. Using the edit session method with a 1-hour cutoff, we then gather the last session before registering (if any) and the first session after registering.
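
The session-gathering step can be sketched in Python as follows. This is an illustration under stated assumptions, not the code used for the analysis: it assumes a new session starts whenever the gap since the previous edit exceeds the cutoff, and the names `split_sessions` and `sessions_around` and the example timestamps are hypothetical. The sketch does not enforce any maximum gap between the last IP session and the registration time.

```python
from datetime import datetime, timedelta

CUTOFF = timedelta(hours=1)  # the 1-hour cutoff mentioned in the text


def split_sessions(timestamps, cutoff=CUTOFF):
    """Group edit timestamps into sessions.

    Assumption: an edit joins the current session if it falls within
    `cutoff` of the previous edit; otherwise it starts a new session.
    """
    sessions = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][-1] <= cutoff:
            sessions[-1].append(ts)
        else:
            sessions.append([ts])
    return sessions


def sessions_around(ip_edits, account_edits, registration):
    """Return (last IP session before registration, first account session after).

    Either element is None when no such session exists.
    """
    before = [s for s in split_sessions(ip_edits) if s[-1] <= registration]
    after = [s for s in split_sessions(account_edits) if s[0] >= registration]
    return (before[-1] if before else None, after[0] if after else None)


# Made-up timestamps for one hypothetical newcomer.
registration = datetime(2013, 2, 1, 12, 0)
ip_edits = [
    datetime(2013, 2, 1, 9, 0),    # earlier, separate IP session
    datetime(2013, 2, 1, 11, 40),  # last IP session starts here
    datetime(2013, 2, 1, 11, 55),
]
account_edits = [
    datetime(2013, 2, 1, 12, 5),   # first registered session
    datetime(2013, 2, 1, 12, 30),
    datetime(2013, 2, 1, 15, 0),   # later, separate session
]

pre, post = sessions_around(ip_edits, account_edits, registration)
print(len(pre), "revisions in the pre-registration session")
print(len(post), "revisions in the first post-registration session")
```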

Results

In my sample, 6.9% of newcomers who ended up making at least one edit in their first week edited as an IP before registering their account. I'm generating some histograms now to get a sense of how much they edit before and after registering.

Histogram of session revisions pre-registration
Histogram of session revisions pre-registration (with zeros filtered)
Histogram of session revisions post-registration

So how do users who edit before registering behave compared to users who don't? The following plots include data from both the pre-registration session and the post-registration session.

Revisions.geo mean.by pre-registration.svg
Main revisions.geo mean.by pre-registration.svg
Productive prop.geo mean.by pre-registration.svg
Mean revert prop.geo mean.by pre-registration.svg
Data Table
   pre_session revisions.geo_mean revisions.geo_se main_revisions.geo_mean
1:       FALSE           1.723199       0.01047070                1.115439
2:        TRUE           4.281658       0.03594024                3.452403
   main_revisions.geo_se revert_prop.mean revert_prop.sd productive.k    n
1:           0.009144279        0.2321521      0.4122159         2342 4564
2:           0.037101374        0.3916604      0.4287481          232  340
   productive.prop productive.se revert_prop.se
1:       0.5131464   0.007398557    0.006101715
2:       0.6823529   0.025248611    0.023252133
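
The productive.prop and productive.se columns can be reproduced from productive.k and n. The sketch below assumes the usual binomial standard error, sqrt(p(1 - p)/n); this formula is inferred from the reported values rather than stated by the source, and the helper name `prop_and_se` is illustrative.

```python
import math

# (productive.k, n) copied from the data table, keyed by pre_session.
GROUPS = {False: (2342, 4564), True: (232, 340)}


def prop_and_se(k: int, n: int):
    """Return the proportion k/n and its binomial standard error."""
    p = k / n
    return p, math.sqrt(p * (1 - p) / n)


# Map pre_session -> (productive.prop, productive.se).
results = {pre: prop_and_se(k, n) for pre, (k, n) in GROUPS.items()}

for pre, (p, se) in results.items():
    print(f"pre_session={pre}: productive.prop={p:.7f}, productive.se={se:.9f}")
```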

What if we limit the analysis to just the post-registration session?

Revisions.geo mean.by pre-registration.post registration.svg
Main revisions.geo mean.by pre-registration.post registration.svg
Productive prop.by pre-registration.post registration.svg
Mean revert prop.by pre-registration.post registration.svg
Data Table
   pre_session revisions.geo_mean revisions.geo_se main_revisions.geo_mean
1:        TRUE           2.178460       0.04449625                1.728482
2:       FALSE           1.723199       0.01047070                1.115439
   main_revisions.geo_se revert_prop.mean revert_prop.sd productive.k    n
1:           0.038133895        0.3089849      0.4439494          195  340
2:           0.009144279        0.2321521      0.4122159         2342 4564
   productive.prop productive.se revert_prop.se
1:       0.5735294   0.026821492    0.024076534
2:       0.5131464   0.007398557    0.006101715

Conclusions

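  • In our sample, 6.9% of newly registered users who went on to edit (340 of 4,904) had edited anonymously within an hour before signing up.
  • Users who edited anonymously right before signup are significantly more productive newcomers than those who did not: in the post-registration session, 57% of them were productive, versus 51% of other newcomers.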
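
One way to check the significance claim is a two-proportion z-test on the post-registration productive counts (195 of 340 versus 2342 of 4564). The source does not say which test was used, so the pooled z-test below is an assumption, and `two_proportion_z` is an illustrative helper.

```python
import math


def two_proportion_z(k1: int, n1: int, k2: int, n2: int):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p1, p2 = k1 / n1, k2 / n2
    pooled = (k1 + k2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided tail probability of a standard normal beyond |z|.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# productive.k and n from the post-registration data table:
# pre_session TRUE (195 of 340) vs. pre_session FALSE (2342 of 4564).
z, p_value = two_proportion_z(195, 340, 2342, 4564)
print(f"z = {z:.2f}, two-sided p = {p_value:.3f}")
```

Under these assumptions z comes out a little above 2 and the two-sided p-value falls below 0.05, which is consistent with the conclusion that the difference is significant at the conventional level.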