Research:Lag between registration and first edit

From Meta, a Wikimedia project coordination wiki
This page in a nutshell: Among new users who make their first edit within a year of registering, an overwhelming majority (about 81%) make it within a day, and about 76% make it within one hour.
Created: 2011/07
Duration: 2011-07 to 2011-07
Contact: not applicable
Open access: no URL provided
Open data: no URL provided

This page documents a research project in progress.
Information may be incomplete and change as the project progresses.
Please contact the project lead before formally citing or reusing results from this page.

This sprint investigates the research question: How long does it take for new users to make an edit once they register an account?

Process

For each registered user, we extracted the timestamp of the user's first edit ever, counting both live and deleted edits, and compared it to the user's registration date. Note: because of legacy installations of MediaWiki, user registration data may be inaccurate prior to 2005. At that time, the software would sometimes record the date of a user's first edit as their registration date. However, these accounts make up a small percentage of all users, given the massive growth in registrations and editors in 2006-2007.
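The lag computation described above can be sketched as follows. This is a minimal illustration, not the script used for this sprint: the function name `lag_buckets`, the example timestamps, and the choice to bin by whole elapsed days, hours and minutes (rounding down) are all assumptions.

```python
from datetime import datetime

def lag_buckets(registered, first_edit):
    """Lag between registration and first edit, in whole elapsed units (rounded down)."""
    seconds = int((first_edit - registered).total_seconds())
    if seconds < 0:
        # Should not happen; legacy pre-2005 records tend to show a lag of zero instead.
        raise ValueError("first edit precedes registration")
    return {
        "days": seconds // 86400,
        "hours": seconds // 3600,
        "minutes": seconds // 60,
    }

# Hypothetical user: registers at 12:00:00 and first edits 7.5 minutes later.
reg = datetime(2011, 7, 1, 12, 0, 0)
edit = datetime(2011, 7, 1, 12, 7, 30)
lag = lag_buckets(reg, edit)
print(lag)
```

Under this binning, a user whose first edit comes less than 24 hours after registration falls in the "0 days" row of the tables in the Data section.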

The lag data were first transformed to a logarithmic scale (base 10). If the lags are log-normally distributed, their logarithms follow a normal distribution. The transformed data for all users were then fitted to a Gaussian mixture model, a clustering technique that separates the lag observations into several classes (or components). We tried fitting mixtures of N = 2, 3, and 4 components. The parameters of the model were estimated with the Expectation Maximization (EM) algorithm.
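The fitting procedure can be illustrated with a small, self-contained EM loop for a one-dimensional Gaussian mixture. This is a sketch under stated assumptions, not the code used in this sprint: the function names, the quantile-based initialisation and the iteration count are illustrative choices, and the data are synthetic lags drawn from two log-normal components with made-up parameters.

```python
import math
import random

def normal_pdf(x, mean, std):
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def fit_gmm_1d(xs, n_components=2, n_iter=200, min_std=1e-3):
    """Fit a 1-D Gaussian mixture by EM; returns (weights, means, stds)."""
    xs = sorted(xs)
    n = len(xs)
    # Start means at evenly spaced quantiles, with equal weights and a shared spread.
    means = [xs[int((k + 0.5) * n / n_components)] for k in range(n_components)]
    centre = sum(xs) / n
    spread = max(math.sqrt(sum((x - centre) ** 2 for x in xs) / n), min_std)
    stds = [spread] * n_components
    weights = [1.0 / n_components] * n_components
    for _ in range(n_iter):
        # E-step: responsibility of each component for each observation.
        resp = []
        for x in xs:
            dens = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, stds)]
            total = sum(dens) or 1e-300  # guard against underflow to zero
            resp.append([d / total for d in dens])
        # M-step: re-estimate the weight, mean and std of each component.
        for k in range(n_components):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue  # leave an empty component unchanged
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, xs)) / nk
            stds[k] = max(math.sqrt(var), min_std)
    return weights, means, stds

# Synthetic lags in days (NOT the study's data): a "fast" and a "slow" group.
random.seed(0)
lag_days = [10 ** random.gauss(-2.4, 0.5) for _ in range(700)]
lag_days += [10 ** random.gauss(1.3, 1.2) for _ in range(300)]

# Transform to log10 before fitting. Lags of exactly zero would need special
# handling, which this page does not describe.
log_lags = [math.log10(d) for d in lag_days]
weights, means, stds = fit_gmm_1d(log_lags, n_components=2)
print(weights, means, stds)
```

Fitting N = 3 or 4 components only requires changing `n_components`. Choosing between the candidate fits needs a separate criterion, which is not covered here.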

Results

What percentage of registered users edit?

[Figure: Enwiki-registration-edit-counts-daily.png]

Pie Charts

[Figures: Reg edit diff days.png, Reg edit diff hours.png, Reg edit diff mins.png]

Histogram with model fit

Gaussian Mixture Model fit

Mean (days) Median (days) Std. dev. (days) Mixing probability
741.5 18.36 2.993e+04 0.2926
0.008591 0.004197 0.01534 0.7074
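The columns of this table can be cross-checked against each other. The sketch below assumes that each row reports the moments, in days, of a log-normal component (that is, a Gaussian on the log10 scale, as described under Process). The page does not state how the columns were computed, so this is an assumption, and the function names are illustrative. For a log-normal variable the median is 10^mu and the ratio of mean to median is exp(sigma_ln^2 / 2), so the mean and median determine the standard deviation.

```python
import math

LN10 = math.log(10.0)

def log10_params(mean, median):
    """Recover (mu, sigma) of log10(lag) from a log-normal's mean and median."""
    mu_log10 = math.log10(median)                        # median = 10 ** mu
    sigma_ln = math.sqrt(2.0 * math.log(mean / median))  # mean / median = exp(sigma_ln ** 2 / 2)
    return mu_log10, sigma_ln / LN10

def lognormal_std(mu_log10, sigma_log10):
    """Standard deviation, in original units, of a variable whose log10 is N(mu, sigma^2)."""
    sigma_ln = sigma_log10 * LN10
    mean = math.exp(mu_log10 * LN10 + sigma_ln ** 2 / 2.0)
    return mean * math.sqrt(math.expm1(sigma_ln ** 2))

# (mean, median, reported std. dev.) in days, copied from the table above.
components = [(741.5, 18.36, 2.993e4), (0.008591, 0.004197, 0.01534)]
implied = [lognormal_std(*log10_params(mean, median)) for mean, median, _ in components]

for (mean, median, reported), value in zip(components, implied):
    print(f"reported std = {reported:.4g} days, implied std = {value:.4g} days")
```

For both rows the implied standard deviation agrees with the reported one to within about 1%, which is consistent with the log-normal reading of the table.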

Data

Days between reg and first edit Number of users Percent of all users
0 3477450 80.867%
1 146917 3.417%
2 48885 1.137%
3 33918 0.789%
4 28088 0.653%
5 to 10 111996 2.604%
11 to 20 94112 2.189%
21 to 31 59312 1.379%
31 to 60 73512 1.710%
61 to 180 130443 3.033%
180 to 365 95563 2.222%
Total < 1 year 4300196 100.000%

Hours between reg and first edit Number of users Percent of all users Percent of < 1 day users
0 3257914 75.762% 93.687%
1 111753 2.599% 3.214%
2 35798 0.832% 1.029%
3 18451 0.429% 0.531%
4 11214 0.261% 0.322%
5 7382 0.172% 0.212%
6 4881 0.114% 0.140%
7 3518 0.082% 0.101%
8 2631 0.061% 0.076%
9 2451 0.057% 0.070%
10 2278 0.053% 0.066%
11 2255 0.052% 0.065%
12 2200 0.051% 0.063%
13 2068 0.048% 0.059%
14 1972 0.046% 0.057%
15 1864 0.043% 0.054%
16 1777 0.041% 0.051%
17 1561 0.036% 0.045%
18 1503 0.035% 0.043%
19 1307 0.030% 0.038%
20 1061 0.025% 0.031%
21 854 0.020% 0.025%
22 562 0.013% 0.016%
23 195 0.005% 0.006%
Total < 1 day 3477450 80.867% 100.000%

Minutes between reg and first edit Number of users Percent of all users Percent of < 1 hour users
0 293625 6.828% 9.013%
1 387565 9.013% 11.896%
2 360452 8.382% 11.064%
3 290431 6.754% 8.915%
4 232709 5.412% 7.143%
5 190312 4.426% 5.842%
6 to 10 588981 13.697% 18.078%
11 to 20 484058 11.257% 14.858%
21 to 30 207025 4.814% 6.355%
31 to 40 111793 2.600% 3.431%
41 to 50 68942 1.603% 2.116%
51 to 60 42021 0.977% 1.290%
Total < 1 hour 3257914 75.762% 100.000%
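The percentage columns can be recomputed from the counts. The short sketch below does this for the days table and for the share of users editing within one hour. The counts are copied from the tables above, while the variable names and the added cumulative column are illustrative.

```python
# Counts copied from the "Days between reg and first edit" table.
day_counts = [
    ("0", 3477450), ("1", 146917), ("2", 48885), ("3", 33918), ("4", 28088),
    ("5 to 10", 111996), ("11 to 20", 94112), ("21 to 31", 59312),
    ("31 to 60", 73512), ("61 to 180", 130443), ("180 to 365", 95563),
]
# "Total < 1 hour" from the minutes table.
within_hour_count = 3257914

total = sum(count for _, count in day_counts)

rows = []
running = 0
for label, count in day_counts:
    running += count
    # (bin label, percent of all users, cumulative percent)
    rows.append((label, 100.0 * count / total, 100.0 * running / total))

within_hour_pct = 100.0 * within_hour_count / total

for label, pct, cum in rows:
    print(f"{label:>10}  {pct:7.3f}%  {cum:8.3f}%")
print(f"within one hour: {within_hour_pct:.3f}%")
```

Note that the denominator is the number of users who edited within a year of registering, so "percent of all users" excludes accounts that took longer or never edited.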

Future work

Break these data down by registration cohort: has the lag between registration and first edit changed over time?