Research:Modeling monthly active editors

From Meta, a Wikimedia project coordination wiki
This page documents a completed research project.
Monthly active editor model (concept). A conceptual tree view of the monthly active editors model.

Terms

Editor Classes

MAE proportions (enwiki). Proportions of the active editor classes within Monthly Active Editors are plotted with an equation showing how they add up. The values are averages from May 2013 to May 2014 on English Wikipedia.
Monthly Active Editors (MAE)
Editors who save at least 5 revisions within one month.
New Active Editors (NAE)
Newly registered users who save at least 5 revisions in the month that they registered.
Surviving New Active Editors (SNAE)
New Active Editors from the previous month who continued to make at least 5 edits in the current month.
Recurring Old Active Editors (ROAE)
Non-new Active Editors from the previous month who continued to make at least 5 edits in the current month.
Reactivated Editors (RAE)
All other active editors who (1) were not active in the previous month and (2) were not a Newly registered user in the current month.
Basic equation

MAE = NAE + SNAE + ROAE + RAE
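The four classes defined above are mutually exclusive, so their counts sum to MAE. The following is a minimal sketch of that classification, not the project's actual code. The function name, the per-user edit-count dictionaries and the registration sets are all assumptions made for illustration.

```python
from collections import Counter

THRESHOLD = 5  # minimum revisions saved in a month to count as active


def classify_active_editors(edits_now, edits_prev, registered_now, registered_prev):
    """Count editors per class for the current month.

    edits_now / edits_prev: dict mapping user -> revisions saved in that month
    (hypothetical input format).
    registered_now / registered_prev: sets of users who registered in that month.
    """
    active_prev = {u for u, n in edits_prev.items() if n >= THRESHOLD}
    counts = Counter()
    for user, n in edits_now.items():
        if n < THRESHOLD:
            continue  # not a monthly active editor
        if user in registered_now:
            counts["NAE"] += 1   # new active editor
        elif user in active_prev and user in registered_prev:
            counts["SNAE"] += 1  # surviving new active editor
        elif user in active_prev:
            counts["ROAE"] += 1  # recurring old active editor
        else:
            counts["RAE"] += 1   # reactivated editor
        counts["MAE"] += 1
    return counts


example = classify_active_editors(
    edits_now={"a": 7, "b": 5, "c": 12, "d": 6, "e": 2},
    edits_prev={"b": 9, "c": 30, "d": 1},
    registered_now={"a"},
    registered_prev={"b"},
)
# The basic equation holds by construction.
assert example["MAE"] == sum(example[k] for k in ("NAE", "SNAE", "ROAE", "RAE"))
```

In this toy input each of the four classes receives exactly one editor, and user "e" falls below the threshold and is not counted.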

Rates

New Editor Activation Rate (NEAR)
The proportion of newly registered users who save at least 5 revisions in the current month.
New Active Survival Rate (NASR)
The proportion of New Active Editors from the previous month who save at least 5 revisions in the current month.
Old Active Survival Rate (OASR)
The proportion of Active Editors from the previous month (who were not newly registered users in that month) who save at least 5 revisions in the current month.
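Each rate is a class count divided by the size of the group it is drawn from. The sketch below is illustrative only: the function and parameter names are hypothetical, and the OASR denominator (previous month's MAE minus previous month's NAE) is one reading of the definition above.

```python
def rate(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def monthly_rates(newly_registered, nae, snae, roae, mae_prev, nae_prev):
    """Compute NEAR, NASR and OASR for the current month.

    newly_registered: users who registered in the current month.
    nae, snae, roae: class counts for the current month.
    mae_prev, nae_prev: MAE and NAE counts for the previous month.
    """
    return {
        "NEAR": rate(nae, newly_registered),
        "NASR": rate(snae, nae_prev),
        # Assumed denominator: previous-month active editors who were not new.
        "OASR": rate(roae, mae_prev - nae_prev),
    }


rates = monthly_rates(
    newly_registered=1000, nae=50, snae=10, roae=600, mae_prev=840, nae_prev=40
)
```

With these made-up counts, NEAR is 50/1000, NASR is 10/40 and OASR is 600/800.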
Expanded equation with rates
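The original equation image is not available here. The form below is a reconstruction derived from the rate definitions, not a copy of the original. The month subscript t and the symbol NRU_t (newly registered users in month t) are assumptions introduced for this sketch.

```latex
\begin{aligned}
NAE_t  &= NEAR_t \cdot NRU_t \\
SNAE_t &= NASR_t \cdot NAE_{t-1} \\
ROAE_t &= OASR_t \cdot \left( MAE_{t-1} - NAE_{t-1} \right) \\
MAE_t  &= NEAR_t \cdot NRU_t + NASR_t \cdot NAE_{t-1}
          + OASR_t \cdot \left( MAE_{t-1} - NAE_{t-1} \right) + RAE_t
\end{aligned}
```

RAE_t has no associated rate in the definitions above, so it stays in the sum as a raw count.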


Analysis

English Wikipedia

MAE over time. The count of monthly active editors is plotted (stacked) for the four active editor classes for the English Wikipedia. A loess curve is fit to the trend.
MAE rates over time. Activation and retention rates are plotted for editor class thresholds for the English Wikipedia.

Italian Wikipedia

MAE over time. The count of monthly active editors is plotted (stacked) for the four active editor classes for the Italian Wikipedia. A loess curve is fit to the trend.
MAE rates over time. Activation and retention rates are plotted for editor class thresholds for the Italian Wikipedia.

Comparison with legacy definition

To explore the implications of the relatively simple definition of active editor used in this model (5 revisions to any page), we compare its counts of active editors with those generated by the historical definition (5 revisions to countable pages).

  • dump ns=0 -- Legacy definition based on XML dump processing within wikistats. Does not include edits to deleted pages and counts only edits to countable pages.
  • archive ns=0 -- All edits to pages in the article namespace (ns=0) count.
  • archive ns=all -- All edits to any page count. This is the definition used in the analysis above.
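The three definitions differ only in which revisions are allowed to count toward the 5-revision threshold. The sketch below illustrates that idea under stated assumptions: the revision record fields (`user`, `ns`, `deleted`, `countable`) and the function names are hypothetical, and the "dump ns=0" filter is an approximation of the description above rather than the wikistats logic.

```python
from collections import Counter

THRESHOLD = 5  # minimum counted revisions in a month to be active

# One filter per definition; each decides whether a revision counts.
DEFINITIONS = {
    # Approximation: article namespace, page not deleted, page is countable.
    "dump ns=0": lambda r: r["ns"] == 0 and not r["deleted"] and r["countable"],
    "archive ns=0": lambda r: r["ns"] == 0,
    "archive ns=all": lambda r: True,
}


def count_active(revisions, keep):
    """Number of users with at least THRESHOLD revisions passing `keep`."""
    per_user = Counter(r["user"] for r in revisions if keep(r))
    return sum(1 for n in per_user.values() if n >= THRESHOLD)


def compare_definitions(revisions):
    """Return the active editor count for one month under each definition."""
    return {name: count_active(revisions, keep) for name, keep in DEFINITIONS.items()}


def rev(user, ns=0, deleted=False, countable=True):
    """Build one hypothetical revision record."""
    return {"user": user, "ns": ns, "deleted": deleted, "countable": countable}


sample = (
    [rev("a") for _ in range(5)]                  # counts under every definition
    + [rev("b", deleted=True) for _ in range(5)]  # page later deleted
    + [rev("c") for _ in range(3)]
    + [rev("c", ns=1, countable=False) for _ in range(2)]  # non-article edits
)
counts_by_definition = compare_definitions(sample)
```

In this toy month, editor "b" is lost under the dump-based definition because the edited page was deleted, and editor "c" is active only when edits to all namespaces count.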
Raw counts
MAE comparison (enwiki). 
MAE comparison (itwiki). 
Factor comparison
MAE comparison (dump ns=0). 
MAE comparison (archive ns=0). 
MAE comparison (archive ns=all). 

See also