Research:Measuring overall contribution of editors

From Meta, a Wikimedia project coordination wiki
Fabian Kaelin
Diederik van Liere
Duration:  2011-06 – 2011-06
Open access project (no URL provided)
Open data project (no URL provided)
This page documents a completed research project.


The goal of this sprint is to define a new metric for measuring which editors add the most content to Wikipedia. So far, the main ways of determining the contribution of an individual editor are edit count, number of articles created, and whether articles have passed through assessment processes such as Good Article (GA) and Featured Article (FA). In this sprint we aim to measure overall contribution by the amount of text (in kilobytes) that editors add to pages, in order to better identify and recognize those Wikipedians who are active authors of the encyclopedia.

The main challenge will likely be filtering out the noise in the revision data (template additions, bots, page moves, script-assisted editing). It would be great if we could successfully separate this noise, as the measure could then be used as an alternative way to objectively determine the contributions of editors.
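
As a rough illustration of the idea, the sketch below sums the bytes each editor adds while skipping some of the noise named above. It is not the method used in this sprint. The `Revision` record, its fields, the `is_bot` flag, and the "moved page" comment check are all hypothetical stand-ins for whatever the real revision data provides. Page size deltas are only an approximation of "text added", since an edit that replaces text can add content without growing the page.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Revision:
    """Hypothetical revision record; field names are assumptions."""
    page: str
    editor: str
    size: int              # page size in bytes after this revision
    is_bot: bool = False   # assumed to come from a bot flag or bot list
    comment: str = ""      # edit summary


def bytes_added_per_editor(revisions):
    """Sum positive size deltas per editor, skipping obvious noise.

    Assumes revisions are sorted oldest-first within each page.
    """
    last_size = defaultdict(int)   # page -> size after previous revision
    totals = defaultdict(int)      # editor -> bytes added
    for rev in revisions:
        delta = rev.size - last_size[rev.page]
        last_size[rev.page] = rev.size
        # Crude noise filters (assumptions): bots and page moves.
        if rev.is_bot or "moved page" in rev.comment.lower():
            continue
        # Only count growth; removals are not treated as contribution.
        if delta > 0:
            totals[rev.editor] += delta
    return dict(totals)


if __name__ == "__main__":
    history = [
        Revision("A", "alice", 1000),
        Revision("A", "TemplateBot", 1500, is_bot=True),
        Revision("A", "bob", 1400),
        Revision("A", "alice", 2400),
        Revision("B", "bob", 300),
    ]
    for editor, added in bytes_added_per_editor(history).items():
        print(f"{editor}: {added / 1024:.2f} KB")
```

Template additions and script-assisted edits are harder to detect from size alone and would need additional signals, for example edit summaries or a diff of the wikitext.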


First, we will create a list of top contributors on Wikipedia by year and month. Depending on how cleanly we can separate the noise, we can then investigate how the distribution of contributions has changed over time, for example:

  • What does the life cycle of an editor look like in terms of kilobytes contributed? Do editors contribute more at the beginning or towards the end of their activity?
  • Has the group of editors that have contributed most of the content become smaller over the years?
  • Have the dynamics of the top contributors changed over time?
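
The ranking step and the question about the size of the core group could be sketched as follows. This is a minimal sketch under assumptions: the input is taken to be already-filtered `(month, editor, bytes_added)` tuples, and both function names are hypothetical, not part of any existing tool.

```python
from collections import Counter, defaultdict


def top_contributors_by_month(contributions, n=3):
    """Return the n editors with the most bytes added for each month.

    contributions: iterable of (month, editor, bytes_added) tuples,
    where month is a 'YYYY-MM' string (an assumed input format).
    """
    monthly = defaultdict(Counter)
    for month, editor, added in contributions:
        monthly[month][editor] += added
    return {m: c.most_common(n) for m, c in sorted(monthly.items())}


def editors_needed_for_share(totals, share=0.5):
    """Smallest number of editors whose combined bytes reach `share`
    of all bytes added. totals maps editor -> bytes added."""
    total = sum(totals.values())
    running = 0
    ranked = sorted(totals.values(), reverse=True)
    for count, added in enumerate(ranked, start=1):
        running += added
        if running >= share * total:
            return count
    return 0  # empty input


if __name__ == "__main__":
    sample = [
        ("2011-05", "alice", 5000),
        ("2011-05", "bob", 1000),
        ("2011-05", "alice", 2000),
        ("2011-06", "carol", 4000),
        ("2011-06", "bob", 4500),
    ]
    print(top_contributors_by_month(sample, n=1))
    print(editors_needed_for_share({"alice": 60, "bob": 30, "carol": 10}))
```

Computing `editors_needed_for_share` per year and comparing the results is one simple way to see whether the group producing most of the content has shrunk.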

Please add any interesting suggestions you might have.

Results and discussion

Future work