Research:Productive new editor

Standard metric
Productive new editor
Specification
A productive new editor is a new editor who completes at least n productive edit(s) within time t since registration.
WMF Standard
Status
completed

Productive new editor is a standardized user class used to measure the number of first-time editors in a wiki project over time who make productive contributions. It is used as a proxy for editor productivity and, to a lesser extent, editor activation. A "productive new editor" is a new editor who saves revisions to content namespace pages, where those revisions are not subsequently reverted.
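The definition above can be sketched as a small classifier. This is a minimal illustration, not the Wikimedia Foundation's implementation: the `Edit` record, the function name, and the assumption that namespace 0 is the content namespace are all hypothetical, and the values of n and t passed in are left to the caller.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

CONTENT_NAMESPACE = 0  # assumption: namespace 0 is the content namespace


@dataclass
class Edit:
    timestamp: datetime
    namespace: int
    reverted: bool  # True if the revision was later reverted


def is_productive_new_editor(registration, edits, n, t):
    """Return True if at least n unreverted content edits fall within t of registration."""
    cutoff = registration + t
    productive = [
        e for e in edits
        if e.namespace == CONTENT_NAMESPACE
        and not e.reverted
        and registration <= e.timestamp <= cutoff
    ]
    return len(productive) >= n
```

For example, a user with one surviving and one reverted content edit in the window qualifies for n = 1 but not for n = 2.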

Discussion

Excluding edits to deleted content

Spammers and other non-productive new editors tend to create non-productive articles, and those articles tend to be deleted outright rather than having their individual edits reverted, so a revert check alone would not catch them. For this reason, edits to articles that are deleted by the end of a new editor's first week since registration are not included in counts of productive edits.
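A minimal sketch of this exclusion follows. The data shapes are assumptions made for illustration: edits are `(page_id, timestamp)` pairs and `page_deleted_at` is a hypothetical mapping from page id to deletion time, with pages that still exist simply absent.

```python
from datetime import datetime, timedelta


def exclude_deleted(edits, page_deleted_at, registration, window=timedelta(days=7)):
    """Drop edits to pages deleted by the end of the editor's first week."""
    deadline = registration + window
    kept = []
    for page_id, timestamp in edits:
        deleted_at = page_deleted_at.get(page_id)  # None if the page still exists
        if deleted_at is not None and deleted_at <= deadline:
            continue  # page was deleted within the first week: not productive
        kept.append((page_id, timestamp))
    return kept
```

A page deleted after the first week is kept here, because the rule only looks at deletions that happen by the end of that week.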

The n productive edits threshold

Like choosing an n for any metric based on counts (e.g. new editor and active editor), choosing a threshold is somewhat arbitrary. Choosing a higher threshold will result in a smaller proportion of newly registered users being considered productive.
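The effect of the threshold can be illustrated with a short sketch. The per-user counts below are made up for demonstration and the function name is hypothetical; the point is only that the proportion cannot increase as n grows.

```python
def productive_proportion(productive_edit_counts, n):
    """Share of users with at least n productive edits."""
    if not productive_edit_counts:
        return 0.0
    qualifying = sum(1 for count in productive_edit_counts if count >= n)
    return qualifying / len(productive_edit_counts)


# Made-up productive edit counts, one entry per newly registered user.
counts = [0, 0, 1, 1, 3, 7, 0, 2]
for n in (1, 2, 5):
    print(n, productive_proportion(counts, n))
```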

The t time cutoff

There are a few ways to draw the timespan for identifying productive edits. The two most common are time-bounded and event-based. A time-bounded approach uses a cutoff to limit observations to a fixed amount of time after a user registered their account. An event-based approach uses some event as the starting point for counting user contributions. Another candidate timespan covers the edits that a newcomer performed in their first edit session. Since productive new editor qualifies the activity of a new editor we set , which effectively makes the class of productive new editors a proper subset of new editors. We analyze the effect of choosing a different value for t below.
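The three candidate timespans can be compared side by side. This is a sketch under stated assumptions: the event used for the event-based window is the user's first edit, and a session is taken to end at the first gap longer than one hour. Neither choice comes from this page; both are illustrative.

```python
from datetime import datetime, timedelta


def time_bounded(registration, timestamps, t):
    """Edits made within t of registration."""
    return [ts for ts in timestamps if registration <= ts <= registration + t]


def event_based(timestamps, t):
    """Edits made within t of the first edit (the assumed starting event)."""
    if not timestamps:
        return []
    start = min(timestamps)
    return [ts for ts in timestamps if ts <= start + t]


def first_session(timestamps, max_gap=timedelta(hours=1)):
    """Edits up to the first inactivity gap longer than max_gap (assumed cutoff)."""
    ordered = sorted(timestamps)
    session = ordered[:1]
    for ts in ordered[1:]:
        if ts - session[-1] > max_gap:
            break
        session.append(ts)
    return session
```

A user who registers and only starts editing 20 hours later shows how the windows diverge: a one-day time-bounded window sees only the first few hours of their activity, while an event-based window of the same length sees a full day of it.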

Time to revert cutoff

Because a revert can theoretically occur years after the original edit (a censoring problem; see en:Censoring_(statistics)), reverts are only counted if they occurred within 48 hours of the original edit. For more details, see Research:Revert#Time to revert cutoff.
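The cutoff amounts to a simple comparison of timestamps. In this sketch the function name is hypothetical, and `revert_time` is `None` for an edit that was never reverted.

```python
from datetime import datetime, timedelta

REVERT_CUTOFF = timedelta(hours=48)


def counts_as_reverted(edit_time, revert_time):
    """An edit counts as reverted only if the revert came within 48 hours."""
    if revert_time is None:
        return False
    return revert_time - edit_time <= REVERT_CUTOFF
```

An edit reverted a week later is therefore still treated as productive under this metric.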

Limitations

  • This metric represents productivity as a binary attribute of a user; it does not measure how productive a new editor is. New editors who make many productive edits and contribute substantial amounts of content will look identical (under this metric) to new editors who fix a few typos.
  • The cleverest vandalism may go unnoticed for more than 48 hours, in which case the vandal's edits are counted as productive.

Lag time

Generation of this metric must be delayed by t plus 48 hours after users' registration dates: t to allow newly registered users to make edits, and an additional 48 hours to give other editors a chance to revert them. In the case of the WMF Standard parameterization, this works out to .
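Assuming the delay is the observation window t plus the 48-hour revert cutoff described earlier, the earliest time at which a user can be scored is a single addition. The function name and the value of t in the usage note are illustrative only.

```python
from datetime import datetime, timedelta


def earliest_generation_time(registration, t, revert_cutoff=timedelta(hours=48)):
    """Earliest moment at which the metric can be computed for this user."""
    return registration + t + revert_cutoff
```

With an illustrative t of one day, a user registered at midnight on 1 January could be scored from midnight on 4 January.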

Analysis

Besides the variables describing a productive edit, there are two variables used in this metric:

  • The value of n
  • The value of t

Because the raw number of productive new editors depends heavily on the raw number of new editors, this metric is best examined as a proportion. Because identifying reverts is computationally expensive, the following plots were generated from a random sample of newly registered users, stratified by registration month.
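Stratified sampling of this kind can be sketched as follows. The input shape, `(user_id, "YYYY-MM")` pairs, the per-month sample size, and the fixed seed are all assumptions for illustration; the sample sizes actually used for the plots are not stated here.

```python
import random
from collections import defaultdict


def stratified_sample(users, per_month, seed=0):
    """Draw up to per_month users at random from each registration month."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    by_month = defaultdict(list)
    for user_id, month in users:
        by_month[month].append(user_id)
    return {
        month: rng.sample(ids, min(per_month, len(ids)))
        for month, ids in sorted(by_month.items())
    }
```

Sampling the same number of users from every month keeps the expensive revert detection bounded while still giving each month a comparable estimate of the proportion.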

German Wikipedia

Figure: Productive new editors per new editor. The proportion of productive new editors, plotted by registration month for two parameter values.
Figure: Productive new editors per newly registered user. The proportion of productive new editors, plotted by registration month for two parameter values.

English Wikipedia

Figure: Productive new editors per new editor. The proportion of productive new editors, plotted by registration month for two parameter values.
Figure: Productive new editors per newly registered user. The proportion of productive new editors, plotted by registration month for two parameter values.

Spanish Wikipedia

Figure: Productive new editors per new editor. The proportion of productive new editors, plotted by registration month for two parameter values.
Figure: Productive new editors per newly registered user. The proportion of productive new editors, plotted by registration month for two parameter values.


French Wikipedia

Figure: Productive new editors per new editor. The proportion of productive new editors, plotted by registration month for two parameter values.
Figure: Productive new editors per newly registered user. The proportion of productive new editors, plotted by registration month for two parameter values.

Polish Wikipedia

Figure: Productive new editors per new editor. The proportion of productive new editors, plotted by registration month for two parameter values.
Figure: Productive new editors per newly registered user. The proportion of productive new editors, plotted by registration month for two parameter values.

Portuguese Wikipedia

Figure: Productive new editors per new editor. The proportion of productive new editors, plotted by registration month for two parameter values.
Figure: Productive new editors per newly registered user. The proportion of productive new editors, plotted by registration month for two parameter values.

Factor comparison of n and t

Figure: Factor of n. The factor of difference between proportions of productive new editors for different values of n is plotted.
Figure: Factor of t. The factor of difference between proportions of productive new editors for different values of t is plotted.
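Reading "factor of difference" as the ratio between two proportions, which is an interpretation rather than something this page states, the quantity plotted could be computed as below. The function name and the example proportions are hypothetical.

```python
def factor_of_difference(proportion_a, proportion_b):
    """Ratio of two proportions, e.g. for two different values of n or t."""
    if proportion_b == 0:
        raise ValueError("cannot compute a factor against a zero proportion")
    return proportion_a / proportion_b
```

For instance, if half of a cohort qualifies under one parameter value and a quarter under another, the factor is 2.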

Usage

User interface experiments at the Wikimedia Foundation apply the productive new editor metric across an entire cohort to generate the productive newcomer proportion. For example, in an A/B test, we might compare the proportion of productive new editors in a control group to a test group. This allows us to have a very basic understanding of whether the A/B test led to more new editors making productive contributions.
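One way to compare the two proportions is a two-proportion z-test. This page does not say which statistical test the experiments used, so the test chosen here is an assumption, and the function name and the counts in the usage note are made up for illustration.

```python
import math


def two_proportion_z(success_a, total_a, success_b, total_b):
    """Compare productive-editor proportions in control (a) and test (b) groups."""
    p_a = success_a / total_a
    p_b = success_b / total_b
    pooled = (success_a + success_b) / (total_a + total_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    if se == 0:
        raise ValueError("proportions are degenerate (all or none productive)")
    z = (p_b - p_a) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
    return p_a, p_b, z, p_value
```

Calling `two_proportion_z(100, 1000, 150, 1000)` compares a control group with 100 productive new editors out of 1000 against a test group with 150 out of 1000.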

See the following research reports for examples of this type of usage:
