
Research:Wikimedia Summer of Research 2011/Deletion notifications to new users

From Meta, a Wikimedia project coordination wiki
This page documents a completed research project.

This sprint is a continuation of the larger question asked in last week's sprint on alternative lifecycles of new users: Broadly, how are new users introduced into the Wikipedian community, and has this changed over time? Based on the results of that study, we found that new users who received talk page messages were receiving substantially more deletion notifications beginning in 2006 (~2% in 2004/5 vs. 30% in 2008). Yet even as the number of deletion notifications skyrocketed, the level of participation in deletion processes -- that is, responding to the claim that one's article or image does not belong on Wikipedia -- showed a steady decline. However, last week's project analyzed only a small subset of years and focused on broadly characterizing new user participation in all community spaces. This study explores issues specific to users who receive deletion notifications, asking how deletion notifications affect subsequent participation in community spaces.


Broadly, what is the effect of notifying a user that an article or file they created has been or may soon be deleted, and has this changed over time?

More specifically, among a random sample of users who were left at least one talk page message within 30 days of making their first edit:

  • How many users receive deletion notifications on their talk page?
    • Are these notifications AfDs, speedy deletions, proposed deletions, or image/file deletions?
    • Are these notifications templated, personalized, or both?
    • Are these notifications the first message the user has received?
  • After receiving a deletion notification, does the user subsequently participate in any kind of community or talk space?
    • Is this participation in established deletion processes, community spaces such as the village pump, Q/A spaces such as the help desk, or article or user talk pages?
  • Does the first message received (templated vs. personalized, deletion notification vs. welcome) correlate with whether a user is still editing one, three, six, and twelve months later?


The unit of analysis for this study is the individual user, 30 days after they make their first edit. A random sample was generated containing 200 new editors to en.wiki for each 6-month period between January 2004 and June 2011 (15 periods), for a total of 3,000 editors. Only registered editors who had at least one message left for them on their talk page were included in this sample; anonymous/IP accounts were excluded. Based on fields in the MediaWiki database, users were automatically coded to determine whether, after 30 days, they were blocked, had created a user page, and had edited in each of the namespaces.
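The sampling design described above can be sketched as follows. This is only an illustration, not the researchers' actual tooling: the function names, the `users_by_bin` mapping, and the fixed seed are assumptions, and the step that restricts candidates to registered editors with at least one talk page message is assumed to have happened already.

```python
import random

SAMPLE_PER_BIN = 200  # per-period sample size stated in the study design


def semiannual_bins(start_year=2004, end_year=2011, end_half=1):
    """Return bin labels such as '2004-1', '2004-2', ..., '2011-1'."""
    bins = []
    for year in range(start_year, end_year + 1):
        for half in (1, 2):
            if year == end_year and half > end_half:
                break
            bins.append(f"{year}-{half}")
    return bins


def stratified_sample(users_by_bin, per_bin=SAMPLE_PER_BIN, seed=0):
    """Draw up to `per_bin` users at random from each bin.

    `users_by_bin` is a hypothetical mapping from a bin label to a list of
    eligible user ids.
    """
    rng = random.Random(seed)
    sample = {}
    for label, users in users_by_bin.items():
        k = min(per_bin, len(users))
        sample[label] = rng.sample(users, k)
    return sample
```

With 15 semi-annual bins and 200 users per bin, the sketch reproduces the stated total of 3,000 editors.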

Researchers then manually coded each of the new users based on the schema below, relying on the list of messages left for them as well as their contribution histories. Researchers coded only whether the new user had been notified of and/or participated in the various community processes within their first 30 days, and which process the user was first notified of and/or participated in.

Coding schema

Editor characteristics

Note: These are automatically generated from database tables

  • Blocked
  • Has a userpage
  • Edited various namespaces

First message received

  • Is a deletion notification (if so, what type)
  • Is a template
  • Is personalized
  • Is a warning
  • Is a welcome

Participation in community activities and processes

  • Has the user participated in the following (see list of community activities and processes):
  • Was this participation before or after the first message received?
  • If a deletion notification was received, was this participation before or after the notification?
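One way to picture the coding schema above is as a single record per sampled user. The sketch below is an assumption about how such a record could be laid out; the class and field names are hypothetical and do not come from the study's actual data files. The deletion types are the four named in the research questions.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional, Set


class DeletionType(Enum):
    AFD = "AfD"
    SPEEDY = "speedy deletion"
    PROD = "proposed deletion"
    FILE = "image/file deletion"


class Timing(Enum):
    BEFORE = "before"
    AFTER = "after"


@dataclass
class CodedUser:
    # Editor characteristics (generated automatically from database tables)
    blocked: bool
    has_userpage: bool
    namespaces_edited: Set[int] = field(default_factory=set)

    # First message received (coded manually)
    first_message_deletion_type: Optional[DeletionType] = None  # None: not a deletion notice
    first_message_is_template: bool = False
    first_message_is_personalized: bool = False
    first_message_is_warning: bool = False
    first_message_is_welcome: bool = False

    # Participation in community activities and processes (coded manually)
    participated: bool = False
    participation_vs_first_message: Optional[Timing] = None
    participation_vs_deletion_notice: Optional[Timing] = None
```

Keeping "template" and "personalized" as separate flags allows a message to be coded as both, which the research questions explicitly permit.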

List of community activities and processes

Unlike the previous study on alternative lifecycles of new users, these variables will be coded only at their higher-level category; except for deletions, the more specific venue of participation will not be recorded.

Results and discussion


  • Years are presented in semi-annual bins -- e.g., 2004-2 is July to December 2004, 2009-1 is January to June 2009, etc.
  • Semi-annual bins reflect the date a user made their first edit
  • Whether or not a user was retained is defined by whether they made an edit between two and six months after their first edit
  • Bots and banned users are excluded from this sample
  • The sample consists of users who made at least one edit and had at least one edit made to their talk page within 30 days of their first edit. This includes users who made only one edit and did so to their own user talk page. Those users -- approximately 10% of new users in our sample -- are not excluded from this data, but they are noted in the table below.
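The binning and retention definitions above can be expressed as two small functions. This is a minimal sketch under stated assumptions: the function names are hypothetical, and because the source does not say how "two and six months" was measured, a month is approximated here as 30 days.

```python
from datetime import date, timedelta

DAYS_PER_MONTH = 30  # assumption: the source does not define month length


def semiannual_bin(first_edit: date) -> str:
    """Label a first-edit date, e.g. July-December 2004 -> '2004-2'."""
    half = 1 if first_edit.month <= 6 else 2
    return f"{first_edit.year}-{half}"


def is_retained(first_edit: date, edit_dates) -> bool:
    """True if any edit falls between two and six months after the first edit."""
    window_start = first_edit + timedelta(days=2 * DAYS_PER_MONTH)
    window_end = first_edit + timedelta(days=6 * DAYS_PER_MONTH)
    return any(window_start <= d <= window_end for d in edit_dates)
```

Under this reading, a user whose only edits fall in the first few weeks counts as not retained, regardless of how many edits they made.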

Core findings

  • Templated messages have long been the most common first message to newcomers, with only a small amount of growth, from 80% in 2004 to 90% in 2010-2.
  • In 2004-2, about 80% of new users were first sent a welcome message; after a trough in 2007-2, with a welcome rate of around 25%, this number stabilized at around 45% in 2009-1 and 2010-2.
  • Warnings have experienced the inverse phenomenon: fewer than 10% of new users were sent a warning as their first message in 2004-2, while this rate rose to between 55% and 65% between 2007 and 2010.
    • This is aided by the increasing use of the 'welcome-warning': templates that combine a standard welcome template with a warning template, or semi-automated programs that leave a welcome and a warning message in rapid succession.
  • About 15% to 20% of all new users who receive talk page messages in their first 30 days receive a deletion notification of some kind.
    • This figure is relatively stable between 2007 and 2010, particularly if image deletion notifications are distinguished from article notifications, due to a substantial amount of bot-based administration of images in the second half of 2009.
    • The overwhelming majority (70% to 85% based on time period) of users who were sent deletion notifications in their first 30 days received such notifications as their first message from another user.
  • Receiving a warning in a user's first 30 days correlates negatively with whether that user is still editing 2-6 months after their first edit.

Messages to new users

User retention

Receiving a warning in a user's first 30 days correlates negatively with whether the user is still editing 2-6 months after their first edit: in every sampled period, warned users were retained at a lower rate than users who were not warned. The share of retained users varies across time periods, but this variation is not statistically significant. By contrast, receiving an article deletion notice in the first 30 days does not predict whether a user is still editing 2-6 months after their first edit. In fact, in all time periods after 2004-2, a larger proportion of retained users than of non-retained users received deletion notifications. This may be explained by the fact that users who create articles at all belong to a different type of newcomer and may be more likely to continue editing than those who do not create articles, regardless of whether the articles they create are nominated for deletion. Further study could compare retention for users who created a new article and received a deletion notice against users who created a new article and did not receive one.

| time period | % users not retained | % users retained |
|---|---|---|
| 2007-2 | 87.83% | 12.17% |
| 2009-1 | 90.43% | 9.57% |
| 2010-2 | 91.88% | 8.13% |
| total | 89.94% | 10.06% |

After deletion notifications

| time period | users retained: notified of deletion | users retained: not notified | users not retained: notified of deletion | users not retained: not notified |
|---|---|---|---|---|
| 2004-2 | 0.00% | 100.00% | 4.00% | 96.00% |
| 2007-2 | 34.78% | 65.22% | 33.13% | 66.87% |
| 2009-1 | 27.78% | 72.22% | 24.12% | 75.88% |
| 2010-2 | 30.77% | 69.23% | 19.05% | 80.95% |

After warnings

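| time period | retained: warned | retained: not warned | not retained: warned | not retained: not warned |
|---|---|---|---|---|
| 2004-2 | 2.82% | 97.18% | 9.60% | 90.40% |
| 2007-2 | 39.13% | 60.87% | 59.04% | 40.96% |
| 2009-1 | 27.78% | 72.22% | 59.41% | 40.59% |
| 2010-2 | 53.85% | 46.15% | 65.99% | 34.01% |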

| time period | warned: retained | warned: not retained | not warned: retained | not warned: not retained |
|---|---|---|---|---|
| 2004-2 | 14.29% | 85.71% | 37.91% | 62.09% |
| 2007-2 | 8.41% | 91.59% | 17.07% | 82.93% |
| 2009-1 | 4.72% | 95.28% | 15.85% | 84.15% |
| 2010-2 | 6.73% | 93.27% | 10.71% | 89.29% |
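The two warning tables condition in opposite directions: the first gives the share of warned users among retained and non-retained users, the second gives the share of retained users among warned and unwarned users. Bayes' rule connects them through the overall retention rate from the user retention table. The sketch below is a consistency check on the published percentages, not part of the original analysis; the function name and the dictionary layout are assumptions, and 2004-2 is omitted because the retention table reports no overall rate for that period.

```python
def p_retained_given_warned(p_warned_given_retained,
                            p_warned_given_not_retained,
                            p_retained):
    """Bayes' rule: P(retained | warned) from the conditional shares."""
    joint_retained = p_warned_given_retained * p_retained
    joint_not_retained = p_warned_given_not_retained * (1.0 - p_retained)
    return joint_retained / (joint_retained + joint_not_retained)


# Percentages copied from the tables above, written as proportions.
PERIODS = {
    "2007-2": {"retained": 0.1217, "warned_ret": 0.3913, "warned_not": 0.5904},
    "2009-1": {"retained": 0.0957, "warned_ret": 0.2778, "warned_not": 0.5941},
    "2010-2": {"retained": 0.0813, "warned_ret": 0.5385, "warned_not": 0.6599},
}

for label, row in PERIODS.items():
    p = p_retained_given_warned(row["warned_ret"], row["warned_not"], row["retained"])
    print(f"{label}: P(retained | warned) = {p:.2%}")
```

For 2007-2 this gives roughly 8.4%, in line with the 8.41% reported in the last table; the other two periods agree to within rounding as well.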


Future work

Next week, we will perform this analysis for all years within all time segments.