Research:Blocks on the English-language Wikipedia

From Meta, a Wikimedia project coordination wiki
This page documents a completed research project.


Key Personnel

Project Summary

A large body of research has explored how Wikipedia defends itself against ill-intentioned users. Most of it has focused on vandalism (bad-faith edits made by those users) and on the most direct consequence of those edits, reversion. The objective of this research project is to explore the ultimate consequence of poor intentions, namely blocking. After investigating trends in the overall block rate, we take the block logs from 2006 to the present and use a series of regular expressions to sort each block into one of six categories:

  1. Spam: blocks for using Wikipedia for advertising purposes;
  2. Disruption: blocks for BLP violations, defamation, personal attacks, threats (legal or otherwise), copyright violations, edit warring and POV-pushing, broadly construed;
  3. Sockpuppetry: the use of multiple accounts in violation of Wikipedia's policies, or long-term, multiple-account abuse of Wikipedia;
  4. Username blocks: blocks for violating Wikipedia's username policies;
  5. Proxy usage: the blocking of proxies;
  6. Misc: blocks for reasons not identified by the regular expressions.
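The categorisation step above can be sketched roughly as follows. This is a minimal illustration only: the project's actual regular expressions are not reproduced on this page, so every pattern below, the `CATEGORY_PATTERNS` name, the `categorise` function and the first-match-wins ordering are assumptions made for the example.

```python
import re

# Hypothetical patterns, NOT the expressions used by the project.
# Order matters: the first pattern that matches decides the category.
CATEGORY_PATTERNS = [
    ("proxy", re.compile(r"prox(y|ies)", re.I)),
    ("sockpuppetry", re.compile(r"sock|multiple accounts|long[- ]term abuse", re.I)),
    ("username", re.compile(r"user ?name", re.I)),
    ("spam", re.compile(r"spam|advertis|promotion", re.I)),
    ("disruption", re.compile(
        r"\bblp\b|defam|personal attack|threat|copyright|copyvio"
        r"|edit[- ]?warr|\bpov\b|disrupt",
        re.I,
    )),
]


def categorise(reason: str) -> str:
    """Return the first matching category for a block reason, else 'misc'."""
    for name, pattern in CATEGORY_PATTERNS:
        if pattern.search(reason):
            return name
    return "misc"


if __name__ == "__main__":
    for reason in ["{{blocked proxy}}", "Abusing multiple accounts", "Edit warring"]:
        print(reason, "->", categorise(reason))
```

A block reason that mentions several offences is assigned to whichever pattern appears first in the list, so the ordering is itself a design decision that would need to be checked against a hand-coded sample.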

The resulting data is then examined and compared against potential confounds of the block rate (such as registration rates or AbuseFilter hits) in an attempt to answer three core research questions:

  1. Has there been any noticeable shift in the types of actions that users are blocked for?
  2. Has there been any noticeable shift in the overall block rate?
  3. If either is true: why?
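One simple way to set the block rate against a potential confound such as the registration rate is to normalise monthly block counts by monthly registrations. The sketch below is an illustration under stated assumptions, not the analysis the project performed: the function names are hypothetical, and timestamps are assumed to be 14-digit `YYYYMMDDHHMMSS` strings.

```python
from collections import Counter


def monthly_counts(timestamps):
    """Count events per month, keyed by the 'YYYYMM' prefix of each timestamp."""
    return Counter(ts[:6] for ts in timestamps)


def blocks_per_thousand_registrations(block_timestamps, registration_timestamps):
    """Blocks per 1,000 new registrations for each month that has registrations."""
    blocks = monthly_counts(block_timestamps)
    registrations = monthly_counts(registration_timestamps)
    return {
        month: 1000 * blocks[month] / registrations[month]
        for month in sorted(blocks)
        if registrations[month] > 0
    }
```

If the normalised rate stays flat while the raw block count falls, the fall is at least consistent with fewer new accounts arriving rather than with a change in blocking behaviour.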

Results

Shifts in the rate and type of user blocks

The first task is to investigate whether there have been any shifts in the rate and type of user blocks. Recognising that the actions of one group inevitably affect the other, the dataset was split into two groups prior to analysis: one consisting of blocks of anonymous users, and one consisting of blocks of registered users. In both cases, data was gathered primarily from the logging table, and consists of all block actions between January 2006 and September 2013, excluding unblocks and modifications of existing blocks.
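The filtering and splitting described above might look something like the following sketch. It assumes the rows have already been exported from the logging table as dictionaries with the fields `log_type`, `log_action`, `log_timestamp` (a 14-digit string) and `log_title`, that new blocks carry the `log_action` value `block` (as opposed to `unblock` or `reblock`), and that a block of an anonymous user can be recognised by its target being an IP address or range. The function names are hypothetical.

```python
import ipaddress


def is_ip_target(title: str) -> bool:
    """True if the block target parses as an IPv4/IPv6 address or CIDR range."""
    try:
        ipaddress.ip_network(title, strict=False)
        return True
    except ValueError:
        return False


def split_blocks(rows, start="20060101000000", end="20130930235959"):
    """Keep new blocks inside the study window; split anonymous vs registered."""
    anonymous, registered = [], []
    for row in rows:
        # Drop unblocks, modifications of existing blocks, and other log types.
        if row["log_type"] != "block" or row["log_action"] != "block":
            continue
        # Fixed-width timestamp strings compare correctly as plain strings.
        if not (start <= row["log_timestamp"] <= end):
            continue
        if is_ip_target(row["log_title"]):
            anonymous.append(row)
        else:
            registered.append(row)
    return anonymous, registered
```

Treating every IP-shaped target as an anonymous user is a simplification; range blocks, for instance, can also affect registered users editing from those addresses.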

Overall shifts

Proportionate shifts

Exploring declines

Sudden decline (2008-2009)

Constant decline (2009-2013)

References

External links

Conclusion