Research:Disparities in Online Rule Enforcement

From Meta, a Wikimedia project coordination wiki
Created: 15:48, 16 June 2023 (UTC)
Collaborators: Manoel Horta Ribeiro
Duration: 2023-09 – 2025-12
Keywords: governance, gender, rule enforcement, disparate impacts, content gaps

This page documents a research project in progress.
Information may be incomplete and change as the project progresses.
Please contact the project lead before formally citing or reusing results from this page.


Because Wikipedia is read so widely and its text is used to train language models, understanding the mechanisms driving contributor and content gaps is a critical concern with deep social and technological implications. Accounts from Wikipedians and academic researchers have long suggested, particularly with qualitative evidence, that Wikipedia's governance structure may play a role in the persistence of these gaps. For example, contestation over the Notability rule has been central to discussions of gender gaps (Adams et al., 2019). Likewise, scholars note that Wikipedia's rules embed a specific epistemology of objectivity and neutrality that necessarily excludes other ways of knowing and storing knowledge (Menking & Rosenberg, 2020).

The present study focuses on the impact of rules. Empirical work has shown that rules are disproportionately applied to certain content (Tripodi, 2021) and has documented recent successes in countering gaps associated with these practices (Langrock et al., 2022). Where prior work focuses on the fact of enforcement, a key open question is what effect enforcement actually has on how an article subsequently develops. In other words, there is an important distinction between showing that rules are disparately enforced and showing that they are disparately consequential for the production of some content compared to others (differential treatment versus differential effect). A core goal of this work is to clarify the role of rule enforcement in content gaps, to better inform practical strategies for closing them on Wikipedia.

Methods

We use statistical methods to assess the effects of rule enforcement on subsequent contributions to content on English Wikipedia, asking whether effects differ across categories of content in ways that indicate social bias. In particular, we will test for statistically significant differences, attributable to rule enforcement, in content and contribution patterns on articles. By focusing on the effects on subsequent edits, our goal is to understand how rules shape the allocation of contribution effort across content. In this sense, we are not focusing on the coverage, topical, linguistic, or contributor gaps conceptualized by prior work, but on a key intermediate dynamic that interacts with and shapes those gaps.
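As an illustration of the kind of test involved, the sketch below compares edit counts across two groups of articles with a permutation test. The grouping, the edit counts, and the variable names are invented for illustration; this is a minimal sketch of one possible test, not the project's actual analysis pipeline.

```python
import random

def permutation_test(a, b, n_iter=10_000, seed=0):
    """Two-sided permutation test for a difference in mean edit counts.

    a, b: lists of per-article edit counts (e.g., in some window after an
    enforcement event). Returns the permutation p-value for the observed
    difference in group means.
    """
    rng = random.Random(seed)
    observed = abs(sum(a) / len(a) - sum(b) / len(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_iter):
        rng.shuffle(pooled)
        pa, pb = pooled[:len(a)], pooled[len(a):]
        diff = abs(sum(pa) / len(pa) - sum(pb) / len(pb))
        if diff >= observed:
            hits += 1
    return hits / n_iter

# Invented example data: post-enforcement weekly edit counts for two
# hypothetical groups of articles (not project results).
group_x = [4, 7, 2, 9, 5, 3, 8, 6]
group_y = [1, 2, 0, 3, 2, 1, 4, 2]
p_value = permutation_test(group_x, group_y)
```

A permutation test is used here only because it makes no distributional assumptions about edit counts; any standard two-sample test could play the same illustrative role.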

We take a computational approach, leveraging the digital trace data made available by the Wikipedia API endpoints. Our analyses are scoped as a series of more defined case studies of specific rule enforcement mechanisms such as Articles for Deletion.
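For instance, digital trace data such as an article's revision history can be retrieved from the MediaWiki Action API. The sketch below builds the query parameters for such a request and parses a response shaped like the API's JSON output; the article title and the sample response are placeholders, and no network request is actually sent.

```python
# Minimal sketch of querying an article's revision history via the
# MediaWiki Action API (https://en.wikipedia.org/w/api.php).

API_ENDPOINT = "https://en.wikipedia.org/w/api.php"

def revision_query_params(title, limit=50):
    """Parameters for fetching recent revisions (timestamp, user, comment)."""
    return {
        "action": "query",
        "prop": "revisions",
        "titles": title,
        "rvprop": "timestamp|user|comment",
        "rvlimit": limit,
        "format": "json",
    }

def extract_revisions(response):
    """Flatten the API's pages -> revisions structure into a single list."""
    pages = response.get("query", {}).get("pages", {})
    revisions = []
    for page in pages.values():
        revisions.extend(page.get("revisions", []))
    return revisions

# Invented example response, shaped like the API's JSON output.
sample_response = {
    "query": {
        "pages": {
            "12345": {
                "title": "Example article",
                "revisions": [
                    {"timestamp": "2024-01-02T03:04:05Z",
                     "user": "ExampleUser",
                     "comment": "copyedit"},
                ],
            }
        }
    }
}
revs = extract_revisions(sample_response)
```

In practice the same parameters would be sent with an HTTP GET to the endpoint above (with continuation handling for long histories); bulk analyses would instead use the Wikimedia data dumps.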

Timeline

We are actively working on this project and are currently in the data-cleaning phase (as of August 2025).

Policy, Ethics and Human Subjects Research

As we rely on data shared in Wikimedia data dumps or available through the Wikimedia API, this project does not impact Wikipedians' work. If we determine that a follow-up study involving interaction with Wikipedians is necessary, we will submit an IRB protocol and update this section. In either case, we will share the results of our work with the community, and we are excited to be in conversation about how the findings can benefit the community. In particular, we are interested in later developing tools that might help Wikipedians audit governance practices. One of the investigators (Sohyeon) is involved in a dataset project that could be core to such a tool.

Results

Forthcoming!

Resources

Replication code and notebooks will be released once they are available.

References
