
Research:Disparities in Online Rule Enforcement

Created: 15:48, 16 June 2023 (UTC)
Duration: 2022-09 – 2024-10
Keywords: governance, gender, rule enforcement, disparate impacts, content gaps

This page documents a research project in progress.
Information may be incomplete and change as the project progresses.
Please contact the project lead before formally citing or reusing results from this page.


Because Wikipedia is so widely read and its text is used to train language models, understanding gaps and biases in its content is a pressing concern with deep social and technological implications. One critical area of content gaps, for which researchers and Wikipedians have long reported evidence, is gender gaps, and accounts strongly suggest that the rules and principles of Wikipedia play an important role in sustaining these gaps (e.g., rules applied at different rates and in different ways to content about men vs. women). However, a critical question remains: do the rules, when enforced, have disparate or incommensurate effects on the subsequent development of content across group identity categories? Disentangling the enforcement of a rule from its actual effect is critical to understanding the role of governance structures in patterns of marginalization online. That is, there is an important distinction between showing that rules are disparately enforced and showing that they are disparately consequential for the production of certain content compared to others (differential treatment versus differential effect).

The present study aims to evaluate disparate impacts in the enforcement of rules on subsequent contributions to content in online spaces. A rule with disparate impacts is one that “seems neutral but has a negative impact on a specific protected class of persons”, usually referring to an unintended discriminatory effect of a decision-making mechanism [Cornell Legal Information Institute]. A core goal of this work is to clarify the role of rule enforcement in content gaps so as to better inform practical strategies for closing content and gender gaps on Wikipedia.

Specifically, we use statistical methods to empirically assess the effects of decentralized rule enforcement on subsequent contributions to content in the peer production community of English Wikipedia, asking whether those effects differ across content in ways that indicate social bias. By focusing on the effects on subsequent edits to content, we aim to understand how rules shape the way contribution effort is allocated across content. In this sense, we are not focusing on the coverage, topical, linguistic, or contributor gaps conceptualized by prior work, but on an intermediate dynamic that interacts with and shapes those gaps.

Methods

We take a computational approach to quantitatively analyze disparities in online rule enforcement. In particular, we test for statistically significant differences, attributable to rule enforcement, in the content and contribution patterns of articles about individuals, which helps us surface social biases.
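To make this concrete, below is a minimal sketch of one such test, comparing per-article edit counts in a fixed window after a (matched) enforcement date. The data, variable names, and choice of test are illustrative assumptions, not the project's actual data or specification; edit counts are typically skewed, so a rank-based test such as Mann-Whitney U is a reasonable default.

    # Toy sketch: rank-based test for differences in post-enforcement edit counts.
    # All data and names here are illustrative, not the project's actual analysis.
    from scipy.stats import mannwhitneyu

    afd_edits = [3, 0, 7, 1, 2]        # articles that went through AfD (toy data)
    control_edits = [5, 9, 4, 12, 6]   # comparable articles that did not (toy data)

    stat, p_value = mannwhitneyu(afd_edits, control_edits, alternative="two-sided")
    print(f"U = {stat}, p = {p_value:.3f}")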

In the first study of this project, we evaluate the relationship between rule enforcement and the content and contribution outcomes of articles tracked by WikiProjects working to counter systemic bias around gender (e.g., WikiProject Women, WikiProject LGBT). We specifically consider the Articles for Deletion (AfD) process, treating AfD as a type of rule enforcement that very clearly mediates content on Wikipedia. Essentially, we compare the outcomes of articles that have been through AfD with those of articles that have not.
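As an illustration of how AfD status could be identified from public data, the following sketch checks whether an article has a corresponding AfD discussion page via the MediaWiki API. The subpage-existence heuristic, function name, and example title are assumptions for illustration; repeat nominations live at suffixed subpages (e.g., "(2nd nomination)") that this sketch does not handle.

    # Minimal sketch: detect a first AfD nomination by checking whether the
    # "Wikipedia:Articles for deletion/<title>" subpage exists.
    import requests

    API = "https://en.wikipedia.org/w/api.php"

    def has_afd_page(title: str) -> bool:
        """Return True if an AfD subpage exists for the given article title."""
        resp = requests.get(API, params={
            "action": "query",
            "titles": f"Wikipedia:Articles for deletion/{title}",
            "format": "json",
        })
        pages = resp.json()["query"]["pages"]
        # The API keys pages that do not exist with negative IDs (e.g., "-1").
        return all(not page_id.startswith("-") for page_id in pages)

    print(has_afd_page("Example"))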

We aim to expand the research design to more broadly evaluate differences in the effect of rule enforcement on content about individuals across different social categories (i.e., disparate impacts).
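One way such disparate impacts could be operationalized is with an interaction between the enforcement indicator and a social-category variable: a significant interaction coefficient would indicate that enforcement is disparately consequential across categories. The dataset, column names, and model form below are assumptions for illustration, not the project's committed specification.

    # Illustrative regression: does the effect of AfD on subsequent edit counts
    # differ across social categories? The dataframe and its columns are hypothetical.
    import pandas as pd
    import statsmodels.formula.api as smf

    articles = pd.read_csv("articles.csv")  # hypothetical per-article dataset

    # Negative binomial regression is a common choice for overdispersed counts;
    # interaction terms capture category-specific enforcement effects.
    model = smf.negativebinomial(
        "post_edits ~ went_through_afd * C(category) + pre_edits",
        data=articles,
    ).fit()
    print(model.summary())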

Timeline

We are currently actively working on this project. We anticipate completing data analysis in Spring 2024 and sharing findings of our work in Summer and Fall 2024.

Policy, Ethics and Human Subjects Research

As we rely on data shared in Wikimedia data dumps or available through the Wikimedia API, this project does not affect Wikipedians' work. If we determine that a follow-up study involving interaction with Wikipedians is necessary, we will submit an IRB application and update this section. In either case, we will share the results of our work with the community, and we are excited to be in conversation about how the findings can benefit the community.

Results

Forthcoming!

Resources

We will share findings of our work by Fall 2024, including via the CDSC blog, as well as through conference submissions, presentations and talks, and sharing online with the wiki community.
