Research:Wikihounding and Machine Learning Analysis

From Meta, a Wikimedia project coordination wiki
00:19, 18 November 2017 (UTC)
Duration: November 2017 – February 2018

This page documents a planned research project.
Information may be incomplete and change before the project starts.

Participants from the Anti-Harassment Tools team and the Research Team at the Wikimedia Foundation are exploring the creation of a machine learning model of the harassment phenomenon called wikihounding.

We are focusing on English Wikipedia AN/I cases labeled 'wikihounding' to create a labeled training dataset for this model, based on community-determined instances of wikihounding. The AN/I archive is not a well-structured dataset, but it is open and accessible for qualitative analysis.

Wikihounding has both qualitative and quantitative aspects. Every interaction inside a digital space has a quantitative aspect: every comment, revert, etc. is a data point. By analyzing data points comparatively across wikihounding cases and reading a sample of the cases, we can create a baseline for the actual overlapping similarities among wikihounding cases.

Wikihounding currently has a fairly loose definition. As defined by the Harassment policy on en:wp, it is:

“the singling out of one or more editors, joining discussions on multiple pages or topics they may edit or multiple debates where they contribute, to repeatedly confront or inhibit their work. This is with an apparent aim of creating irritation, annoyance or distress to the other editor. Wikihounding usually involves following the target from place to place on Wikipedia.”

This definition doesn't set parameters for cases, such as frequency of interaction, duration, or a minimum number of reverts. Our research should yield such parameters: what is considered a 'normal' number of reverts in a wikihounding case, how long wikihounding cases generally last, etc. There are currently no established baselines or parameters for wikihounding; this research project will help establish them while also building a machine learning model to further study wikihounding on Wikipedia.
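Once cases are labeled, establishing those baselines amounts to computing simple summary statistics over the labeled set. A minimal sketch in Python, assuming a hypothetical per-case record with illustrative fields (`duration_days`, `reverts`); the field names and values are invented for illustration, not the actual AN/I data schema:

```python
from statistics import median

# Hypothetical labeled wikihounding cases; the schema and numbers are
# illustrative placeholders, not real AN/I data.
cases = [
    {"accuser": "A", "accused": "B", "duration_days": 14, "reverts": 9},
    {"accuser": "C", "accused": "D", "duration_days": 30, "reverts": 4},
    {"accuser": "E", "accused": "F", "duration_days": 21, "reverts": 12},
]

def baseline(cases, field):
    """Median of one numeric field across labeled cases."""
    return median(c[field] for c in cases)

duration_baseline = baseline(cases, "duration_days")  # median case duration
revert_baseline = baseline(cases, "reverts")          # median revert count
```

Medians are used here rather than means because AN/I cases are likely to have a long tail of extreme durations and revert counts.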


Wikihounding can look like a lot of things, such as various forms of edit warring. There are many cases that would technically be called hounding but occurred within the larger context of another case, e.g. a user being a sock puppet or engaging in more rampant harassment or other bad behavior. Ultimately those cases are labeled as something else, not hounding. Part of this work is differentiating what is hounding from what is not.

We plan to start by looking at wikihounding cases labeled and archived at AN/I.

  1. Examine a small number of reported instances of wikihounding to identify similarities and patterns, and validate our existing assumptions about the nature of wikihounding.
  2. Label a larger number of AN/I cases for accuser, accused, date of report, and outcome of AN/I case.
  3. Generate descriptive statistics about the identified instances of hounding (e.g. average duration of interaction and number of reverts).
  4. Train a learning model on the interactions (co-located edits, reverts, and talkpage posts) between the accusing and accused editors in the weeks prior to the date that hounding was reported.
  5. Run the model over other sets of editor interactions that look superficially similar to the training set, and manually audit the resulting cases that the model determines are likely to be instances of wikihounding. Are there patterns? If so, what are they? Are there false positives?
  6. Examine time, frequency, and location of interactions.
    • then: contextual aspects of hounding, such as incivility, antagonizing content, and toxicity
    • labeling these terms (incivility, antagonizing content, toxicity) will need labels and examples
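Step 4 above requires turning raw interaction histories into per-pair features over a time window before the AN/I report. A minimal sketch, assuming a hypothetical event record format (`ts`, `kind`, `page`); the event kinds mirror the interaction types named in the plan (co-located edits, reverts, talk page posts), but the schema itself is invented for illustration:

```python
from collections import Counter
from datetime import datetime, timedelta

# Hypothetical interaction events between one accuser/accused pair; the
# schema is illustrative, not the actual edit-history or AN/I format.
events = [
    {"ts": "2017-10-01", "kind": "colocated_edit", "page": "Foo"},
    {"ts": "2017-10-03", "kind": "revert",         "page": "Foo"},
    {"ts": "2017-10-05", "kind": "talkpage_post",  "page": "Talk:Foo"},
    {"ts": "2017-10-20", "kind": "colocated_edit", "page": "Bar"},
]

def featurize(events, report_date, weeks=4):
    """Count interaction types in the window before the AN/I report date."""
    cutoff = report_date - timedelta(weeks=weeks)
    window = [e for e in events
              if cutoff <= datetime.strptime(e["ts"], "%Y-%m-%d") < report_date]
    kinds = Counter(e["kind"] for e in window)
    return {
        "colocated_edits": kinds["colocated_edit"],
        "reverts": kinds["revert"],
        "talkpage_posts": kinds["talkpage_post"],
        "distinct_pages": len({e["page"] for e in window}),
    }

features = featurize(events, datetime(2017, 10, 22))
```

One feature vector per accuser/accused pair, computed this way for both labeled hounding cases and superficially similar non-cases, would form the training input for the model in step 4.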

Open questions and concerns: What about recall? What is adjacent to wikihounding? Are there behaviors that are similar to wikihounding but not quite hounding? What are the false positives and false observations (i.e. reverse wikihounding)?


To be determined.


With the results from the above analysis, we would then run the model over various cases and ask the community for feedback in labeling the resulting cases, to keep training the model.
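The triage step in this feedback loop could be as simple as thresholding model scores and surfacing the highest-scoring candidate pairs for community labeling. A minimal sketch, assuming hypothetical model scores in the range 0 to 1; both the scores and the cutoff value are invented placeholders, and in practice the threshold would be tuned against the manual audits described above:

```python
# Hypothetical model scores for candidate editor pairs; in practice these
# would come from the trained model described in the plan.
scores = {("A", "B"): 0.91, ("C", "D"): 0.12, ("E", "F"): 0.77}

REVIEW_THRESHOLD = 0.75  # illustrative cutoff, to be tuned against audits

def flag_for_review(scores, threshold=REVIEW_THRESHOLD):
    """Return candidate pairs at or above the threshold, highest score first."""
    return sorted((pair for pair, s in scores.items() if s >= threshold),
                  key=lambda pair: -scores[pair])

review_queue = flag_for_review(scores)
```

Ordering the queue by score puts the cases the model is most confident about in front of community reviewers first, which should make their labels most informative for retraining.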

See also