Research talk:Exploring systematic bias in ORES/Work log/2019-05-03


Friday, May 3, 2019

Investigating encoded bias against content

This week I developed an approach based on Wikidata for identifying edits to articles by the gender or location of the subject.

In contrast to last week's striking findings on bias against newcomer and anonymous editors, I don't see any clear evidence of bias along the content lines I considered. This is somewhat surprising given how easily machine learning classifiers can pick up on biases in human labels. On the other hand, while it's fairly obvious to anyone with experience with Wikipedia that newcomers and anonymous editors are more likely to make damaging edits than experienced registered editors, I wouldn't expect this to be nearly as true when it comes to articles on women or on places in the global south. Even if many Wikipedians or page patrollers have biases against these categories of content, those biases may not manifest in judgements of specific edits, and even if they do, if the particular Wikipedians who provided the labels used to train ORES were not so biased, then we would not expect ORES to learn such biases.

In the rest of this page I'll document the strategy I used to approximately identify the gender and geographic region of article subjects, present plots that show no evidence of bias along these lines, and reflect on the implications of this finding for the latter stages of the project.

Using Wikidata on the analytics cluster

The first step of this process was to find Wikidata entities for the articles edited in the ORES labels dataset.

There isn't any official mapping between Wikipedia articles and Wikidata in the analytics cluster (see the phabricator task I opened), but User:JAllemandou_(WMF) gave me some great help by sharing the tables he made, which made it possible to find Wikidata properties for 133,313 of the 147,187 article edits (90%) that were labeled by humans in training ORES models.
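For reference, the sketch below shows roughly what this kind of join can look like on the analytics cluster; the table and column names (ores_labeled_edits, wikidata_item_page_link, page_id, wiki_db, item_id) are assumptions for illustration, not the actual tables JAllemandou shared.

<syntaxhighlight lang="python">
# Minimal sketch: attach Wikidata item IDs to the human-labeled edits via a
# page-to-item link table. Table and column names are assumed, not real.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("ores-bias-wikidata-join").getOrCreate()

labeled_edits = spark.table("my_db.ores_labeled_edits")        # hypothetical
item_page_link = spark.table("my_db.wikidata_item_page_link")  # hypothetical

edits_with_items = (
    labeled_edits.join(
        item_page_link,
        on=(labeled_edits.page_id == item_page_link.page_id)
        & (labeled_edits.wiki_db == item_page_link.wiki_db),
        how="left",
    )
    .select(labeled_edits["*"], item_page_link.item_id)
)

# Fraction of labeled edits that matched a Wikidata item (coverage).
coverage = (
    edits_with_items.filter("item_id IS NOT NULL").count()
    / edits_with_items.count()
)
</syntaxhighlight>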

To identify articles on women, I used the Wikidata properties for sex or gender and instance of. I restricted this labeling to articles whose subject is an instance of human and labeled them as being about men or women according to the sex or gender value. I found one instance of sex or gender having the value "Lesbo," which was vandalism that has since been corrected, and I mapped the "transgender male" and "transgender female" values to "male" and "female" respectively. I labeled articles on humans that didn't have a sex or gender value as "unknown."
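As a concrete illustration of this rule, here is a minimal sketch using the standard Wikidata identifiers (P31 = instance of, P21 = sex or gender, Q5 = human, and the item IDs for the gender values); the claim-parsing helper assumes entities in the usual Wikidata JSON format and is not the actual pipeline code.

<syntaxhighlight lang="python">
HUMAN = "Q5"
GENDER_MAP = {
    "Q6581097": "male",
    "Q6581072": "female",
    "Q2449503": "male",    # transgender male -> male
    "Q1052281": "female",  # transgender female -> female
}

def claim_item_ids(entity, prop):
    """Return the item IDs asserted for a property in a parsed Wikidata entity."""
    ids = []
    for claim in entity.get("claims", {}).get(prop, []):
        datavalue = claim.get("mainsnak", {}).get("datavalue", {})
        if datavalue.get("type") == "wikibase-entityid":
            ids.append(datavalue["value"]["id"])
    return ids

def gender_label(entity):
    """Label an article subject as male / female / unknown, or None if not human."""
    if HUMAN not in claim_item_ids(entity, "P31"):  # instance of
        return None
    genders = [GENDER_MAP[v] for v in claim_item_ids(entity, "P21")  # sex or gender
               if v in GENDER_MAP]
    return genders[0] if genders else "unknown"
</syntaxhighlight>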


To identify articles on places in the global north or global south, I used the coordinate location property, which provides latitudes and longitudes. Andrew Hall gave me the good tip that not all entities with this property are on Earth, so I restricted this analysis to ones that are (we aren't interested in bias against the Martian global south :)). I used the reverse_geocode Python package to map coordinates to countries and the canonical_data.countries table to map from countries to Global North / Global South economic regions.
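A minimal sketch of that coordinate-based labeling is below; P625 is Wikidata's coordinate location property and Q2 is Earth, but the Global North / Global South lookup is only a stand-in for the join against canonical_data.countries.

<syntaxhighlight lang="python">
import reverse_geocode

EARTH = "http://www.wikidata.org/entity/Q2"

# Stand-in for the canonical_data.countries mapping, keyed by ISO country code.
NORTH_SOUTH = {"NO": "Global North", "KE": "Global South"}  # illustrative only

def earth_coordinates(entity):
    """Yield (lat, lon) pairs from P625 claims whose globe is Earth."""
    for claim in entity.get("claims", {}).get("P625", []):
        value = claim.get("mainsnak", {}).get("datavalue", {}).get("value", {})
        if value.get("globe") == EARTH:
            yield (value["latitude"], value["longitude"])

def economic_region(entity):
    """Map an entity's first terrestrial coordinate to a Global North/South label."""
    coords = list(earth_coordinates(entity))
    if not coords:
        return None
    country_code = reverse_geocode.get(coords[0])["country_code"]
    return NORTH_SOUTH.get(country_code, "unknown")
</syntaxhighlight>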

Results

Gender

The following slideshow presents calibration and balance plots similar to those I made about bias against newcomer and anonymous editors, which I interpret as evidence against systematic encoded bias against edits to articles on women.
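For reference, calibration curves like the ones in these plots can be produced with something like the sketch below, grouping edits by the gender of the article subject (or, for the next section, by region); this illustrates the approach rather than reproducing the exact plotting code behind the slideshow.

<syntaxhighlight lang="python">
import numpy as np
import matplotlib.pyplot as plt

def plot_calibration(scores, labels, groups, n_bins=10):
    """Plot observed damaging rate vs. mean ORES damaging score, per group."""
    scores, labels, groups = map(np.asarray, (scores, labels, groups))
    bins = np.linspace(0, 1, n_bins + 1)
    for group in np.unique(groups):
        s, y = scores[groups == group], labels[groups == group]
        binned = np.clip(np.digitize(s, bins) - 1, 0, n_bins - 1)
        xs = [s[binned == b].mean() for b in range(n_bins) if (binned == b).any()]
        ys = [y[binned == b].mean() for b in range(n_bins) if (binned == b).any()]
        plt.plot(xs, ys, marker="o", label=str(group))
    plt.plot([0, 1], [0, 1], linestyle="--", color="grey")  # perfect calibration
    plt.xlabel("mean ORES damaging score")
    plt.ylabel("observed damaging rate")
    plt.legend(title="article subject")
    plt.show()
</syntaxhighlight>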

Global geographic region

Similarly, I interpret the plots below as evidence against systematic encoded bias against edits to articles on places in the global south.

Discussion

The broader goal of this project is not to characterize the bias encoded into ORES, but to understand how ORES changes quality control processes on Wikipedia and how these changes might relate to algorithmic biases. Below is a conceptual diagram that I use to think about what might happen:

Conceptual diagram of how (in theory) introducing an algorithmic scoring system into a rational-bureaucratic sociotechnical system can change bias in the system. This diagram describes the theory in concrete terms as instantiated in ORES and Wikipedia.

I observed considerable bias against newcomer and anonymous editors encoded into ORES, but I still think that bringing new informative signals to watchlists and to en:Special:RecentChanges could plausibly expand participation in monitoring in ways that reduce systematically unfair reversions of the contributions of such editors. That isn't to say that algorithmic bias won't matter. In fact I think it will, so much so that there's quite a lot of room to be skeptical of my prediction for hypothesis 2, simply because the arrow for hypothesis 4 might be stronger than the arrow for hypothesis 2.

But since we didn't find encoded bias against edits to articles on women or on places in the global south, I think we have stronger reason to believe H2 for these classes of edits and to believe H4 in the case of newcomers and anons. Finding H2 for gender and geographic region but H4 for newcomers and anons would be consistent with algorithmic bias having strong effects on systemic bias in monitoring even as the algorithm decreases systemic bias in other areas by improving the pool of monitors.

Using Wikidata to classify the subject of articles

I was able to find 90% coverage in Wikidata for this sample of edits. This seems pretty decent to me, but it also leaves a lot of room for improvement. I didn't systematically check the accuracy of these labels, but for now I feel okay assuming that it is not bad.

Going forward I'm looking for avenues for improving this approach, perhaps by using humans (myself, crowdworkers, or maybe Wikipedians) to check Wikidata by labeling a sample of articles into these categories. At the very least, it would be great to quantify the precision and recall of Wikidata for these categories of articles, which would allow us to understand whether labeling error could be driving the results.
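If that hand-labeling happens, quantifying agreement could be as simple as the sketch below; the two label lists here are placeholders rather than real data.

<syntaxhighlight lang="python">
from sklearn.metrics import precision_recall_fscore_support

# Placeholder labels for the same sample of articles -- not real data.
hand_labels     = ["female", "male", "female", "unknown", "male"]
wikidata_labels = ["female", "male", "unknown", "unknown", "male"]

precision, recall, _, _ = precision_recall_fscore_support(
    hand_labels, wikidata_labels, labels=["female"], average="micro"
)
print(f"Wikidata 'female' label: precision={precision:.2f}, recall={recall:.2f}")
</syntaxhighlight>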