Research talk:Exploring systematic bias in ORES/Work log/2019-05-19

Sunday, May 19, 2019

"Counterfactual unfairness" is reasonable notion of fairness that we can operationalize without ground-truth about "correct" decisions.[1] When applied to reversion on Wikipedia, this notion of fainess asks: "Would this edit have been reverted if the editor were not a newcomer?" This intuitive question can be formalized in terms of the probability an edit is reverted (), conditional on the class of editor () and other factors relevant to the edit, which are understood as appropriate and ethical reasons to revert an edit (). The difference in probability conditional on the class of editor measures counterfactual unfairness.

We are interested in how introducing ORES-powered RCFilters affects unfairness on Wikipedia, that is, in how this difference in probability changes once the filters are introduced.

We can estimate this difference using a logistic regression model for each wiki that predicts whether an edit was reverted as a function of the class of editor and the other factors relevant to the edit.

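The average causal effect on the probability that a newcomer is reverted is then the difference between the model's predicted probabilities for the two editor classes, averaged over edits.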
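The estimation step can be sketched on simulated data; this is a minimal illustration, not the analysis actually run. It fits a logistic regression of reversion on a newcomer indicator and a single stand-in covariate, then averages the difference in predicted probability between the two editor classes over all edits. The data-generating coefficients, the `quality` covariate, and the helper names are all assumptions made for illustration. The hand-rolled gradient ascent only keeps the example dependency-free; a statistics package would normally do the fitting.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def simulate_edits(n, seed=0):
    """Simulated edits: rows are [newcomer, quality], labels are 0/1 reverts.

    The coefficients below are invented for illustration only.
    """
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        newcomer = 1.0 if rng.random() < 0.3 else 0.0
        quality = rng.gauss(0.0, 1.0)
        p_revert = sigmoid(-1.0 + 1.2 * newcomer - 1.5 * quality)
        rows.append([newcomer, quality])
        labels.append(1 if rng.random() < p_revert else 0)
    return rows, labels


def predict(beta, x):
    """Predicted revert probability for one edit with features x."""
    return sigmoid(beta[0] + sum(b * f for b, f in zip(beta[1:], x)))


def fit_logistic(rows, labels, steps=300, lr=1.0):
    """Fit [intercept, b_newcomer, b_quality] by batch gradient ascent
    on the mean log-likelihood."""
    n = len(rows)
    beta = [0.0] * (len(rows[0]) + 1)
    for _ in range(steps):
        grad = [0.0] * len(beta)
        for x, y in zip(rows, labels):
            err = y - predict(beta, x)
            grad[0] += err
            for j, f in enumerate(x, start=1):
                grad[j] += err * f
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta


def average_effect(beta, rows):
    """Mean over edits of P(revert | newcomer) - P(revert | not newcomer),
    holding each edit's quality fixed at its observed value."""
    diffs = [predict(beta, [1.0, q]) - predict(beta, [0.0, q]) for _, q in rows]
    return sum(diffs) / len(diffs)


rows, labels = simulate_edits(2000)
beta = fit_logistic(rows, labels)
effect = average_effect(beta, rows)
print(f"newcomer coefficient: {beta[1]:.2f}")
print(f"average effect on revert probability: {effect:.3f}")
```

Averaging the per-edit differences, rather than reading the effect off the newcomer coefficient, keeps the result on the probability scale, which is the scale on which counterfactual unfairness is defined.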

We can fit this into our difference-in-differences (DID) framework by including treatment status, predictors of treatment, and inverse probability (IP) weighting in the model.
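A minimal sketch of how IP weighting and a difference-in-differences contrast fit together, again on simulated data. It deliberately swaps the regression model for IP-weighted cell means, so it is a simplification rather than the model described above. The `stratum` predictor of treatment, the field names, and all simulated rates are hypothetical.

```python
import random
from collections import defaultdict


def simulate(n, seed=0):
    """Simulated edits; every rate here is invented for illustration."""
    rng = random.Random(seed)
    records = []
    for _ in range(n):
        stratum = rng.randrange(2)  # hypothetical predictor of treatment
        treated = int(rng.random() < (0.7 if stratum else 0.3))
        post = rng.randrange(2)     # 1 if the edit falls after the rollout date
        newcomer = int(rng.random() < 0.3)
        p = 0.10 + 0.05 * stratum + 0.15 * newcomer + 0.10 * newcomer * treated * post
        records.append({"stratum": stratum, "treated": treated, "post": post,
                        "newcomer": newcomer, "reverted": int(rng.random() < p)})
    return records


def ip_weights(records):
    """Unstabilized IP weights 1 / P(treated = a | stratum), with the
    probability estimated as the observed share within each stratum."""
    counts = defaultdict(lambda: [0, 0])  # stratum -> [untreated, treated]
    for r in records:
        counts[r["stratum"]][r["treated"]] += 1
    weights = []
    for r in records:
        untreated, treated = counts[r["stratum"]]
        share = (treated if r["treated"] else untreated) / (untreated + treated)
        weights.append(1.0 / share)
    return weights


def weighted_rate(records, weights, **cell):
    """IP-weighted revert rate among records matching the given cell."""
    num = den = 0.0
    for r, w in zip(records, weights):
        if all(r[k] == v for k, v in cell.items()):
            num += w * r["reverted"]
            den += w
    return num / den


def unfairness(records, weights, treated, post):
    """Newcomer minus non-newcomer weighted revert rate in one cell."""
    return (weighted_rate(records, weights, treated=treated, post=post, newcomer=1)
            - weighted_rate(records, weights, treated=treated, post=post, newcomer=0))


def did_unfairness(records, weights):
    """Change in unfairness on treated wikis minus the change on untreated ones."""
    return ((unfairness(records, weights, 1, 1) - unfairness(records, weights, 1, 0))
            - (unfairness(records, weights, 0, 1) - unfairness(records, weights, 0, 0)))


records = simulate(60000)
weights = ip_weights(records)
estimate = did_unfairness(records, weights)
print(f"IP-weighted DID estimate of the change in unfairness: {estimate:.3f}")
```

The simulation builds in an extra 0.10 revert probability for newcomers on treated wikis after rollout, so the estimate should land near that value up to sampling noise. In a regression version, the same weights would be passed to the fitting routine and the contrast would come from interaction terms.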

  1. Kusner, Matt J; Loftus, Joshua; Russell, Chris; Silva, Ricardo (2017). "Counterfactual Fairness" (PDF). In I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, R. Garnett (eds.). Advances in Neural Information Processing Systems 30. Curran Associates, Inc. pp. 4066–4076. Retrieved 2019-04-23.