Research talk:Automated classification of edit quality

From Meta, a Wikimedia project coordination wiki

Discussions of disparate impact on anonymous and new editors

The damage detection models that ORES supports seem to be overly skeptical of edits by anonymous editors and newcomers. I'm posting some notes here that link to my thoughts on this problem and on what our next steps should be.

Blog: http://socio-technologist.blogspot.com/2015/12/disparate-impact-of-damage-detection-on.html

Plan:

  1. Remove user.age and user.is_anon and generate new models. Done (see Phab:T120138).
  2. Tune models (both with and without user.age and user.is_anon) and re-test against reported false positives.
  3. Based on results, propose next steps to users.

I'll be posting some work logs here with details about my work on this. --Halfak (WMF) (talk) 19:56, 10 December 2015 (UTC)
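
For readers following along, step 2 of the plan (re-testing against reported false positives) could be approximated by comparing false-positive rates between editor groups. The following is only a minimal sketch, not the method used for ORES: the function name, the record layout `(group, score, is_damaging)`, the 0.5 threshold, and the sample scores are all made up for illustration.

```python
from collections import defaultdict


def false_positive_rate_by_group(records, threshold=0.5):
    """Return {group: false-positive rate} for non-damaging edits.

    `records` is an iterable of (group, score, is_damaging) tuples.
    This layout is a hypothetical one, not the ORES output format.
    """
    false_positives = defaultdict(int)
    good_edits = defaultdict(int)
    for group, score, is_damaging in records:
        if is_damaging:
            continue  # only good edits can become false positives
        good_edits[group] += 1
        if score >= threshold:
            false_positives[group] += 1
    return {g: false_positives[g] / good_edits[g] for g in good_edits}


# Made-up scores, for illustration only.
sample = [
    ("anon", 0.9, False),
    ("anon", 0.2, False),
    ("anon", 0.8, True),
    ("registered", 0.1, False),
    ("registered", 0.3, False),
    ("registered", 0.7, True),
]
rates = false_positive_rate_by_group(sample)
print(rates)
```

Running the same comparison on scores from models trained with and without user.age and user.is_anon would show whether the gap between groups narrows, alongside the overall fitness (AUC) figures.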

Any news on this, Halfak (WMF)? You manually fixed the enwiki and wikidata models? ("I ran a test with the new term frequency features and was able to bring the AUC of enwiki damage detection back up to .88 AUC. More testing is needed and other wikis, of course. (...) It looks like I can drop user.is_anon and user.age from the wikidatawiki models and maintain 0.95 AUC" T120138) But for the rest of the wikis, we're left with an "objective revision evaluation service" that is biased against newbies? --Atlasowa (talk) 06:45, 7 February 2016 (UTC)
Hey Atlasowa! Good Q and I'm glad you're interested in pushing this forward. I haven't been able to focus on this project to the exclusion of others since I made that last update re. removing the user.age and user.is_anon features. I've managed to get all of the models updated and ready for deployment, but all of the deployed models still use user.age and user.is_anon. The problem is that I need to do some analysis to find out whether removing the problematic features did anything helpful or just made the classifier slightly less fit. That will take about 16-32 work hours to iterate on my thoughts around the methodology. In the meantime, I don't want to blindly deploy the new classifiers just because I *think* they'll have less of a negative impact. FWIW, I'm checking how this changes the fitness of *all of the models* -- not just enwiki and wikidatawiki. I only reported those two because they are the wikis where it is easiest for me to perform qualitative analyses (i.e. using my eyeballs to make sure the algorithm works). I'm monitoring fitness statistics for the others and I've seen similar, minor losses in fitness given the new term frequency features.
I'm currently working on a paper about the Wikidata models because we're pushing on a deadline there. See Research:Building automated vandalism detection tool for Wikidata. Once I have finished the analysis there, I'll continue my work on this study of disparate impact. Sorry for the delay. Wish I had a cloning machine.  :/ --Halfak (WMF) (talk) 18:13, 7 February 2016 (UTC)
Hey Atlasowa. We just did a deployment (Apr. 7th -- see Talk:Objective_Revision_Evaluation_Service#Updates to ORES service & BREAKING CHANGE on April 7th) that put new models in place that should minimize this impact. I'm working on a report discussing the difference with the new models now. I'll ping again when my notes are ready. --Halfak (WMF) (talk) 20:27, 14 April 2016 (UTC)
See my report: Research_talk:Automated_classification_of_edit_quality/Work_log/2016-04-14. I'll be on vacation for about a week, but I'll extend this analysis as soon as I get back. I'm hoping to write a paper about this work for CSCW 2017 (deadline is the end of May). I'll be drafting it on-wiki and will post a link once I get started on it. --Halfak (WMF) (talk) 21:18, 14 April 2016 (UTC)