Research talk:Automated classification of edit quality/Work log/2016-04-14


Thursday, April 14, 2016

Today, I am looking at the bias baked into the old LinearSVC models that appears to be minimized by the new GradientBoosting models.

So, the basis of this bias is user characteristics that may be correlated with damage but otherwise have nothing to do with the quality of an edit. We want to see where the bias is most prevalent and what effect the new gradient boosting model has had.

So, in order to explore this, I'm going to take the test data that I have labeled via Wiki labels and experiment with switching out user-related features. I've built up four canonical users:

Anonymous user.
The system generates a zero for seconds since registration.
anon_cache = \
	'{"feature.revision.user.is_anon": true, \
	  "feature.temporal.revision.user.seconds_since_registration": 0, \
	  "feature.revision.user.has_advanced_rights": false, \
	  "feature.revision.user.is_admin": false, \
	  "feature.revision.user.is_bot": false, \
	  "feature.revision.user.is_curator": false}'
Newcomer
~5 hours after registration (18000 seconds)
newcomer_cache = \
	'{"feature.revision.user.is_anon": false, \
	  "feature.temporal.revision.user.seconds_since_registration": 18000, \
	  "feature.revision.user.has_advanced_rights": false, \
	  "feature.revision.user.is_admin": false, \
	  "feature.revision.user.is_bot": false, \
	  "feature.revision.user.is_curator": false}'
EpochFail (Me)
Not an admin, but I registered a Loooong time ago
epochfail_cache = \
	'{"feature.revision.user.is_anon": false, \
	  "feature.temporal.revision.user.seconds_since_registration": 257995021, \
	  "feature.revision.user.has_advanced_rights": false, \
	  "feature.revision.user.is_admin": false, \
	  "feature.revision.user.is_bot": false, \
	  "feature.revision.user.is_curator": false}'
Admin
But what if I was an admin?
admin_cache = \
	'{"feature.revision.user.is_anon": false, \
	  "feature.temporal.revision.user.seconds_since_registration": 257995021, \
	  "feature.revision.user.has_advanced_rights": true, \
	  "feature.revision.user.is_admin": true, \
	  "feature.revision.user.is_bot": false, \
	  "feature.revision.user.is_curator": true}'

OK... Now to apply these while scoring and see the difference.

Results

Natural

This data is generated using the *real* user information associated with the edit.

Damaging prediction (natural). Damaging prediction scores for labeled edits from the test set using Gradient Boosting and Support Vector Classifier models.

The x axis is the "probability that the edit is damaging" as predicted by the classifier model. The y axis is the frequency of that prediction. You can see that I've split up edits by whether or not they were labeled as "damaging" by Wikipedians using Wiki labels.

Here we can see that both models seem to distinguish between damage and non-damage effectively. However, there's a weird spike in the density of non-damaging edits with ~80% true_proba under the SVC model. Still, generally, these plots suggest that the models will work in practice.
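The densities in these plots are just normalized histograms over the models' "probability of damaging" output, split by the human label. A sketch with synthetic scores standing in for the real test-set output (Beta distributions are an assumption, chosen only to mimic the clustered shapes):

```python
import random

random.seed(0)

# Synthetic stand-ins for predicted damage probabilities:
# non-damaging edits cluster near 0, damaging edits near 1.
non_damaging_scores = [random.betavariate(1, 8) for _ in range(1000)]
damaging_scores = [random.betavariate(8, 1) for _ in range(1000)]

def density(scores, bins=20):
    """Normalized histogram over [0, 1] -- the y axis of the plots above."""
    counts = [0] * bins
    for s in scores:
        counts[min(int(s * bins), bins - 1)] += 1
    total = len(scores)
    return [c / total for c in counts]

non_damaging_density = density(non_damaging_scores)
damaging_density = density(damaging_scores)

# Most non-damaging mass sits in the lowest bins; damaging mass in the highest.
assert sum(non_damaging_density[:5]) > 0.5
assert sum(damaging_density[-5:]) > 0.5
```

A well-behaved model shows exactly this separation: the two histograms pile up at opposite ends of the x axis, with limited overlap in the middle.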

Anons

Now, let's see what the scores look like if all the edits were saved by anonymous editors. Note that all other features are held constant when scoring.

Damaging prediction (anon). Damaging prediction scores for labeled edits from the test set using Gradient Boosting and Support Vector Classifier models (anon stats).
Damaging prediction (anon & gb only). Damaging prediction scores for labeled edits from the test set using Gradient Boosting model (anon stats).

Whoa! Look at that spike on the right side of #Damaging prediction (anon)! That shows that, when an edit is marked as saved by an anonymous editor, the LinearSVC model flags it with nearly 100% damage probability every time. Yikes! #Damaging prediction (anon & gb only) shows just the Gradient Boosting model. Here we can see that, while there is substantial overlap between the distributions, there's clear differentiation between damaging and non-damaging edits. This is likely good news for anons.
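One way to see why a linear model can behave like this: if the learned weight on `is_anon` is large relative to the content features, swapping in the anon profile dominates the score no matter what the edit contains. A toy logistic scorer with made-up weights (not the actual fitted models) illustrates the effect:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def score(weights, features, bias=0.0):
    """Probability-of-damaging from a linear model over named features."""
    return sigmoid(sum(weights.get(f, 0.0) * v for f, v in features.items()) + bias)

# Hypothetical weights: "svc_like" leans heavily on user identity,
# "gb_like" stands in for a model that leans on content signals instead.
svc_like = {"is_anon": 6.0, "badwords_added": 1.0}
gb_like = {"is_anon": 1.0, "badwords_added": 4.0}

clean_edit = {"is_anon": 0.0, "badwords_added": 0.0}      # clean content, registered user
clean_edit_as_anon = {"is_anon": 1.0, "badwords_added": 0.0}

# Under the user-dominated model, merely being anonymous inflates the score.
print(round(score(svc_like, clean_edit_as_anon, bias=-3.0), 3))  # prints 0.953
print(round(score(gb_like, clean_edit_as_anon, bias=-3.0), 3))   # prints 0.119
```

The content-weighted model still gives the clean anonymous edit a low score; the user-weighted one pushes it toward the right edge of the plot, which is the spike we see here.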

Newcomers

So, how much do we attribute to a registered account? In this case, we're asking about newcomers who registered their account five hours before saving an edit.

Damaging prediction (newcomer). Damaging prediction scores for labeled edits from the test set using Gradient Boosting and Support Vector Classifier models (newcomer stats).

The Gradient Boosting model looks a lot like it did for anonymous editors. This is something that I'd expect. However, the Linear SVC model has a pretty weird spike! It looks like the Linear SVC makes some minor differentiation between damaging and not, but there's a ton of overlap.

EpochFail

But can we detect damage when someone as experienced and prestigious as me (ha!) saves it?

Damaging prediction (EpochFail). Damaging prediction scores for labeled edits from the test set using Gradient Boosting and Support Vector Classifier models (User:EpochFail stats).

It looks like the LinearSVC would really struggle to flag *any* vandalism if I'm the one who saves it. The Gradient Boosting model is certainly showing lower probabilities for the damaging edits, but it looks like we can still differentiate damaging from not-damaging pretty well.

Admin

OK. Now what if I were an admin?

Damaging prediction (admin). Damaging prediction scores for labeled edits from the test set using Gradient Boosting and Support Vector Classifier models (admin stats).

It looks like neither model will ever consider the work of an admin to be damaging. This isn't too surprising, and in practice it probably works out: if you have been granted advanced user rights, you probably aren't damaging the wiki no matter what your edit looks like. Still, it's a little bit problematic for the model to learn this so strongly.


OK. That's all for today. I think that next time, I'll be looking into examples of the types of edits whose scores change the most/least between user types and modeling strategies. --EpochFail (talk) 21:16, 14 April 2016 (UTC)