Objective Revision Evaluation Service/goodfaith
One of the most critical concerns about Wikimedia's open projects is the detection and removal of damaging contributions. This model was trained on human judgement[1] of whether an edit was probably made in good faith. It is useful for directing newcomer socialization efforts (e.g. en:User:HostBot) and for detecting vandals and spammers.
The model predicts whether an edit was made in good faith. Note that, due to limitations in the field of natural language processing, sarcasm and other kinds of clever vandalism are likely to fool the model. Keep this in mind when consuming scores.
Contexts (wikis)
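Each wiki listed in this section links to a model_info URL that follows the same pattern. As a minimal sketch, the URLs can be generated from the wiki database names. The helper name `model_info_url` is hypothetical, and the sketch only builds the strings; it does not check that the service still answers at this address.

```python
from urllib.parse import quote

# Wiki database names taken from the section headings on this page.
WIKIS = ["enwiki", "fawiki", "nlwiki", "plwiki", "ptwiki", "trwiki", "wikidatawiki"]

# URL prefix copied from the links on this page.
BASE = "https://ores.wmflabs.org/v2/scores"


def model_info_url(wiki, model="goodfaith"):
    """Build the model_info URL for one wiki (hypothetical helper)."""
    return f"{BASE}/{quote(wiki)}/{quote(model)}/?model_info"


for wiki in WIKIS:
    print(model_info_url(wiki))
```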
English Wikipedia (enwiki)
https://ores.wmflabs.org/v2/scores/enwiki/goodfaith/?model_info
ScikitLearnClassifier
- type: GradientBoosting
- params: subsample=1.0, max_features="log2", loss="deviance", learning_rate=0.01, center=true, verbose=0, warm_start=false, presort="auto", max_depth=7, scale=true, min_weight_fraction_leaf=0.0, balanced_sample_weight=true, random_state=null, init=null, n_estimators=700, min_samples_leaf=1, max_leaf_nodes=null, min_samples_split=2, balanced_sample=false
- version: 0.3.0
- trained: 2017-01-06T19:35:15.426659
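The params line above is a flat list of key=value pairs. To compare settings across wikis it can help to load it into a dictionary. The sketch below assumes that every value is a valid JSON literal (numbers, quoted strings, true, false, null), which holds for the params lines on this page. The function name `parse_params` is hypothetical.

```python
import json

# The enwiki "params" line, copied from this page.
PARAMS = (
    'subsample=1.0, max_features="log2", loss="deviance", learning_rate=0.01, '
    'center=true, verbose=0, warm_start=false, presort="auto", max_depth=7, '
    'scale=true, min_weight_fraction_leaf=0.0, balanced_sample_weight=true, '
    'random_state=null, init=null, n_estimators=700, min_samples_leaf=1, '
    'max_leaf_nodes=null, min_samples_split=2, balanced_sample=false'
)


def parse_params(line):
    """Turn 'key=value, key=value' into a dict (assumes JSON-literal values)."""
    params = {}
    for item in line.split(", "):
        key, _, raw = item.partition("=")
        params[key] = json.loads(raw)
    return params


params = parse_params(PARAMS)
print(params["n_estimators"], params["learning_rate"], params["max_depth"])
# prints: 700 0.01 7
```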
Confusion matrix (rows: actual label; columns: predicted label):
         ~False    ~True
-----  --------  -------
False       428      212
True       1699    17194
Accuracy: 0.902
Precision:
----- -----
False 0.201
True 0.988
----- -----
Recall:
----- -----
False 0.667
True 0.91
----- -----
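The accuracy, precision and recall figures above can be checked against the enwiki confusion matrix. The sketch below reads rows as actual labels and columns (~False, ~True) as predicted labels, which is the reading that reproduces the reported precision. The results agree with the reported figures to within a few thousandths; for example, recall for False comes out at 0.669 against the reported 0.667.

```python
# Confusion matrix for enwiki, copied from this page.
# Outer key: actual label; inner key: predicted label.
MATRIX = {
    False: {False: 428, True: 212},
    True: {False: 1699, True: 17194},
}


def accuracy(m):
    """Share of all edits whose predicted label equals the actual label."""
    correct = sum(m[label][label] for label in m)
    total = sum(sum(row.values()) for row in m.values())
    return correct / total


def precision(m, label):
    """Of the edits predicted as `label`, the share that really are `label`."""
    predicted = sum(m[actual][label] for actual in m)
    return m[label][label] / predicted


def recall(m, label):
    """Of the edits that really are `label`, the share predicted as `label`."""
    return m[label][label] / sum(m[label].values())


print(f"accuracy: {accuracy(MATRIX):.3f}")
for label in (False, True):
    p = precision(MATRIX, label)
    r = recall(MATRIX, label)
    print(f"{label}: precision={p:.3f} recall={r:.3f}")
```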
PR-AUC:
----- -----
False 0.383
True 0.993
----- -----
ROC-AUC:
----- -----
False 0.907
True 0.905
----- -----
Recall @ 0.1 false-positive rate:
label threshold recall fpr
------- ----------- -------- -----
False 0.475 0.688 0.098
True 0.88 0.704 0.097
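A threshold from the table above can be applied to an individual score. The sketch below assumes that a threshold for a label is compared against the probability the model assigns to that same label, and that an edit matches when the probability is at or above the threshold. Both the comparison rule and the shape of the `score` dictionary are assumptions for illustration, and the example probabilities are made up.

```python
# Thresholds for enwiki at roughly a 0.1 false-positive rate, from this page.
THRESHOLDS = {"false": 0.475, "true": 0.88}


def matches(probability, label, threshold):
    """Return True if the probability for `label` reaches `threshold`.

    Assumption: the threshold applies to the probability of the same label.
    """
    return probability[label] >= threshold


# Hypothetical score for one edit (values invented for this example).
score = {"true": 0.41, "false": 0.59}

flag_bad_faith = matches(score, "false", THRESHOLDS["false"])
flag_good_faith = matches(score, "true", THRESHOLDS["true"])
print(flag_bad_faith, flag_good_faith)
# prints: True False
```

An edit can fall below both thresholds, so a consumer may want a third "uncertain" outcome rather than treating the two labels as exhaustive.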
Recall @ 0.98 precision:
label threshold recall precision
------- ----------- -------- -----------
False 0.96 0.046 1
True 0.24 0.977 0.981
Recall @ 0.9 precision:
label threshold recall precision
------- ----------- -------- -----------
False 0.957 0.053 0.99
True 0.038 1 0.968
Recall @ 0.45 precision:
label threshold recall precision
------- ----------- -------- -----------
False 0.808 0.364 0.481
True 0.038 1 0.968
Recall @ 0.15 precision:
label threshold recall precision
------- ----------- -------- -----------
False 0.322 0.777 0.155
True 0.038 1 0.968
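The "Recall @ precision" tables above list a few operating points per label. One way to use them is to pick, among the listed points, the one with the highest recall that still meets a required precision. The sketch below does this for the False label on enwiki; the names `OperatingPoint` and `best_point` are hypothetical, and only the four rows published above are considered.

```python
from typing import NamedTuple, Optional


class OperatingPoint(NamedTuple):
    target_precision: float
    threshold: float
    recall: float
    precision: float


# Rows for the "False" label on enwiki, copied from the tables on this page.
FALSE_POINTS = [
    OperatingPoint(0.98, 0.96, 0.046, 1.0),
    OperatingPoint(0.9, 0.957, 0.053, 0.99),
    OperatingPoint(0.45, 0.808, 0.364, 0.481),
    OperatingPoint(0.15, 0.322, 0.777, 0.155),
]


def best_point(points, min_precision) -> Optional[OperatingPoint]:
    """Highest-recall point whose measured precision is at least min_precision."""
    candidates = [p for p in points if p.precision >= min_precision]
    return max(candidates, key=lambda p: p.recall, default=None)


choice = best_point(FALSE_POINTS, 0.45)
print(choice.threshold, choice.recall)
# prints: 0.808 0.364
```

Lowering the required precision moves the threshold down and raises recall, which is the trade-off a patrolling tool has to settle for its own workload.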
Persian Wikipedia (fawiki)
https://ores.wmflabs.org/v2/scores/fawiki/goodfaith/?model_info
ScikitLearnClassifier
- type: GradientBoosting
- params: presort="auto", max_features="log2", scale=true, max_leaf_nodes=null, init=null, verbose=0, random_state=null, learning_rate=0.01, balanced_sample_weight=true, n_estimators=700, balanced_sample=false, subsample=1.0, warm_start=false, min_samples_leaf=1, max_depth=7, min_weight_fraction_leaf=0.0, loss="deviance", center=true, min_samples_split=2
- version: 0.3.0
- trained: 2017-01-06T20:21:04.924687
Confusion matrix (rows: actual label; columns: predicted label):
         ~False    ~True
-----  --------  -------
False        87       77
True        472    19168
Accuracy: 0.972
Precision:
----- -----
False 0.158
True 0.996
----- -----
Recall:
----- -----
False 0.532
True 0.976
----- -----
PR-AUC:
----- -----
False 0.211
True 0.995
----- -----
ROC-AUC:
----- -----
False 0.974
True 0.964
----- -----
Recall @ 0.1 false-positive rate:
label threshold recall fpr
------- ----------- -------- -----
False 0.09 0.939 0.077
True 0.89 0.922 0.079
Recall @ 0.98 precision:
label threshold recall precision
------- ----------- -------- -----------
False 0.953 0.102 1
True 0.051 1 0.992
Recall @ 0.9 precision:
label threshold recall precision
------- ----------- -------- -----------
False 0.953 0.102 1
True 0.051 1 0.992
Recall @ 0.45 precision:
label threshold recall precision
------- ----------- -------- -----------
False 0.936 0.095 0.633
True 0.051 1 0.992
Recall @ 0.15 precision:
label threshold recall precision
------- ----------- -------- -----------
False 0.439 0.652 0.162
True 0.051 1 0.992
Dutch Wikipedia (nlwiki)
https://ores.wmflabs.org/v2/scores/nlwiki/goodfaith/?model_info
ScikitLearnClassifier
- type: GradientBoosting
- params: loss="deviance", max_features="log2", center=true, warm_start=false, subsample=1.0, scale=true, random_state=null, presort="auto", max_depth=5, min_samples_leaf=1, balanced_sample=false, n_estimators=700, min_samples_split=2, learning_rate=0.01, max_leaf_nodes=null, init=null, min_weight_fraction_leaf=0.0, balanced_sample_weight=true, verbose=0
- version: 0.3.0
- trained: 2017-01-06T21:54:13.608947
Confusion matrix (rows: actual label; columns: predicted label):
         ~False    ~True
-----  --------  -------
False       601       70
True       1500    17293
Accuracy: 0.919
Precision:
----- -----
False 0.286
True 0.996
----- -----
Recall:
----- -----
False 0.896
True 0.92
----- -----
PR-AUC:
----- -----
False 0.677
True 0.995
----- -----
ROC-AUC:
----- -----
False 0.971
True 0.971
----- -----
Recall @ 0.1 false-positive rate:
label threshold recall fpr
------- ----------- -------- -----
False 0.361 0.935 0.094
True 0.5 0.922 0.094
Recall @ 0.98 precision:
label threshold recall precision
------- ----------- -------- -----------
False 0.967 0.198 1
True 0.072 0.996 0.981
Recall @ 0.9 precision:
label threshold recall precision
------- ----------- -------- -----------
False 0.954 0.302 0.92
True 0.024 1 0.969
Recall @ 0.45 precision:
label threshold recall precision
------- ----------- -------- -----------
False 0.803 0.756 0.466
True 0.024 1 0.969
Recall @ 0.15 precision:
label threshold recall precision
------- ----------- -------- -----------
False 0.091 0.975 0.171
True 0.024 1 0.969
Polish Wikipedia (plwiki)
https://ores.wmflabs.org/v2/scores/plwiki/goodfaith/?model_info
ScikitLearnClassifier
- type: RF
- params: center=true, n_estimators=320, max_depth=null, balanced_sample_weight=true, min_samples_split=2, min_samples_leaf=1, verbose=0, min_weight_fraction_leaf=0.0, criterion="entropy", oob_score=false, n_jobs=1, class_weight=null, max_leaf_nodes=null, random_state=null, scale=true, max_features="log2", balanced_sample=false, bootstrap=true, warm_start=false
- version: 0.3.0
- trained: 2017-01-06T22:30:20.768873
Confusion matrix (rows: actual label; columns: predicted label):
         ~False    ~True
-----  --------  -------
False       527       67
True          4    11998
Accuracy: 0.994
Precision:
----- -----
False 0.991
True 0.994
----- -----
Recall:
----- -----
False 0.888
True 1
----- -----
PR-AUC:
----- -----
False 0.953
True 0.995
----- -----
ROC-AUC:
----- -----
False 0.985
True 0.989
----- -----
Recall @ 0.1 false-positive rate:
label threshold recall fpr
------- ----------- -------- -----
False 0.047 0.974 0.062
True 0.675 0.995 0.086
Recall @ 0.98 precision:
label threshold recall precision
------- ----------- -------- -----------
False 0.4 0.918 0.991
True 0.293 1 0.989
Recall @ 0.9 precision:
label threshold recall precision
------- ----------- -------- -----------
False 0.252 0.923 0.944
True 0.133 1 0.974
Recall @ 0.45 precision:
label threshold recall precision
------- ----------- -------- -----------
False 0.062 0.962 0.595
True 0.133 1 0.974
Recall @ 0.15 precision:
label threshold recall precision
------- ----------- -------- -----------
False 0.013 0.988 0.199
True 0.133 1 0.974
Portuguese Wikipedia (ptwiki)
https://ores.wmflabs.org/v2/scores/ptwiki/goodfaith/?model_info
ScikitLearnClassifier
- type: GradientBoosting
- params: scale=true, balanced_sample_weight=true, learning_rate=0.01, min_weight_fraction_leaf=0.0, max_depth=7, center=true, random_state=null, max_leaf_nodes=null, init=null, presort="auto", warm_start=false, min_samples_leaf=1, subsample=1.0, min_samples_split=2, verbose=0, loss="deviance", balanced_sample=false, n_estimators=700, max_features="log2"
- version: 0.3.0
- trained: 2017-01-06T22:48:01.162565
Confusion matrix (rows: actual label; columns: predicted label):
         ~False    ~True
-----  --------  -------
False       935      258
True       2173    16447
Accuracy: 0.877
Precision:
----- -----
False 0.301
True 0.985
----- -----
Recall:
----- -----
False 0.784
True 0.883
----- -----
PR-AUC:
----- -----
False 0.522
True 0.992
----- -----
ROC-AUC:
----- -----
False 0.937
True 0.932
----- -----
Recall @ 0.1 false-positive rate:
label threshold recall fpr
------- ----------- -------- -----
False 0.554 0.744 0.099
True 0.729 0.807 0.096
Recall @ 0.98 precision:
label threshold recall precision
------- ----------- -------- -----------
False 0.959 0.053 1
True 0.396 0.916 0.98
Recall @ 0.9 precision:
label threshold recall precision
------- ----------- -------- -----------
False 0.95 0.091 0.969
True 0.034 1 0.941
Recall @ 0.45 precision:
label threshold recall precision
------- ----------- -------- -----------
False 0.749 0.577 0.457
True 0.034 1 0.941
Recall @ 0.15 precision:
label threshold recall precision
------- ----------- -------- -----------
False 0.065 0.976 0.157
True 0.034 1 0.941
Turkish Wikipedia (trwiki)
https://ores.wmflabs.org/v2/scores/trwiki/goodfaith/?model_info
ScikitLearnClassifier
- type: GradientBoosting
- params: n_estimators=700, min_samples_leaf=1, scale=true, center=true, learning_rate=0.01, init=null, subsample=1.0, max_depth=7, min_weight_fraction_leaf=0.0, balanced_sample_weight=true, max_features="log2", warm_start=false, max_leaf_nodes=null, loss="deviance", balanced_sample=false, random_state=null, verbose=0, presort="auto", min_samples_split=2
- version: 0.3.0
- trained: 2017-01-06T23:29:38.432498
Confusion matrix (rows: actual label; columns: predicted label):
         ~False    ~True
-----  --------  -------
False       714      191
True       2678    16148
Accuracy: 0.855
Precision:
----- -----
False 0.21
True 0.988
----- -----
Recall:
----- -----
False 0.787
True 0.858
----- -----
PR-AUC:
----- -----
False 0.292
True 0.992
----- -----
ROC-AUC:
----- -----
False 0.914
True 0.908
----- -----
Recall @ 0.1 false-positive rate:
label threshold recall fpr
------- ----------- -------- -----
False 0.659 0.656 0.099
True 0.764 0.794 0.095
Recall @ 0.98 precision:
label threshold recall precision
------- ----------- -------- -----------
False 0.923 0.021 1
True 0.315 0.91 0.98
Recall @ 0.9 precision:
label threshold recall precision
------- ----------- -------- -----------
False 0.923 0.021 1
True 0.08 1 0.955
Recall @ 0.45 precision:
label threshold recall precision
------- ----------- -------- -----------
False 0.883 0.111 0.491
True 0.08 1 0.955
Recall @ 0.15 precision:
label threshold recall precision
------- ----------- -------- -----------
False 0.128 0.936 0.156
True 0.08 1 0.955
Wikidata (wikidatawiki)
https://ores.wmflabs.org/v2/scores/wikidatawiki/goodfaith/?model_info
ScikitLearnClassifier
- type: GradientBoosting
- params: balanced_sample=false, center=true, verbose=0, presort="auto", scale=true, init=null, subsample=1.0, random_state=null, min_samples_leaf=1, max_depth=5, loss="deviance", min_weight_fraction_leaf=0.0, max_features="log2", learning_rate=0.1, n_estimators=300, warm_start=false, min_samples_split=2, max_leaf_nodes=null, balanced_sample_weight=true
- version: 0.3.0
- trained: 2017-01-07T00:57:21.651623
Confusion matrix (rows: actual label; columns: predicted label):
         ~False    ~True
-----  --------  -------
False      2091      155
True       1009    21177
Accuracy: 0.952
Precision:
----- -----
False 0.675
True 0.993
----- -----
Recall:
----- -----
False 0.931
True 0.955
----- -----
PR-AUC:
----- -----
False 0.792
True 0.994
----- -----
ROC-AUC:
----- -----
False 0.987
True 0.979
----- -----
Recall @ 0.1 false-positive rate:
label threshold recall fpr
------- ----------- -------- -----
False 0.093 0.986 0.096
True 0.277 0.965 0.096
Recall @ 0.98 precision:
label threshold recall precision
------- ----------- -------- -----------
False 0.993 0.034 1
True 0.077 0.974 0.98
Recall @ 0.9 precision:
label threshold recall precision
------- ----------- -------- -----------
False 0.99 0.087 0.934
True 0.006 1 0.909
Recall @ 0.45 precision:
label threshold recall precision
------- ----------- -------- -----------
False 0.054 0.992 0.471
True 0.006 1 0.909
Recall @ 0.15 precision:
label threshold recall precision
------- ----------- -------- -----------
False 0.006 1 0.245
True 0.006 1 0.909
References
[1] See en:Wikipedia:Labels/Edit quality for the English Wikipedia manual labeling campaign.