Objective Revision Evaluation Service/damaging

From Meta, a Wikimedia project coordination wiki

This model was trained on human judgement[1] of whether or not an edit is damaging. It is useful for quality control tools (e.g. en:WP:Huggle and en:User:ClueBot NG).

This model is trained to predict damaging edits. Not all damaging edits are made in bad faith. Consume scores with this in mind, and see ORES/goodfaith for a model that predicts whether an edit was made in good faith.
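
Because a damaging edit is not necessarily a bad-faith edit, a consuming tool may want to read both scores together. The following is a minimal sketch, assuming the tool already holds the probability of the True label from this model and from ORES/goodfaith. The function name, the bucket labels and the 0.5 cut-offs are hypothetical placeholders, not values recommended by ORES.

```python
def triage(damaging_prob, goodfaith_prob,
           damaging_cutoff=0.5, goodfaith_cutoff=0.5):
    """Sort an edit into a rough review bucket.

    Both cut-offs are hypothetical defaults; real ones would be chosen
    from the per-wiki threshold tables for each model.
    """
    likely_damaging = damaging_prob >= damaging_cutoff
    likely_goodfaith = goodfaith_prob >= goodfaith_cutoff
    if not likely_damaging:
        return "ok"
    if likely_goodfaith:
        # Damaging, but probably an honest mistake.
        return "damaging-goodfaith"
    return "damaging-badfaith"


print(triage(0.9, 0.8))
print(triage(0.9, 0.1))
print(triage(0.1, 0.9))
```

Keeping the two cut-offs as separate parameters lets a tool tune each model independently.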

Contexts (wikis)

English Wikipedia (enwiki)

https://ores.wmflabs.org/v2/scores/enwiki/damaging/?model_info
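
Every context on this page uses the same URL pattern, so the model information link can be built from the wiki's database name. This is a minimal sketch: the helper names are hypothetical, and it assumes only the URL shape shown on this page. Whether the host still answers, and what format it returns, is not guaranteed here, so the fetch helper returns the raw response text without parsing it.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Base path taken from the URLs listed on this page.
BASE = "https://ores.wmflabs.org/v2/scores"


def model_info_url(context, model="damaging"):
    """Build the model_info URL for a wiki context such as 'enwiki'."""
    return f"{BASE}/{quote(context)}/{quote(model)}/?model_info"


def fetch_model_info(context, timeout=10):
    """Return the raw response body as text (requires network access)."""
    with urlopen(model_info_url(context), timeout=timeout) as response:
        return response.read().decode("utf-8")


print(model_info_url("enwiki"))
```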

ScikitLearnClassifier
 - type: GradientBoosting
 - params: loss="deviance", balanced_sample=false, subsample=1.0, warm_start=false, learning_rate=0.01, n_estimators=700, max_features="log2", balanced_sample_weight=true, presort="auto", init=null, min_samples_split=2, center=true, min_weight_fraction_leaf=0.0, scale=true, max_leaf_nodes=null, max_depth=7, min_samples_leaf=1, random_state=null, verbose=0
 - version: 0.3.0
 - trained: 2017-01-06T19:29:12.793824

Table:
                 ~False    ~True
        -----  --------  -------
        False     17203     1664
        True        211      455

Accuracy: 0.904
Precision:
        -----  -----
        False  0.988
        True   0.215
        -----  -----

Recall:
        -----  -----
        False  0.912
        True   0.683
        -----  -----
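
The accuracy, precision and recall figures above follow from the confusion table if its rows are read as the actual label and its columns (marked with ~) as the predicted label. That reading is an assumption, but it reproduces the enwiki values listed above. A short sketch of the arithmetic:

```python
# enwiki confusion table: matrix[actual][predicted]
matrix = {
    False: {False: 17203, True: 1664},
    True: {False: 211, True: 455},
}


def accuracy(m):
    """Share of all edits whose predicted label equals the actual label."""
    total = sum(sum(row.values()) for row in m.values())
    return (m[False][False] + m[True][True]) / total


def precision(m, label):
    """Of the edits predicted as `label`, the share that really are."""
    predicted = m[False][label] + m[True][label]
    return m[label][label] / predicted


def recall(m, label):
    """Of the edits that really are `label`, the share predicted as such."""
    return m[label][label] / sum(m[label].values())


print(round(accuracy(matrix), 3))          # 0.904
print(round(precision(matrix, False), 3))  # 0.988
print(round(precision(matrix, True), 3))   # 0.215
print(round(recall(matrix, False), 3))     # 0.912
print(round(recall(matrix, True), 3))      # 0.683
```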

PR-AUC:
        -----  -----
        False  0.993
        True   0.391
        -----  -----

ROC-AUC:
        -----  -----
        False  0.918
        True   0.919
        -----  -----

Recall @ 0.1 false-positive rate:
        label      threshold    recall    fpr
        -------  -----------  --------  -----
        False          0.862     0.76   0.093
        True           0.469     0.713  0.098

Recall @ 0.98 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.259     0.972         0.98
        True           0.955     0.068         1

Recall @ 0.9 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.04      1            0.967
        True           0.945     0.091        0.98

Recall @ 0.45 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.04      1            0.967
        True           0.831     0.326        0.468

Recall @ 0.15 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.04      1            0.967
        True           0.281     0.828        0.154
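
The threshold tables above can be read as operating points: a tool picks the row whose precision and recall trade-off suits it and flags edits scoring at or above that threshold. The sketch below copies the enwiki rows for the True (damaging) label. It assumes the score is the model's probability for the True label and that the comparison is inclusive; the names are hypothetical.

```python
# enwiki operating points for the True label, copied from the tables above.
ENWIKI_TRUE_POINTS = {
    "recall @ 0.98 precision": {"threshold": 0.955, "recall": 0.068, "precision": 1.0},
    "recall @ 0.9 precision": {"threshold": 0.945, "recall": 0.091, "precision": 0.98},
    "recall @ 0.45 precision": {"threshold": 0.831, "recall": 0.326, "precision": 0.468},
    "recall @ 0.15 precision": {"threshold": 0.281, "recall": 0.828, "precision": 0.154},
}


def flag_damaging(prob_true, point):
    """Return True if the score meets the threshold of the chosen operating point."""
    return prob_true >= ENWIKI_TRUE_POINTS[point]["threshold"]


# A score of 0.95 clears the 0.945 threshold but not the stricter 0.955 one.
print(flag_damaging(0.95, "recall @ 0.9 precision"))
print(flag_damaging(0.95, "recall @ 0.98 precision"))
```

A tool that acts without human review would lean toward the high-precision rows, accepting that it catches only a small share of damaging edits. A tool that queues edits for human review can use the low-threshold rows to catch most damage at the cost of many false alarms.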

Persian Wikipedia (fawiki)

https://ores.wmflabs.org/v2/scores/fawiki/damaging/?model_info

ScikitLearnClassifier
 - type: GradientBoosting
 - params: min_samples_leaf=1, warm_start=false, verbose=0, min_weight_fraction_leaf=0.0, max_leaf_nodes=null, min_samples_split=2, scale=true, max_depth=7, center=true, max_features="log2", balanced_sample=false, init=null, loss="deviance", n_estimators=700, random_state=null, balanced_sample_weight=true, subsample=1.0, presort="auto", learning_rate=0.01
 - version: 0.3.0
 - trained: 2017-01-06T20:15:09.126168

Table:
                 ~False    ~True
        -----  --------  -------
        False     18876      685
        True         98      145

Accuracy: 0.96
Precision:
        -----  -----
        False  0.995
        True   0.178
        -----  -----

Recall:
        -----  -----
        False  0.965
        True   0.607
        -----  -----

PR-AUC:
        -----  -----
        False  0.995
        True   0.228
        -----  -----

ROC-AUC:
        -----  -----
        False  0.959
        True   0.968
        -----  -----

Recall @ 0.1 false-positive rate:
        label      threshold    recall    fpr
        -------  -----------  --------  -----
        False          0.866     0.913   0.09
        True           0.092     0.927   0.09

Recall @ 0.98 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.042     1            0.988
        True           0.963     0.043        1

Recall @ 0.9 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.042     1            0.988
        True           0.963     0.043        1

Recall @ 0.45 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.042     1            0.988
        True           0.922     0.143        0.592

Recall @ 0.15 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.042     1            0.988
        True           0.313     0.797        0.156

Dutch Wikipedia (nlwiki)

https://ores.wmflabs.org/v2/scores/nlwiki/damaging/?model_info

ScikitLearnClassifier
 - type: GradientBoosting
 - params: warm_start=false, learning_rate=0.01, n_estimators=700, loss="deviance", verbose=0, max_leaf_nodes=null, min_weight_fraction_leaf=0.0, init=null, presort="auto", scale=true, min_samples_split=2, max_depth=5, center=true, balanced_sample=false, random_state=null, max_features="log2", balanced_sample_weight=true, min_samples_leaf=1, subsample=1.0
 - version: 0.3.0
 - trained: 2017-01-06T21:49:27.818262

Table:
                 ~False    ~True
        -----  --------  -------
        False     16763     1714
        True        105      882

Accuracy: 0.907
Precision:
        -----  -----
        False  0.994
        True   0.339
        -----  -----

Recall:
        -----  -----
        False  0.907
        True   0.895
        -----  -----

PR-AUC:
        -----  -----
        False  0.994
        True   0.646
        -----  -----

ROC-AUC:
        -----  -----
        False  0.96
        True   0.963
        -----  -----

Recall @ 0.1 false-positive rate:
        label      threshold    recall    fpr
        -------  -----------  --------  -----
        False          0.528     0.905  0.096
        True           0.446     0.912  0.098

Recall @ 0.98 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.152     0.969         0.98
        True           0.966     0.109         1

Recall @ 0.9 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.03      1            0.952
        True           0.952     0.211        0.936

Recall @ 0.45 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False           0.03     1            0.952
        True            0.8      0.739        0.455

Recall @ 0.15 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False           0.03     1            0.952
        True            0.1      0.977        0.161

Polish Wikipedia (plwiki)

https://ores.wmflabs.org/v2/scores/plwiki/damaging/?model_info

ScikitLearnClassifier
 - type: GradientBoosting
 - params: warm_start=false, max_depth=7, presort="auto", n_estimators=700, random_state=null, min_samples_leaf=1, min_weight_fraction_leaf=0.0, min_samples_split=2, init=null, max_features="log2", learning_rate=0.01, scale=true, verbose=0, max_leaf_nodes=null, loss="deviance", balanced_sample_weight=true, subsample=1.0, center=true, balanced_sample=false
 - version: 0.3.0
 - trained: 2017-01-06T22:25:42.924606

Table:
                 ~False    ~True
        -----  --------  -------
        False     11527      337
        True         56      676

Accuracy: 0.969
Precision:
        -----  -----
        False  0.995
        True   0.667
        -----  -----

Recall:
        -----  -----
        False  0.972
        True   0.923
        -----  -----

PR-AUC:
        -----  -----
        False  0.995
        True   0.924
        -----  -----

ROC-AUC:
        -----  -----
        False  0.983
        True   0.977
        -----  -----

Recall @ 0.1 false-positive rate:
        label      threshold    recall    fpr
        -------  -----------  --------  -----
        False          0.366     0.987  0.093
        True           0.304     0.95   0.086

Recall @ 0.98 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.124     0.999        0.981
        True           0.857     0.719        0.99

Recall @ 0.9 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.052     1            0.956
        True           0.736     0.859        0.911

Recall @ 0.45 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.052     1            0.956
        True           0.371     0.942        0.494

Recall @ 0.15 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.052     1            0.956
        True           0.097     0.979        0.198

Portuguese Wikipedia (ptwiki)

https://ores.wmflabs.org/v2/scores/ptwiki/damaging/?model_info

ScikitLearnClassifier
 - type: GradientBoosting
 - params: center=true, min_samples_split=2, balanced_sample_weight=true, scale=true, min_samples_leaf=1, warm_start=false, n_estimators=700, subsample=1.0, init=null, learning_rate=0.01, random_state=null, verbose=0, max_features="log2", presort="auto", loss="deviance", max_leaf_nodes=null, max_depth=7, min_weight_fraction_leaf=0.0, balanced_sample=false
 - version: 0.3.0
 - trained: 2017-01-06T22:41:56.417405

Table:
                 ~False    ~True
        -----  --------  -------
        False     16008     2439
        True        276     1090

Accuracy: 0.863
Precision:
        -----  -----
        False  0.983
        True   0.309
        -----  -----

Recall:
        -----  -----
        False  0.868
        True   0.798
        -----  -----

PR-AUC:
        -----  -----
        False  0.991
        True   0.52
        -----  -----

ROC-AUC:
        -----  -----
        False  0.926
        True   0.931
        -----  -----

Recall @ 0.1 false-positive rate:
        label      threshold    recall    fpr
        -------  -----------  --------  -----
        False          0.711     0.79   0.097
        True           0.591     0.727  0.098

Recall @ 0.98 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.439     0.891         0.98
        True           0.957     0.039         1

Recall @ 0.9 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.033     1            0.932
        True           0.952     0.056        0.946

Recall @ 0.45 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.033     1            0.932
        True           0.734     0.578        0.453

Recall @ 0.15 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.033     1            0.932
        True           0.048     0.991        0.154

Russian Wikipedia (ruwiki)

https://ores.wmflabs.org/v2/scores/ruwiki/damaging/?model_info

ScikitLearnClassifier
 - type: GradientBoosting
 - params: balanced_sample=false, warm_start=false, subsample=1.0, n_estimators=700, max_depth=5, scale=true, learning_rate=0.01, random_state=null, min_samples_leaf=1, init=null, balanced_sample_weight=true, presort="auto", max_features="log2", loss="deviance", max_leaf_nodes=null, min_weight_fraction_leaf=0.0, center=true, min_samples_split=2, verbose=0
 - version: 0.3.0
 - trained: 2017-01-06T22:57:34.396636

Table:
                 ~False    ~True
        -----  --------  -------
        False     15856     2825
        True        115      939

Accuracy: 0.851
Precision:
        -----  -----
        False  0.993
        True   0.25
        -----  -----

Recall:
        -----  -----
        False  0.849
        True   0.89
        -----  -----

PR-AUC:
        -----  -----
        False  0.992
        True   0.436
        -----  -----

ROC-AUC:
        -----  -----
        False  0.933
        True   0.94
        -----  -----

Recall @ 0.1 false-positive rate:
        label      threshold    recall    fpr
        -------  -----------  --------  -----
        False          0.542     0.843  0.094
        True           0.737     0.728  0.099

Recall @ 0.98 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.232     0.916         0.98
        True           0.951     0.031         1

Recall @ 0.9 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.044     1            0.947
        True           0.949     0.043        0.985

Recall @ 0.45 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.044     1            0.947
        True           0.855     0.406        0.456

Recall @ 0.15 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.044     1            0.947
        True           0.088     0.995        0.166


Turkish Wikipedia (trwiki)

https://ores.wmflabs.org/v2/scores/trwiki/damaging/?model_info

ScikitLearnClassifier
 - type: GradientBoosting
 - params: balanced_sample_weight=true, n_estimators=700, learning_rate=0.01, subsample=1.0, warm_start=false, verbose=0, loss="deviance", max_leaf_nodes=null, max_features="log2", min_weight_fraction_leaf=0.0, min_samples_split=2, presort="auto", init=null, min_samples_leaf=1, center=true, balanced_sample=false, scale=true, max_depth=7, random_state=null
 - version: 0.3.0
 - trained: 2017-01-06T23:24:25.903527

Table:
                 ~False    ~True
        -----  --------  -------
        False     16069     2688
        True        216      758

Accuracy: 0.853
Precision:
        -----  -----
        False  0.987
        True   0.22
        -----  -----

Recall:
        -----  -----
        False  0.857
        True   0.777
        -----  -----

PR-AUC:
        -----  -----
        False  0.992
        True   0.309
        -----  -----

ROC-AUC:
        -----  -----
        False  0.909
        True   0.914
        -----  -----

Recall @ 0.1 false-positive rate:
        label      threshold    recall    fpr
        -------  -----------  --------  -----
        False          0.758     0.798  0.096
        True           0.651     0.664  0.099

Recall @ 0.98 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.337     0.904         0.98
        True           0.928     0.021         1

Recall @ 0.9 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.067     1            0.951
        True           0.928     0.021        1

Recall @ 0.45 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.067     1            0.951
        True           0.872     0.168        0.459

Recall @ 0.15 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.067     1            0.951
        True           0.104     0.944        0.153


Wikidata (wikidatawiki)

https://ores.wmflabs.org/v2/scores/wikidatawiki/damaging/?model_info

ScikitLearnClassifier
 - type: GradientBoosting
 - params: balanced_sample=false, min_samples_split=2, verbose=0, warm_start=false, center=true, random_state=null, presort="auto", scale=true, balanced_sample_weight=true, min_weight_fraction_leaf=0.0, init=null, max_depth=7, max_leaf_nodes=null, min_samples_leaf=1, max_features="log2", n_estimators=700, loss="deviance", learning_rate=0.01, subsample=1.0
 - version: 0.3.0
 - trained: 2017-01-07T00:52:58.800624

Table:
                 ~False    ~True
        -----  --------  -------
        False     21094      690
        True        150     2498

Accuracy: 0.966
Precision:
        -----  -----
        False  0.993
        True   0.784
        -----  -----

Recall:
        -----  -----
        False  0.968
        True   0.943
        -----  -----

PR-AUC:
        -----  -----
        False  0.994
        True   0.885
        -----  -----

ROC-AUC:
        -----  -----
        False  0.986
        True   0.991
        -----  -----

Recall @ 0.1 false-positive rate:
        label      threshold    recall    fpr
        -------  -----------  --------  -----
        False          0.13      0.984  0.094
        True           0.124     0.988  0.092

Recall @ 0.98 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.05      0.986        0.98
        True           0.987     0.062        0.998

Recall @ 0.9 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.015     0.999        0.903
        True           0.945     0.497        0.923

Recall @ 0.45 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.012     1            0.894
        True           0.057     0.996        0.467

Recall @ 0.15 precision:
        label      threshold    recall    precision
        -------  -----------  --------  -----------
        False          0.012         1        0.894
        True           0.008         1        0.256

References

  1. See en:Wikipedia:Labels/Edit quality for the English Wikipedia manual labeling campaign.