User:AlgoAccountabilityBot/Enwiki Goodfaith Model Card

Enwiki Goodfaith Model Card

Qualitative Analysis

What is the motivation behind creating this model?

To prioritize review of potentially damaging edits and vandalism. The model predicts whether a given revision is damaging and returns a probability for each label as a measure of its confidence.

Who created this model?

Aaron Halfaker (aaron.halfaker@gmail.com) and Amir Sarabadani (amir.sarabadani@wikimedia.de).

Who currently owns/is responsible for this model?

WMF Machine Learning Team (ml@wikimediafoundation.org)

Who are the intended users of this model?

English Wikipedia uses the model as a service for facilitating efficient edit reviews. On an individual basis, anyone can submit a properly formatted API call to ORES for a given revision and get back this model's score, as in the sketch below.
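
As an illustration, here is a minimal sketch of such a call in Python using the requests library. The revision ID is a placeholder chosen for the example, and the response structure follows the ORES v3 scores API.

import requests

ORES_URL = "https://ores.wikimedia.org/v3/scores/enwiki/"

def get_goodfaith_score(rev_id):
    """Fetch this model's score for a single English Wikipedia revision."""
    response = requests.get(
        ORES_URL,
        params={"models": "goodfaith", "revids": str(rev_id)},
        timeout=30,
    )
    response.raise_for_status()
    data = response.json()
    # The score sits under enwiki -> scores -> <rev_id> -> goodfaith -> score.
    return data["enwiki"]["scores"][str(rev_id)]["goodfaith"]["score"]

# Example usage (placeholder revision ID):
# score = get_goodfaith_score(1234567)
# print(score["prediction"], score["probability"])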

What should this model be used for?

This model should be used for prioritizing the review and potential reversion of damaging edits and vandalism on English Wikipedia.

What should this model not be used for?

This model should not be used as an ultimate arbiter of whether or not an edit ought to be considered damaging. It should not be used on any English-language wiki other than English Wikipedia, nor on wikis in other languages.

What community approval processes has this model gone through?

English Wikipedia decided to use this model (note: it is unclear where or when this decision was made; a link to that discussion would be welcome). Over time, the model has been validated through its use in the community. The links below are just an example of what this product might look like.

Dates of consideration forums

What internal or external changes could make this model deprecated or no longer usable?

  • Data drift makes the model's training data no longer representative of current editing behavior.
  • The model no longer meets desired performance metrics in production.
  • The English Wikipedia community decides to stop using this model.

How should this model be licensed?

Creative Commons Attribution-ShareAlike 3.0

If this model is retrained, can we see how it has changed over time?

To my knowledge, this model has not been retrained over time; it still uses the original dataset from 2014-2015.

How does this model mitigate data drift?

This model does not mitigate data drift.

Which service(s) rely on this model?

This model is one of many models that power ORES, the Wikimedia Foundation's machine learning API.

Learn more about ORES here

Which dataset(s) does this model rely on?

This model was trained using hand-labeled training data from 2014-2015. It was tested on a small sample of data from a later hand-labeling campaign from 2015-2016.

The training dataset is available for download here

The test dataset is available for download here

Quantitative Analysis

How did the model perform on training data?

counts (n=19230):
		label        n         ~True    ~False
		-------  -----  ---  -------  --------
		True     18724  -->    18404       320
		False      506  -->      261       245
	rates:
		              True    False
		----------  ------  -------
		sample       0.974    0.026
		population   0.967    0.033
	match_rate (micro=0.937, macro=0.5):
		  True    False
		------  -------
		 0.968    0.032
	filter_rate (micro=0.063, macro=0.5):
		  True    False
		------  -------
		 0.032    0.968
	recall (micro=0.967, macro=0.734):
		  True    False
		------  -------
		 0.983    0.484
	!recall (micro=0.501, macro=0.734):
		  True    False
		------  -------
		 0.484    0.983
	precision (micro=0.966, macro=0.736):
		  True    False
		------  -------
		 0.982     0.49
	!precision (micro=0.506, macro=0.736):
		  True    False
		------  -------
		  0.49    0.982
	f1 (micro=0.966, macro=0.735):
		  True    False
		------  -------
		 0.983    0.487
	!f1 (micro=0.503, macro=0.735):
		  True    False
		------  -------
		 0.487    0.983
	accuracy (micro=0.967, macro=0.967):
		  True    False
		------  -------
		 0.967    0.967
	fpr (micro=0.499, macro=0.266):
		  True    False
		------  -------
		 0.516    0.017
	roc_auc (micro=0.924, macro=0.924):
		  True    False
		------  -------
		 0.924    0.924
	pr_auc (micro=0.979, macro=0.735):
		  True    False
		------  -------
		 0.997    0.474
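
As a short worked example of how to read the statistics above, the per-class and macro-averaged recall follow directly from the counts table. This is a sketch, not the exact ORES computation; ORES reweights some figures to the population rates listed above, so raw-sample numbers can differ slightly.

# Deriving the recall figures above from the counts table.
tp, fn = 18404, 320  # edits labeled True: predicted True / predicted False
fp, tn = 261, 245    # edits labeled False: predicted True / predicted False

recall_true = tp / (tp + fn)                     # 18404 / 18724 = 0.983
recall_false = tn / (fp + tn)                    # 245 / 506 = 0.484
macro_recall = (recall_true + recall_false) / 2  # 0.734

# Raw-sample accuracy; the table's 0.967 is weighted to the population
# rates (0.967 True / 0.033 False), so it differs slightly.
accuracy = (tp + tn) / (tp + fn + fp + tn)       # 0.970

print(f"{recall_true:.3f} {recall_false:.3f} {macro_recall:.3f} {accuracy:.3f}")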

How does the model perform on test/real-world data across different geographies, different devices, etc.?

Cohort abbreviations: All = all data; New = new editors (<1 year); Exp = experienced editors (>=1 year); Anon = anonymous editors; Named = named editors; Mob = mobile editors; Desk = desktop editors.

	metric                               All      New      Exp      Anon     Named    Mob      Desk
	---------------------------------    ------   ------   ------   ------   ------   ------   ------
	AUC score                            0.645    0.630    0.638    0.702    0.563    0.654    0.607
	Overall accuracy                     0.792    0.767    0.9      0.75     0.868    0.6      0.837
	Negative sample precision            0.272    0.272    0.0      0.272    0.0      0.25     0.285
	Negative sample recall               0.176    0.2      0.0      0.25     0.0      0.166    0.181
	Negative sample f1-score             0.214    0.230    0.0      0.260    0.0      0.2      0.222
	Negative sample support              17       15       2        12       5        6        11
	Positive sample precision            0.852    0.84     0.9      0.842    0.868    0.687    0.886
	Positive sample recall               0.910    0.887    1.0      0.857    1.0      0.785    0.933
	Positive sample f1-score             0.880    0.863    0.947    0.849    0.929    0.733    0.909
	Positive sample support              89       71       18       56       33       14       75
	True positives                       81       63       18       48       33       11       70
	True negatives                       3        3        0        3        0        1        2
	False positives                      14       12       2        9        5        5        9
	False negatives                      8        8        0        8        0        3        5
	True positive rate (sensitivity)     0.910    0.887    1.0      0.857    1.0      0.785    0.933
	True negative rate (specificity)     0.176    0.2      0.0      0.25     0.0      0.166    0.181
	False positive rate                  0.823    0.8      1.0      0.75     1.0      0.833    0.818
	False negative rate                  0.089    0.112    0.0      0.142    0.0      0.214    0.066
	Positive predictive value            0.852    0.84     0.9      0.842    0.868    0.687    0.886
	Negative predictive value            0.272    0.272    nan      0.272    nan      0.25     0.285

Model Information

What is the architecture of this model?

{
    "type": "GradientBoosting",
    "version": "0.5.1",
    "params": {
        "scale": true,
        "center": true,
        "labels": [
            true,
            false
        ],
        "multilabel": false,
        "population_rates": null,
        "ccp_alpha": 0.0,
        "criterion": "friedman_mse",
        "init": null,
        "learning_rate": 0.01,
        "loss": "deviance",
        "max_depth": 7,
        "max_features": "log2",
        "max_leaf_nodes": null,
        "min_impurity_decrease": 0.0,
        "min_impurity_split": null,
        "min_samples_leaf": 1,
        "min_samples_split": 2,
        "min_weight_fraction_leaf": 0.0,
        "n_estimators": 700,
        "n_iter_no_change": null,
        "presort": "deprecated",
        "random_state": null,
        "subsample": 1.0,
        "tol": 0.0001,
        "validation_fraction": 0.1,
        "verbose": 0,
        "warm_start": false
    }
}
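
The parameters above map onto scikit-learn's GradientBoostingClassifier; the surrounding keys ("scale", "center", "labels", "multilabel", "population_rates") appear to be settings of the wrapping library (revscoring) rather than of the estimator itself. A minimal sketch of building an equivalent estimator directly, under that assumption:

from sklearn.ensemble import GradientBoostingClassifier

# Estimator portion of the configuration above. "loss": "deviance" is the
# log-loss default; recent scikit-learn versions spell it "log_loss", so it
# is left at its default here.
clf = GradientBoostingClassifier(
    learning_rate=0.01,
    n_estimators=700,
    max_depth=7,
    max_features="log2",
    criterion="friedman_mse",
    subsample=1.0,
    min_samples_split=2,
    min_samples_leaf=1,
    validation_fraction=0.1,
    tol=0.0001,
)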

What is the score schema this model returns?

{
    "title": "Scikit learn-based classifier score with probability",
    "type": "object",
    "properties": {
        "prediction": {
            "description": "The most likely label predicted by the estimator",
            "type": "boolean"
        },
        "probability": {
            "description": "A mapping of probabilities onto each of the potential output labels",
            "type": "object",
            "properties": {
                "true": {
                    "type": "number"
                },
                "false": {
                    "type": "number"
                }
            }
        }
    }
}
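
For illustration, a score object conforming to this schema might look like the following; the probability values are invented for the example.

# An illustrative score matching the schema above; the numbers are made up
# and should sum to 1 across the two labels.
example_score = {
    "prediction": True,
    "probability": {
        "true": 0.92,
        "false": 0.08,
    },
}

# The predicted label is simply the most probable one.
assert example_score["prediction"] == (
    example_score["probability"]["true"] > example_score["probability"]["false"]
)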