Objective Revision Evaluation Service/What
ORES, the Objective Revision Evaluation Service, is a web service that provides access to various prediction models. A user of the service provides ORES with the name of a wiki, a model to apply, and a revision ID to apply the model to. The service responds with a "score" in the form of a JSON document, a data format that is both machine- and human-readable. These "scores" are generally predictions about some characteristic of the revision.
Using the service
"642215410": {
"prediction": true,
"probability": {
"false": 0.047408808815936794,
"true": 0.9525911911840632
}
}
ORES response. ORES predicts that this edit is damaging to the article.
ORES is a web service. You can access it via your web browser or make requests to it from a script, bot, or tool. #ORES url structure provides a labeled example of a request to apply the damaging model to revision 642215410 in the English Wikipedia (enwiki). #Diff of 642215410 shows a diff of the revision itself. #ORES response shows that ORES predicts (with ~95% certainty) that the edit is damaging.
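A minimal sketch of making that request from a script, assuming the URL pattern shown in #ORES url structure and Python's third-party "requests" library:

import requests

# Apply the damaging model to revision 642215410 on English Wikipedia.
url = "https://ores.wmflabs.org/scores/enwiki/damaging/642215410/"
response = requests.get(url)
response.raise_for_status()

score = response.json()["642215410"]
print(score["prediction"])           # True: the edit is probably damaging
print(score["probability"]["true"])  # ~0.95, the model's confidence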
Currently the system serves models that predict the quality of an edit (Will it need to be reverted? Is it damaging? Was it made in good faith?) and the quality of an article as of a given revision (Which article quality assessment rating is most likely correct?). Models for predicting edit types are in development.
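The other models should be reachable under the same URL pattern by swapping the model name in the path. The model names in this illustrative loop (reverted, goodfaith, and wp10 for article quality) reflect common ORES naming, but the exact set available varies by wiki, so treat them as assumptions rather than a definitive list:

import requests

rev_id = "642215410"
for model in ("reverted", "goodfaith", "wp10"):
    url = "https://ores.wmflabs.org/scores/enwiki/%s/%s/" % (model, rev_id)
    print(model, requests.get(url).json()[rev_id]["prediction"])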
Supervised machine learning -- in a nutshell
ORES' prediction models are trained by showing them many observations that have been labeled by humans. The prediction models are then used to make predictions about new observations. This process is commonly referred to as supervised machine learning. #Supervised machine learning data flow shows the flow of data in a step-by-step process used to "train" and "test" a "prediction model". From the diagram (a code sketch of the same flow follows the list):
- Humans provide judgements via labels (e.g. damaging or not, FA-class/Stub-class article, etc.). These judgements are collected into a set of "labeled observations"
- Labeled observations are split into a "training set" and "test set". The "training set" is provided to the machine learner to learn from. The "test set" is held aside for future use.
- The machine learner uses the "training set" to learn how to predict the provided labels (i.e. learns to replicate human judgements) and a "prediction model" is produced. This prediction model can be used to automatically label new observations.
- The previously withheld "test set" is given to the "prediction model" and the "predicted labels" are compared to the "human labels" to generate statistics about the model's performance.
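The sketch below walks through the same four steps using scikit-learn and made-up data. It is a generic illustration of the train/test flow, not ORES's actual training pipeline, and every variable in it is hypothetical:

import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
features = rng.normal(size=(1000, 4))                           # stand-in for per-revision measurements
labels = features[:, 0] + rng.normal(scale=0.5, size=1000) > 0  # stand-in for human judgements

# Steps 1-2: split the labeled observations into a training set and a withheld test set.
X_train, X_test, y_train, y_test = train_test_split(
    features, labels, test_size=0.25, random_state=0)

# Step 3: the machine learner fits the training set, producing a prediction model.
model = GradientBoostingClassifier().fit(X_train, y_train)

# Step 4: the withheld test set is scored and the predicted labels are compared
# to the human labels, generating the model's performance statistics.
print(classification_report(y_test, model.predict(X_test)))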
Note that both the "prediction model" and the "stats" from testing are outlined. This is because the two are kept together and used within the ORES system. You can check a model's statistics by dropping the revision ID from the path. E.g. https://ores.wmflabs.org/scores/enwiki/damaging/
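A sketch of checking a model's statistics that way; the exact shape of the returned document is not guaranteed here, so this simply prints whatever ORES reports for the model:

import requests

# Request the model path without a revision ID to get model info and test stats.
info = requests.get("https://ores.wmflabs.org/scores/enwiki/damaging/").json()
print(info)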