Objective Revision Evaluation Service/wp10

From Meta, a Wikimedia project coordination wiki

Article quality in Wikipedia is of critical concern. Many wikis implement a featured article process (e.g., en:Wikipedia:Featured article) for identifying high-quality content, and many WikiProjects use quality assessments to prioritize and direct work. But this assessment work is extremely time-intensive, and assessments become out of date as articles change. The wp10 model predicts article quality to support these processes.

Contexts

English Wikipedia (enwiki)

See en:Wikipedia:Version_1.0_Editorial_Team/Assessment for discussion of the rating scale. See also work by Warncke-Wang et al.[1]
https://ores.wmflabs.org/v2/scores/enwiki/wp10/?model_info

ScikitLearnClassifier
 - type: RF
 - params: random_state=null, oob_score=false, warm_start=false, center=true, balanced_sample_weight=false, bootstrap=true, max_depth=null, criterion="entropy", balanced_sample=true, scale=true, min_weight_fraction_leaf=0.0, verbose=0, min_samples_split=2, min_samples_leaf=8, max_leaf_nodes=null, n_estimators=320, max_features="log2", class_weight=null, n_jobs=1
 - version: 0.3.1
 - trained: 2016-03-13T14:09:52.020303

Table:
	         ~B    ~C    ~FA    ~GA    ~Start    ~Stub
	-----  ----  ----  -----  -----  --------  -------
	B       378   263     85    164       123       13
	C       178   452     35    126       182        8
	FA       54     7    652    154         1        1
	GA       43    52    279    603         4        0
	Start    57   161      1     14       682      106
	Stub      1     7      0      0       123      873

Accuracy: 0.619
ROC-AUC:
	-------  -----
	'B'      0.829
	'C'      0.844
	'FA'     0.95
	'GA'     0.913
	'Start'  0.908
	'Stub'   0.984
	-------  -----

F1:
	-----  -----
	Start  0.639
	GA     0.591
	B      0.435
	Stub   0.871
	FA     0.679
	C      0.47
	-----  -----
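
The Accuracy and F1 figures above can be recomputed from the confusion table. The following is a minimal sketch, assuming rows are the labeled classes and columns (prefixed with ~) are the predicted classes; the function names are illustrative and not part of ORES.

```python
# Confusion table for enwiki, copied from the model information above.
# Assumption: rows are labeled classes, columns (~) are predicted classes.
LABELS = ["B", "C", "FA", "GA", "Start", "Stub"]
TABLE = [
    [378, 263, 85, 164, 123, 13],
    [178, 452, 35, 126, 182, 8],
    [54, 7, 652, 154, 1, 1],
    [43, 52, 279, 603, 4, 0],
    [57, 161, 1, 14, 682, 106],
    [1, 7, 0, 0, 123, 873],
]


def accuracy(table):
    """Share of observations on the diagonal (label equals prediction)."""
    correct = sum(table[i][i] for i in range(len(table)))
    total = sum(sum(row) for row in table)
    return correct / total


def f1_scores(labels, table):
    """Per-class F1, computed as 2*TP / (row total + column total)."""
    scores = {}
    for i, label in enumerate(labels):
        row_total = sum(table[i])
        col_total = sum(row[i] for row in table)
        scores[label] = 2 * table[i][i] / (row_total + col_total)
    return scores


print("Accuracy:", round(accuracy(TABLE), 3))
for label, score in f1_scores(LABELS, TABLE).items():
    print(label, round(score, 3))
```

Both statistics are unchanged if the table is transposed, so the row/column assumption does not affect these values. Rounded to three places, they agree with the figures reported above.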

French Wikipedia (frwiki)

See fr:Projet:Évaluation for a discussion of the rating scale.
https://ores.wmflabs.org/v2/scores/frwiki/wp10/?model_info

ScikitLearnClassifier
 - type: RF
 - params: warm_start=false, max_depth=null, criterion="gini", oob_score=false, random_state=null, class_weight=null, max_features="auto", center=true, n_estimators=501, verbose=0, min_weight_fraction_leaf=0.0, min_samples_split=2, bootstrap=true, n_jobs=1, balanced_sample=true, scale=true, min_samples_leaf=8, balanced_sample_weight=false, max_leaf_nodes=null
 - version: 0.2.0
 - trained: 2016-03-13T14:37:22.965656

Table:
	       ~a    ~adq    ~b    ~ba    ~bd    ~e
	---  ----  ------  ----  -----  -----  ----
	a      32      84    48     88     19     4
	adq    26     217    11     81      0     0
	b      23      21   144     43     65    11
	ba     30      83    21    171      2     2
	bd      2       1    50      3    161    56
	e       0       0    11      1     53   226

Accuracy: 0.531
ROC-AUC:
	-----  -----
	'a'    0.719
	'adq'  0.891
	'b'    0.838
	'ba'   0.848
	'bd'   0.905
	'e'    0.959
	-----  -----

F1:
	---  -----
	e    0.766
	a    0.165
	b    0.486
	adq  0.586
	ba   0.491
	bd   0.562
	---  -----
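
The low F1 for class a can be split into precision and recall using the confusion table. This is a minimal sketch, assuming rows are the labeled classes and columns (prefixed with ~) are the predicted classes; if the orientation is reversed, the two values swap. The helper name is illustrative only.

```python
# Confusion table for frwiki, copied from the model information above.
# Assumption: rows are labeled classes, columns (~) are predicted classes.
LABELS = ["a", "adq", "b", "ba", "bd", "e"]
TABLE = [
    [32, 84, 48, 88, 19, 4],
    [26, 217, 11, 81, 0, 0],
    [23, 21, 144, 43, 65, 11],
    [30, 83, 21, 171, 2, 2],
    [2, 1, 50, 3, 161, 56],
    [0, 0, 11, 1, 53, 226],
]


def precision_recall(labels, table, label):
    """Return (precision, recall) for one class."""
    i = labels.index(label)
    true_positives = table[i][i]
    predicted_total = sum(row[i] for row in table)  # column total
    labeled_total = sum(table[i])                   # row total
    return true_positives / predicted_total, true_positives / labeled_total


precision, recall = precision_recall(LABELS, TABLE, "a")
print("precision:", round(precision, 3))
print("recall:", round(recall, 3))
```

Under this assumption, the a row shows that most articles labeled a are predicted as adq (84) or ba (88) rather than a (32), which is what pulls recall down.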


Russian Wikipedia (ruwiki)

https://ores.wmflabs.org/v2/scores/ruwiki/wp10/?model_info

ScikitLearnClassifier
 - type: RF
 - params: n_estimators=501, oob_score=false, scale=true, min_samples_split=2, verbose=0, class_weight=null, random_state=null, criterion="gini", max_depth=null, n_jobs=1, min_weight_fraction_leaf=0.0, balanced_sample=true, min_samples_leaf=8, max_features="auto", max_leaf_nodes=null, balanced_sample_weight=false, center=true, warm_start=false, bootstrap=true
 - version: 0.0.1
 - trained: 2016-06-07T17:34:08.180792

Table:
	       ~I    ~II    ~III    ~IV    ~fa    ~ga    ~sa
	---  ----  -----  ------  -----  -----  -----  -----
	I      36     46      24      3     23     47     39
	II     28     75      31      6      8     20     37
	III     7     48     117     36      2      0     22
	IV      1      7      57    157      1      0      5
	fa      6      1       1      1    158     50      4
	ga     19      5       7      3     50    143     16
	sa      6     12       3      0      0     17    207

Accuracy: 0.561
ROC-AUC:
	-----  -----
	'I'    0.73
	'II'   0.782
	'III'  0.868
	'IV'   0.956
	'fa'   0.939
	'ga'   0.888
	'sa'   0.956
	-----  -----

F1:
	---  -----
	IV   0.724
	fa   0.683
	II   0.376
	I    0.224
	ga   0.55
	III  0.496
	sa   0.72
	---  -----
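
To see which classes the model mixes up most often, the off-diagonal cells of the confusion table can be ranked. This is a minimal sketch, assuming rows are the labeled classes and columns (prefixed with ~) are the predicted classes; `top_confusions` is a hypothetical helper, not an ORES function.

```python
# Confusion table for ruwiki, copied from the model information above.
# Assumption: rows are labeled classes, columns (~) are predicted classes.
LABELS = ["I", "II", "III", "IV", "fa", "ga", "sa"]
TABLE = [
    [36, 46, 24, 3, 23, 47, 39],
    [28, 75, 31, 6, 8, 20, 37],
    [7, 48, 117, 36, 2, 0, 22],
    [1, 7, 57, 157, 1, 0, 5],
    [6, 1, 1, 1, 158, 50, 4],
    [19, 5, 7, 3, 50, 143, 16],
    [6, 12, 3, 0, 0, 17, 207],
]


def top_confusions(labels, table, n=3):
    """Return the n largest off-diagonal cells as (count, row, column)."""
    cells = [
        (table[i][j], labels[i], labels[j])
        for i in range(len(labels))
        for j in range(len(labels))
        if i != j
    ]
    # Sort by count only; ties keep their original table order.
    cells.sort(key=lambda cell: cell[0], reverse=True)
    return cells[:n]


for count, row_label, col_label in top_confusions(LABELS, TABLE):
    print(f"{row_label} -> ~{col_label}: {count}")
```

For this table the largest cells are IV/III (57) and the two fa/ga cells (50 each), so the heaviest confusion falls between those pairs of classes.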

References

  1. Appendix A in Warncke-Wang, M., Ayukaev, V. R., Hecht, B., and Terveen, L. (2015). The Success and Failure of Quality Improvement Projects in Peer Production Communities. CSCW. See also Warncke-Wang, M., Cosley, D., and Riedl, J. (2013). Tell Me More: An Actionable Quality Model for Wikipedia. OpenSym/WikiSym.