Research talk:Revision scoring as a service/Work log/2016-02-09

From Meta, a Wikimedia project coordination wiki

General model

        revscoring train_test \
                revscoring.scorer_models.RF \
                wb_vandalism.feature_lists.experimental.general \
                --version 0.0.1 \
                -p 'max_features="log2"' \
                -p 'criterion="entropy"' \
                -p 'min_samples_leaf=1' \
                -p 'n_estimators=80' \
                -s 'pr' -s 'roc' \
                -s 'recall_at_fpr(max_fpr=0.10)' \
                -s 'filter_rate_at_recall(min_recall=0.90)' \
                -s 'filter_rate_at_recall(min_recall=0.75)' \
                --balance-sample-weight \
                --center --scale \
                --label-type=bool > \
        models/models/wikidata.reverted.general.rf.model
2016-02-09 15:09:16,584 INFO:revscoring.utilities.train_test -- Training model...
2016-02-09 15:10:51,419 INFO:revscoring.utilities.train_test -- Testing model...
ScikitLearnClassifier
 - type: RF
 - params: max_features="log2", max_depth=null, n_estimators=80, balanced_sample_weight=true, random_state=null, bootstrap=true, oob_score=false, warm_start=false, min_samples_split=2, scale=true, class_weight=null, max_leaf_nodes=null, center=true, criterion="entropy", min_weight_fraction_leaf=0.0, n_jobs=1, verbose=0, min_samples_leaf=1
 - version: 0.0.1
 - trained: 2016-02-09T15:10:51.415906

         ~False    ~True
-----  --------  -------
False     98365      762
True        139        6

Accuracy: 0.9909239261826094

Filter rate @ 0.75 recall: threshold=0.0, filter_rate=0.0, recall=1.0
ROC-AUC: 0.815
PR-AUC: 0.015
Recall @ 0.1 false-positive rate: threshold=None, recall=None, fpr=None
Filter rate @ 0.9 recall: threshold=0.0, filter_rate=0.0, recall=1.0
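
As a sanity check on the output above, the reported accuracy can be recomputed from the confusion matrix. This is a minimal sketch, assuming that rows are the actual labels and the `~` columns are the predicted labels; the variable names are illustrative and not part of revscoring.

```python
# Counts copied from the confusion matrix in the test output above.
# Assumption: rows are actual labels, "~" columns are predicted labels.
tn, fp = 98365, 762  # actual False: predicted ~False, predicted ~True
fn, tp = 139, 6      # actual True:  predicted ~False, predicted ~True

total = tn + fp + fn + tp

accuracy = (tn + tp) / total       # share of all test edits classified correctly
recall = tp / (tp + fn)            # share of actual-True edits predicted ~True
precision = tp / (tp + fp)         # share of predicted-~True edits that are actually True
positive_rate = (tp + fn) / total  # share of test edits labeled True

print(f"total={total}")
print(f"accuracy={accuracy:.6f}")
print(f"recall={recall:.4f}")
print(f"precision={precision:.4f}")
print(f"positive_rate={positive_rate:.5f}")
```

Under that assumption, only 145 of the 99272 test edits are labeled True, so the accuracy figure is dominated by the majority class and is compatible with the low PR-AUC reported above.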

Humbly Amir (talk) 15:51, 9 February 2016 (UTC)