Research talk:Revision scoring as a service/Work log/2016-01-29
Friday, January 29, 2016
OKAY! Today I'm going to load a new labeling campaign into Wiki labels for fawiki so that we can get some more observations of damage for training and testing.
So, I've already built a sample. See https://quarry.wmflabs.org/query/7046. Then I ran the prelabeling script to filter out all edits by trusted users and to flag for review any edits that were saved by blocked users, were reverted, or are otherwise unknown. With this, we've filtered the 20k revisions down to 3156.
Now, time to make a new campaign. Luckily, I remembered to ask User:Ladsgroup to give me a campaign name so that I don't have to hack something together with Google Translate:
'کیفیت ویرایش نسخه ۲ (نمونه تصادفی ۲۰ هزارتایی،۲۰۱۵)'.
u_wikilabels=> INSERT INTO campaign (name, wiki, form, view, created, labels_per_task, tasks_per_assignment, active) VALUES ('کیفیت ویرایش نسخه ۲ (نمونه تصادفی ۲۰ هزارتایی،۲۰۱۵)', 'fawiki', 'damaging_and_goodfaith', 'DiffToPrevious', NOW(), 1, 50, True);
INSERT 0 1
OK. Now time to load in the tasks.
cat fawiki.prelabeled_revisions.2.20k_2015.tsv | grep -P "rev_id|[0-9]+\tTrue" | /srv/wikilabels/venv/bin/wikilabels task_inserts 21 | psql -h wikilabels-database --user u_wikilabels u_wikilabels -W
Password for user u_wikilabels:
INSERT 0 3156
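For anyone repeating this, here is a minimal sketch of what the grep filter in that pipeline keeps, run on a tiny made-up sample. The two-column layout and the `needs_review` column name are assumptions for illustration only; the real TSV may have a different header and more columns. It needs GNU grep for `-P`. The same counting step, pointed at the real file, should agree with the number in the INSERT output.

```shell
# Hypothetical three-row sample; the real column names and layout are assumed.
sample=$(printf 'rev_id\tneeds_review\n101\tTrue\n102\tFalse\n103\tTrue\n')

# Same pattern as in the log: keep the header line plus rows where a
# numeric field is followed by a tab and the literal True.
filtered=$(printf '%s\n' "$sample" | grep -P "rev_id|[0-9]+\tTrue")

# Count data rows (everything after the header) before piping anywhere.
task_count=$(printf '%s\n' "$filtered" | tail -n +2 | wc -l | tr -d ' ')
echo "$task_count"   # prints 2
```

Keeping the header row in the grep pattern matters if the downstream tool reads column names from the first line, which is presumably why `rev_id` is part of the pattern.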