Research talk:Automated classification of edit types/Work log/2016-04-13

From Meta, a Wikimedia project coordination wiki

Wednesday, April 13, 2016

Loading in a new inter-rater reliability (IRR) set and the full 5k run.

u_wikilabels=> select id, wiki, name from campaign where active;
 id |     wiki     |                         name                          
----+--------------+-------------------------------------------------------
  4 | enwiki       | Edit quality (20k random sample, 2015)
  8 | azwiki       | Edit quality (20k random sample, 2015)
  5 | trwiki       | Değişiklik kalitesi (20,000 rastgele örnekleme, 2015)
  7 | ptwiki       | Qualidade das edições (amostra de 20k revisões, 2015)
  9 | frwiki       | Modifier la qualité (20k échantillon aléatoire, 2015)
 12 | eswiki       | Editar calidad (20k muestra aleatoria, 2015)
 14 | nlwiki       | Kwaliteit bewerken (20k steekproef, 2015)
 10 | ruwiki       | Качество правок (20-тыс. случайная выборка, 2015)
 11 | ukwiki       | Якість редагувань (вибірка випадкових 20 тис., 2015)
 15 | jawiki       | 編集品質( 20Kランダムサンプル)
 16 | dewiki       | Qualität Edit ( 20k Zufallsstichprobe )
 17 | etwiki       | Edit kvaliteet ( 20k juhuslik valim )
 18 | itwiki       | Qualità degli edit (campione casuale di 20k edit)
 19 | wikidatawiki | Edit quality (20k balanced sampled)
 13 | idwiki       | Kualitas suntingan (20k sampel acak, 2015)
 21 | fawiki       | کیفیت ویرایش نسخه ۲ (نمونه تصادفی ۲۰ هزارتایی،۲۰۱۵)
 23 | urwiki       | معیار ترمیم کریں ( 5K متوازن )
 24 | plwiki       | Edycja jakości (20k próba losowa, 2015)
 25 | hewiki       | איכות ערוכה ( 5k מאוזן )
 26 | viwiki       | Sửa chất lượng ( 5k cân bằng)
 27 | nowiki       | Edit kvalitet ( 5k balansert)
 28 | enwiki       | Edit type training (50 revisions)
 29 | itwiki       | Tipologia degli edit (100 edit)
 30 | arwiki       | عدل التصنيف في ال 20 ألف عينة
(24 rows)
u_wikilabels=> INSERT INTO campaign (name, wiki, form, view, created, labels_per_task, tasks_per_assignment, active) VALUES ('Edit type IRR (200 revisions)', 'enwiki', 'edit_type', 'DiffToPrevious', NOW(), 20, 10, True);
INSERT 0 1
u_wikilabels=> select id, wiki, name from campaign where active;
 id |     wiki     |                         name                          
----+--------------+-------------------------------------------------------
  4 | enwiki       | Edit quality (20k random sample, 2015)
  8 | azwiki       | Edit quality (20k random sample, 2015)
  5 | trwiki       | Değişiklik kalitesi (20,000 rastgele örnekleme, 2015)
  7 | ptwiki       | Qualidade das edições (amostra de 20k revisões, 2015)
  9 | frwiki       | Modifier la qualité (20k échantillon aléatoire, 2015)
 12 | eswiki       | Editar calidad (20k muestra aleatoria, 2015)
 14 | nlwiki       | Kwaliteit bewerken (20k steekproef, 2015)
 10 | ruwiki       | Качество правок (20-тыс. случайная выборка, 2015)
 11 | ukwiki       | Якість редагувань (вибірка випадкових 20 тис., 2015)
 15 | jawiki       | 編集品質( 20Kランダムサンプル)
 16 | dewiki       | Qualität Edit ( 20k Zufallsstichprobe )
 17 | etwiki       | Edit kvaliteet ( 20k juhuslik valim )
 18 | itwiki       | Qualità degli edit (campione casuale di 20k edit)
 19 | wikidatawiki | Edit quality (20k balanced sampled)
 13 | idwiki       | Kualitas suntingan (20k sampel acak, 2015)
 21 | fawiki       | کیفیت ویرایش نسخه ۲ (نمونه تصادفی ۲۰ هزارتایی،۲۰۱۵)
 23 | urwiki       | معیار ترمیم کریں ( 5K متوازن )
 24 | plwiki       | Edycja jakości (20k próba losowa, 2015)
 25 | hewiki       | איכות ערוכה ( 5k מאוזן )
 26 | viwiki       | Sửa chất lượng ( 5k cân bằng)
 27 | nowiki       | Edit kvalitet ( 5k balansert)
 28 | enwiki       | Edit type training (50 revisions)
 29 | itwiki       | Tipologia degli edit (100 edit)
 30 | arwiki       | عدل التصنيف في ال 20 ألف عينة
 31 | enwiki       | Edit type IRR (200 revisions)
$ cat ../datasets/enwiki.revision_sample.200_hand-picked.tsv | /srv/wikilabels/venv/bin/wikilabels task_inserts 31 | psql -h wikilabels-database --user u_wikilabels u_wikilabels -W 
Password for user u_wikilabels: 
INSERT 0 200

OK. Now to disable the old campaign (id 28, "Edit type training (50 revisions)").

u_wikilabels=> update campaign set active = False where id = 28;
UPDATE 1
u_wikilabels=> select id, wiki, name from campaign where active;
 id |     wiki     |                         name                          
----+--------------+-------------------------------------------------------
  4 | enwiki       | Edit quality (20k random sample, 2015)
  8 | azwiki       | Edit quality (20k random sample, 2015)
  5 | trwiki       | Değişiklik kalitesi (20,000 rastgele örnekleme, 2015)
  7 | ptwiki       | Qualidade das edições (amostra de 20k revisões, 2015)
  9 | frwiki       | Modifier la qualité (20k échantillon aléatoire, 2015)
 12 | eswiki       | Editar calidad (20k muestra aleatoria, 2015)
 14 | nlwiki       | Kwaliteit bewerken (20k steekproef, 2015)
 10 | ruwiki       | Качество правок (20-тыс. случайная выборка, 2015)
 11 | ukwiki       | Якість редагувань (вибірка випадкових 20 тис., 2015)
 15 | jawiki       | 編集品質( 20Kランダムサンプル)
 16 | dewiki       | Qualität Edit ( 20k Zufallsstichprobe )
 17 | etwiki       | Edit kvaliteet ( 20k juhuslik valim )
 18 | itwiki       | Qualità degli edit (campione casuale di 20k edit)
 19 | wikidatawiki | Edit quality (20k balanced sampled)
 13 | idwiki       | Kualitas suntingan (20k sampel acak, 2015)
 21 | fawiki       | کیفیت ویرایش نسخه ۲ (نمونه تصادفی ۲۰ هزارتایی،۲۰۱۵)
 23 | urwiki       | معیار ترمیم کریں ( 5K متوازن )
 24 | plwiki       | Edycja jakości (20k próba losowa, 2015)
 25 | hewiki       | איכות ערוכה ( 5k מאוזן )
 26 | viwiki       | Sửa chất lượng ( 5k cân bằng)
 27 | nowiki       | Edit kvalitet ( 5k balansert)
 29 | itwiki       | Tipologia degli edit (100 edit)
 30 | arwiki       | عدل التصنيف في ال 20 ألف عينة
 31 | enwiki       | Edit type IRR (200 revisions)
(24 rows)

OK. Ready. Now for the big one that'll be disabled from the start.

u_wikilabels=> INSERT INTO campaign (name, wiki, form, view, created, labels_per_task, tasks_per_assignment, active) VALUES ('Edit type (5k revisions)', 'enwiki', 'edit_type', 'DiffToPrevious', NOW(), 1, 10, False) RETURNING *;
INSERT 0 1

OK. Time to load it.

$ cat ../datasets/enwiki.revision_sample.5k_representative.tsv | /srv/wikilabels/venv/bin/wikilabels task_inserts 32 | psql -h wikilabels-database --user u_wikilabels u_wikilabels -W 
Password for user u_wikilabels: 
INSERT 0 5000
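
For reference, the flow in this entry (create a campaign, load its tasks, flip `active` later) can be sketched in Python against an in-memory SQLite database. This is a toy model under stated assumptions, not the Wiki Labels code: the `task` table layout, the helper names, and the placeholder revision ids are all hypothetical, and only the `campaign` columns used in the statements above are modelled.

```python
import sqlite3

# Simplified stand-in for the real schema. The `task` table layout is an
# assumption; only campaign columns that appear in this log are included.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE campaign (
        id INTEGER PRIMARY KEY,
        name TEXT, wiki TEXT, form TEXT, view TEXT,
        labels_per_task INTEGER, tasks_per_assignment INTEGER,
        active INTEGER
    );
    CREATE TABLE task (
        id INTEGER PRIMARY KEY,
        campaign_id INTEGER REFERENCES campaign(id),
        rev_id INTEGER
    );
""")


def create_campaign(name, labels_per_task, active):
    """Insert a campaign row and return its generated id."""
    cur = conn.execute(
        "INSERT INTO campaign (name, wiki, form, view, labels_per_task,"
        " tasks_per_assignment, active)"
        " VALUES (?, 'enwiki', 'edit_type', 'DiffToPrevious', ?, 10, ?)",
        (name, labels_per_task, int(active)),
    )
    return cur.lastrowid


def load_tasks(campaign_id, rev_ids):
    """Attach one task per revision id to the given campaign."""
    conn.executemany(
        "INSERT INTO task (campaign_id, rev_id) VALUES (?, ?)",
        [(campaign_id, rev_id) for rev_id in rev_ids],
    )


def task_count(campaign_id):
    """Count the tasks loaded for a campaign."""
    row = conn.execute(
        "SELECT COUNT(*) FROM task WHERE campaign_id = ?", (campaign_id,)
    ).fetchone()
    return row[0]


def set_active(campaign_id, active):
    """Enable or disable a campaign."""
    conn.execute(
        "UPDATE campaign SET active = ? WHERE id = ?",
        (int(active), campaign_id),
    )


def active_campaigns():
    """Return the names of active campaigns, ordered by id."""
    rows = conn.execute("SELECT name FROM campaign WHERE active ORDER BY id")
    return [row[0] for row in rows]


# IRR campaign: active immediately, 20 labels per task.
irr_id = create_campaign("Edit type IRR (200 revisions)", 20, True)
load_tasks(irr_id, range(1000, 1200))  # placeholder revision ids

# Full run: loaded while inactive, 1 label per task.
full_id = create_campaign("Edit type (5k revisions)", 1, False)
load_tasks(full_id, range(5000, 10000))  # placeholder revision ids

print(active_campaigns())  # only the IRR campaign at this point
set_active(full_id, True)  # the "swing back and enable" step
print(active_campaigns())
```

Capturing the generated id from the insert avoids reading it off the campaign listing by hand, and loading tasks while the campaign is inactive keeps it out of labelers' view until the flag is flipped.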

And that's it. We'll just swing back to enable this once we're ready. --Halfak (WMF) (talk) 15:44, 13 April 2016 (UTC)