Research talk:Measuring the effect of cross-linking missing articles in English Wikivoyage/Work log/2019-05-24


Friday, May 24, 2019

Today I'm starting by building a good dataset. I want to compare article creation rates over time, broken down by who created the article: anonymous editors versus registered editors (newcomers and experienced editors alike). To that end, I've put together the following page creation query: https://quarry.wmflabs.org/query/36378

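USE enwikivoyage_p;
-- First revision (rev_parent_id = 0) of every main-namespace page created since the start of 2016.
SELECT rev_id, rev_timestamp, page_id, page_title, user_id, user_registration, ug_group
FROM revision_userindex
INNER JOIN actor ON rev_actor = actor_id
INNER JOIN page ON rev_page = page_id
-- Anonymous creators have no user row, so user_id is NULL for them.
LEFT JOIN user ON actor_user = user_id
-- ug_group is 'bot' for flagged bots and NULL for everyone else.
LEFT JOIN user_groups ON ug_user = user_id AND ug_group = 'bot'
WHERE rev_parent_id = 0 AND page_namespace = 0 AND rev_timestamp > '2016';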
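As a rough illustration of how the result set could be split by creator type, here is a minimal Python sketch. It assumes each row has been loaded as a dict keyed by the column names in the query, with timestamps already decoded to `YYYYMMDDHHMMSS` strings. The 30-day newcomer window, the handling of a missing registration date, and the function names are all assumptions made for this example, not part of the log.

```python
from collections import Counter
from datetime import datetime, timedelta

# Assumed threshold for "newcomer"; the work log does not define one.
NEWCOMER_WINDOW = timedelta(days=30)


def parse_ts(ts):
    """Parse a MediaWiki timestamp string such as '20190524145900'."""
    return datetime.strptime(ts, "%Y%m%d%H%M%S")


def editor_class(row):
    """Classify the creator of one page-creation row."""
    if row["user_id"] is None:
        return "anon"
    if row["ug_group"] == "bot":
        return "bot"
    registered = row["user_registration"]
    if registered is None:
        # Assumption: a missing registration date is treated as an old account.
        return "experienced"
    account_age = parse_ts(row["rev_timestamp"]) - parse_ts(registered)
    return "newcomer" if account_age <= NEWCOMER_WINDOW else "experienced"


def monthly_counts(rows):
    """Count page creations per (YYYY-MM, editor class) pair."""
    counts = Counter()
    for row in rows:
        month = parse_ts(row["rev_timestamp"]).strftime("%Y-%m")
        counts[(month, editor_class(row))] += 1
    return counts


# Hypothetical rows, for illustration only.
rows = [
    {"rev_timestamp": "20190102030405", "user_id": None,
     "user_registration": None, "ug_group": None},
    {"rev_timestamp": "20190110000000", "user_id": 7,
     "user_registration": "20190105000000", "ug_group": None},
    {"rev_timestamp": "20190215000000", "user_id": 8,
     "user_registration": "20150101000000", "ug_group": None},
    {"rev_timestamp": "20190216000000", "user_id": 9,
     "user_registration": "20180101000000", "ug_group": "bot"},
]
counts = monthly_counts(rows)
```

Keeping bots as their own class makes it easy to drop them later without rerunning the query. The monthly counts per class are then the series that the rows returned by the query feed into.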

The query above returns the first revision of every main-namespace page created since the start of 2016, i.e. one row per article creation. We should be able to fit a time series to these creations, compare the forecasted article creation rate with the actual rate, and use any gap between the two as evidence of a shift. --Halfak (WMF) (talk) 14:59, 24 May 2019 (UTC)
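The forecast-versus-actual comparison could be sketched roughly as follows. The log does not say which time-series model will be used, so this example substitutes a plain least-squares linear trend as a stand-in. The monthly counts, the split into a baseline and an observed period, and the function names are all hypothetical.

```python
def fit_linear_trend(counts):
    """Least-squares fit of counts[i] ~ intercept + slope * i (needs >= 2 points)."""
    n = len(counts)
    mean_x = (n - 1) / 2
    mean_y = sum(counts) / n
    sxx = sum((i - mean_x) ** 2 for i in range(n))
    sxy = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(counts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def forecast(intercept, slope, start, steps):
    """Extend the fitted trend for `steps` periods beginning at index `start`."""
    return [intercept + slope * (start + k) for k in range(steps)]


# Hypothetical monthly article-creation counts, for illustration only.
baseline = [10, 12, 14, 16, 18, 20]   # months used to fit the trend
observed = [25, 28, 30]               # later months to compare against

intercept, slope = fit_linear_trend(baseline)
predicted = forecast(intercept, slope, start=len(baseline), steps=len(observed))

# Positive residuals mean more articles were created than the trend predicts.
residuals = [actual - expected for actual, expected in zip(observed, predicted)]
```

A linear trend ignores seasonality and gives no uncertainty interval, so it only shows the shape of the comparison. Any forecasting model that produces a `predicted` series for the later period could be swapped in without changing the residual step.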