Research talk:Automated classification of article importance/Work log/2017-02-22

From Meta, a Wikimedia project coordination wiki

Thursday, February 23, 2017

Did some debugging of the category assessment script and implemented a check for whether pages exist, so we're not submitting invalid revision IDs to ORES. WikiProject Medicine now has a page listing Start-class articles that are good candidates for reassessment. Looking forward to hearing about their experiences with it.
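For reference, the existence check could look roughly like the sketch below. It is not the actual script: the function name `existing_revision_ids` is hypothetical, and it assumes the page data was fetched from the MediaWiki API with `action=query&prop=info&formatversion=2`, where nonexistent pages carry a `missing` flag and existing pages carry `lastrevid`.

```python
def existing_revision_ids(api_response):
    """Map page title -> latest revision ID, skipping pages that do not exist.

    Assumption: `api_response` is the parsed JSON of a MediaWiki
    `action=query&prop=info&formatversion=2` request. The function name
    is hypothetical and used here for illustration only.
    """
    pages = api_response.get("query", {}).get("pages", [])
    revids = {}
    for page in pages:
        # Nonexistent pages are flagged `missing`; malformed titles `invalid`.
        if page.get("missing") or page.get("invalid"):
            continue
        revid = page.get("lastrevid")
        if revid:  # only keep pages that actually have a revision to score
            revids[page["title"]] = revid
    return revids
```

Only the revision IDs returned here would then be passed on to ORES, so deleted or never-created pages are dropped before scoring.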

Created a Quarry query for the number of articles (technically talk pages) in each of the importance categories. The results suggest we'll be processing a few million pages, but since everything is keyed on page IDs, it should be straightforward to batch-process them, much as we do for SuggestBot's link queries. Given those counts, I'm also cautiously optimistic that a dataset of pages with unanimous importance ratings from multiple WikiProjects will be sufficiently large.
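A minimal sketch of the two ideas in the paragraph above, under stated assumptions: `batches` splits a stream of page IDs into fixed-size chunks (the default of 50 is an arbitrary placeholder, not a figure from this project), and `unanimous_ratings` keeps pages where at least two WikiProjects gave the same importance rating. Both function names and the `(page_id, project, importance)` tuple layout are hypothetical, not taken from SuggestBot or the actual scripts.

```python
from collections import defaultdict
from itertools import islice


def batches(page_ids, size=50):
    """Yield lists of at most `size` page IDs (the size is an assumed placeholder)."""
    it = iter(page_ids)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def unanimous_ratings(ratings, min_projects=2):
    """Return {page_id: importance} for pages rated identically by several projects.

    `ratings` is an iterable of (page_id, project, importance) tuples;
    this layout is an assumption made for the sketch.
    """
    by_page = defaultdict(dict)
    for page_id, project, importance in ratings:
        by_page[page_id][project] = importance

    result = {}
    for page_id, per_project in by_page.items():
        distinct = set(per_project.values())
        # Keep the page only if enough projects rated it and they all agree.
        if len(per_project) >= min_projects and len(distinct) == 1:
            result[page_id] = distinct.pop()
    return result
```

Because both functions accept any iterable, the page IDs or ratings can be streamed from a query result without loading several million rows into memory first.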

I'm not satisfied with the definition of reputation/authoritativeness in the literature review, and have started digging through the information science literature to see whether it provides any solid definitions. Have not found anything conclusive yet. Might also have to pull in the Stanford Encyclopedia of Philosophy with regard to authority in the political sense, as that seems relevant here.