Wiki List Tool

From Meta, a Wikimedia project coordination wiki

The Wiki List Tool is a Google spreadsheet that can use the Wikipedia/Wikidata/Commons API to evaluate lists of topics (usually names of people) to do quick reconciliation and checking for content on Wikidata and Commons. It is typically used to research a "work list" of people for edit-a-thons, and to help evaluate the quality of information the Wikimedia projects have. It can also be useful for photographers wanting to research which portrait photos are missing or need upgrading.

It was created by Andrew Lih in 2018 and has lived on in various forms, with a number of GLAM institutions using it for planning purposes.

For more advanced work, OpenRefine offers a more comprehensive reconciliation process, but it requires more training and expertise. This tool works well for lists of up to about 200 names; beyond that, the API limits within Google Docs are reached.

What

Given a name in the first column, the tool uses formulas and API calls to:

  • Determine whether the person's name matches an English Wikipedia article name (e.g. Marie Curie, or Mary Brown (author))
  • Report Wikipedia article quality as the ORES score of the latest revision, shown in a color-coded cell
  • Display the Wikidata Q number
  • Display the Wikidata description and occupation, to help verify the item is the right person
  • Display the Wikidata item's Commons image, if it exists
  • Display the page traffic for 30-day and 1-year periods
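
As a rough illustration of the first and third lookups in the list above, the sketch below checks whether a name matches an English Wikipedia article and reads the linked Wikidata Q number from the MediaWiki Action API (`prop=pageprops`, `ppprop=wikibase_item`). It is not the tool's actual code: the helper names `buildTitleLookupUrl`, `parseTitleLookup`, and `WIKI_QID_SKETCH` are hypothetical, and the wrapper assumes it runs as a custom function inside Google Sheets, where `UrlFetchApp` is available.

```javascript
// Hypothetical helper names; only the Action API parameters are real.
const WIKI_API = "https://en.wikipedia.org/w/api.php";

// Build a query asking for the Wikidata item linked to an article title.
function buildTitleLookupUrl(name) {
  const params = {
    action: "query",
    prop: "pageprops",
    ppprop: "wikibase_item",
    redirects: "1", // follow redirects to the canonical article title
    titles: name,
    format: "json",
  };
  const query = Object.keys(params)
    .map((key) => key + "=" + encodeURIComponent(params[key]))
    .join("&");
  return WIKI_API + "?" + query;
}

// Read the first page in the response: report whether the article exists
// and, if it does, which Q number it is linked to.
function parseTitleLookup(response) {
  const pages = (response.query && response.query.pages) || {};
  const page = Object.values(pages)[0];
  if (!page || "missing" in page || "invalid" in page) {
    return { exists: false, title: null, qid: null };
  }
  const qid = (page.pageprops && page.pageprops.wikibase_item) || null;
  return { exists: true, title: page.title, qid: qid };
}

// Apps Script wrapper (assumption: runs inside Google Sheets).
function WIKI_QID_SKETCH(name) {
  const text = UrlFetchApp.fetch(buildTitleLookupUrl(name)).getContentText();
  const result = parseTitleLookup(JSON.parse(text));
  if (!result.exists) {
    return "no article";
  }
  return result.qid || "no Wikidata item";
}
```

In a sheet, a cell formula such as `=WIKI_QID_SKETCH(A2)` would then fill in the Q number for the name in the first column. Splitting URL building and response parsing from the network call keeps the logic easy to check outside of Google Sheets.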

Example

An example of a Google Sheet that uses MediaWiki API calls in cool ways. More to come.

  • Wiki List Tool - WikiAPA 2022 - Smithsonian Institution work list for an edit-a-thon
  • To use it on your own, make a copy of the spreadsheet above, and replace the first column with your names

Why

Pasting a list of names into the first column returns a lot of information from API calls, without the user needing to know how to write or run code. Once the raw data comes back, a number of reports can be generated on different sheets within the larger spreadsheet.

  • Analysis page – This shows the general article health with a histogram, and also shows which articles are the most popular by monthly traffic
  • Headshots page – This shows all the entries with a small thumbnail of the image stored in Wikidata (P18)
  • Headshots priority page – This page sorts the headshots from the most popular articles (by traffic) to the least popular. This way the user can quickly scan for photos that are missing or of poor quality and would have high impact if replaced.
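
The ordering behind the headshots priority page can be sketched in a few lines. This is only an illustration under stated assumptions: the row shape (`name`, `monthlyViews`, `imageFile`), the function names, and the sample values are all made up for the example, and the spreadsheet itself may well do this with built-in sorting rather than script. The thumbnail URL relies on the Commons `Special:FilePath` redirect with a `width` parameter.

```javascript
// Hypothetical row shape: { name, monthlyViews, imageFile }, where imageFile
// is the Commons file name from Wikidata P18, or "" when there is none.

// Thumbnail URL via the Commons Special:FilePath redirect.
function thumbnailUrl(imageFile, width) {
  return "https://commons.wikimedia.org/wiki/Special:FilePath/" +
    encodeURIComponent(imageFile) + "?width=" + width;
}

// Sort a copy of the rows from most to least viewed and flag missing photos.
function headshotPriority(rows) {
  return rows
    .slice() // leave the caller's array untouched
    .sort((a, b) => b.monthlyViews - a.monthlyViews)
    .map((row) => ({
      name: row.name,
      monthlyViews: row.monthlyViews,
      needsPhoto: !row.imageFile,
      thumbnail: row.imageFile ? thumbnailUrl(row.imageFile, 100) : "",
    }));
}

// Made-up sample values, for illustration only.
const ranked = headshotPriority([
  { name: "Mary Brown (author)", monthlyViews: 120, imageFile: "" },
  { name: "Marie Curie", monthlyViews: 90000, imageFile: "Example portrait.jpg" },
]);
```

Rows with `needsPhoto` set and a high `monthlyViews` value are the ones a photographer would likely want to look at first.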

How

It uses Google Apps Script to collect information about Wikipedia article quality and completeness.

The Google spreadsheet uses the excellent add-on Wikipedia and Wikidata Tools (GitHub), in addition to some custom functions that work with ORES for article quality predictions via AI. The hope is that we can document these, industrialize them for broader use, and use them for other projects.

Among the functions:

 function WIKIREVID(article)                      Given a Wikipedia article, get the latest revid, which is needed for ORES
 function WIKIORES(revid, targetwiki)             Given a revid, get the ORES score from the wp10 (article quality) model
 function WIKIDATADESC(qid, opt_targetLanguages)  Given a Wikidata qid, return the description field
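
To give a feel for what functions like these involve, here is a minimal sketch of the URL building and response parsing for each of the three lookups. It is not the tool's source code. All helper names below are hypothetical, the ORES v3 endpoint and response layout are assumptions that should be checked against the live service before use, and the description helper takes a single language code rather than the list that `opt_targetLanguages` suggests.

```javascript
// Hypothetical helpers; not the tool's actual source.

// Latest revision ID: MediaWiki Action API, prop=revisions with rvprop=ids.
function buildRevidUrl(article) {
  return "https://en.wikipedia.org/w/api.php?action=query&prop=revisions" +
    "&rvprop=ids&format=json&titles=" + encodeURIComponent(article);
}

function extractRevid(response) {
  const pages = (response.query && response.query.pages) || {};
  const page = Object.values(pages)[0];
  const revisions = (page && page.revisions) || [];
  return revisions.length > 0 ? revisions[0].revid : null;
}

// ORES score for one revision (assumed v3 layout: /scores/<wiki>/<revid>/<model>).
function buildOresUrl(revid, targetwiki, model) {
  return "https://ores.wikimedia.org/v3/scores/" +
    targetwiki + "/" + revid + "/" + model;
}

function extractOresPrediction(response, revid, targetwiki, model) {
  const scores = response[targetwiki] && response[targetwiki].scores;
  const entry = scores && scores[String(revid)] && scores[String(revid)][model];
  return entry && entry.score ? entry.score.prediction : null;
}

// Wikidata description: wbgetentities restricted to descriptions.
function buildDescUrl(qid, language) {
  return "https://www.wikidata.org/w/api.php?action=wbgetentities" +
    "&props=descriptions&format=json&ids=" + encodeURIComponent(qid) +
    "&languages=" + encodeURIComponent(language);
}

function extractDescription(response, qid, language) {
  const entity = response.entities && response.entities[qid];
  const desc = entity && entity.descriptions && entity.descriptions[language];
  return desc ? desc.value : null;
}

// Apps Script wrapper (assumption: UrlFetchApp is available in Google Sheets).
function fetchJsonSketch(url) {
  return JSON.parse(UrlFetchApp.fetch(url).getContentText());
}
```

Inside Apps Script, a custom function would chain these, for example `extractRevid(fetchJsonSketch(buildRevidUrl(article)))`, and pass the resulting revid to the ORES helpers with `"enwiki"` as the target wiki and `"wp10"` as the model name.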

Users

If you use the tool, or variants of this, do let us know!

  • Smithsonian American Women's History Initiative - used for edit-a-thon planning and checking worklists of women's biographies
    • Ada Lovelace edit-a-thon with National Air and Space Museum 2019 (Google Sheet)
  • WikiPortraits - used to research filmmakers or actors at film festivals, or authors at book festivals, to create a plan of action for taking portrait photos.