GLAMTLV2018/Submissions/OpenRefine workshop

From Meta, a Wikimedia project coordination wiki
Type of session

Workshop

Length of session

60 minutes

Ideal number of attendees

20-30

Etherpad

Abstract

A very hands-on introduction to the power tool OpenRefine, with a focus on data manipulation, Wikidata reconciliation, and Wikidata editing.

OpenRefine is a desktop application for advanced data manipulation. Since version 3.0, it also includes built-in Wikidata features. This makes OpenRefine a powerful tool for importing data, for matching lists of terms (for instance names of people, places or concepts) with the corresponding Wikidata Q numbers, and for making batch edits on Wikidata. It does not require programming skills, but some experience with spreadsheets is helpful.

Outline

The workshop will try to cover as many of the following aspects of OpenRefine as possible, with an emphasis on Wikidata editing:

  • Getting started with the software
    • Install OpenRefine
    • Running it in your browser
  • Create or open a data project in OpenRefine
    • Projects, tags
    • Create a new project
    • The different formats you can import
  • Exploring data
    • Columns
    • Rows and records
    • Sorting
    • Facets
  • Modifying data
    • The cleanup and modification options in the menus
    • Basic splitting in columns
    • Some basic operations with GREL (General Refine Expression Language)
  • Made a mistake?
    • Project history and undoing steps
  • Wikidata magic!
    • Wikidata reconciliation
      • And extracting Q numbers with GREL
    • Drawing extra data from Wikidata ('data augmentation')
    • Batch edits to Wikidata with OpenRefine
      • Directly from OpenRefine
      • Or do an export to the QuickStatements format
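
The last two outline items (extracting Q numbers and exporting to QuickStatements) can be illustrated outside OpenRefine with a minimal Python sketch. It is not how OpenRefine implements these steps; inside the tool, the Q number of a matched cell is typically read with a GREL expression such as cell.recon.match.id. The function names extract_qid and to_quickstatements and the sample rows are hypothetical, chosen only for illustration. The sketch assumes the tab-separated QuickStatements V1 layout (item, property, value).

```python
import re

# Assumed pattern for a Wikidata item identifier: "Q" followed by digits, e.g. Q42.
QID_PATTERN = re.compile(r"\bQ[1-9]\d*\b")


def extract_qid(value):
    """Return the first Q number found in a string, or None if there is none."""
    match = QID_PATTERN.search(value)
    return match.group(0) if match else None


def to_quickstatements(rows, property_id):
    """Build tab-separated QuickStatements (V1) lines: item, property, value."""
    lines = []
    for subject, target in rows:
        subject_qid = extract_qid(subject)
        target_qid = extract_qid(target)
        # Skip rows where either side has no Q number (e.g. not reconciled).
        if subject_qid and target_qid:
            lines.append(f"{subject_qid}\t{property_id}\t{target_qid}")
    return lines


# Hypothetical sample data: a reconciled entity URL paired with a target item.
sample_rows = [
    ("https://www.wikidata.org/wiki/Q42", "http://www.wikidata.org/entity/Q5"),
    ("not reconciled", "Q5"),
]

# P31 is the Wikidata property "instance of".
statements = to_quickstatements(sample_rows, "P31")
print("\n".join(statements))
```

Rows without a Q number on both sides are dropped rather than guessed, which mirrors the usual practice of only uploading statements for cells that were actually matched during reconciliation.
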
What will attendees take away from this session?
  1. Participants will understand how OpenRefine works, and what it can and cannot do.
  2. Participants will be able to do basic data cleaning in OpenRefine, will know how Wikidata reconciliation works in the tool, and will know how to make (bulk) edits to Wikidata with it.
Slides or further information

OpenRefine documentation specifically focused on Wikidata: https://www.wikidata.org/wiki/Wikidata:Tools/OpenRefine

Special requests

For this workshop, you need:

Participants can also bring data that they are interested in integrating into Wikidata (for example a spreadsheet, data taken from an API, or a .csv file), but this is not mandatory. The workshop leads will also provide datasets to work with.

Workshop leads

Interested attendees
