Research:Curation workflows on Wikimedia Commons

From Meta, a Wikimedia project coordination wiki
21:28, 7 September 2017 (UTC)
Duration: January 2018 to March 2018

This page documents a research project in progress.
Information may be incomplete and change as the project progresses.

The Structured Data on Commons project will fundamentally change how media metadata are entered, stored, and discovered on Wikimedia Commons.

This research project seeks to understand the current workflows of Commons contributors who curate media (categorize it, delete it, link to it from other projects, etc.). The aim is to identify opportunities to support these workflows better with new software or new metadata, and to avoid disrupting critical workflows during the transition to storing most or all existing metadata in Wikibase.


The primary goal of this project is to figure out how structured data-based functionality can support editors who are doing important work to:

A. improve Commons itself as a media repository

B. improve integration between Commons and other projects such as Wikidata or the Wikipedias

C. improve individual media files/pages, or collections of media files within Commons

In addition to figuring out how the Structured Data on Commons project can build new features that make these editors more effective or their work easier, we also need, just as importantly, to identify key workflows that we should avoid interfering with or breaking when we roll out new functionality.

What do we mean by curation?

At the start of the project, we are using the term 'curation' very generally. It refers to any work on Commons that involves editing existing media, metadata, and tools (e.g. bots, gadgets). It is meant to exclude one major activity: uploading new media files and working with their metadata, which was addressed in a previous study. It also (tentatively) excludes building and maintaining some supporting content (e.g. process documentation, policies) and general communication and collaboration (e.g. Village Pump, working in WikiProjects), unless those activities are shown to be directly relevant to our central curation focus.

We are very interested to know whether this definition aligns with community members' understanding of what 'curation' means in the context of Commons, and whether there are other terms (or meaningful sub-divisions under the umbrella of 'curation work') that are important to capture or examine more deeply.

Examples of curation workflows

  • Deletion requests and file deletion
  • Patrolling for unsuitable content (e.g. copyright violations)
  • Anti-vandalism patrolling
  • Categorizing media
  • Improving media metadata
  • Building and maintaining templates and categories
  • Building and maintaining curation tools


We will conduct semi-structured interviews with Commons editors who are involved in curation activities.


  • January 2018: scope project, identify potential interview participants
  • February 2018: develop interview protocol, begin interviews
  • March 2018: complete interviews and publish results

Policy, Ethics and Human Subjects Research

Interviews will be conducted and data stored in accordance with Wikimedia's guidelines for research consent, data access, and data retention.



See also