Research:Supporting Commons contribution by GLAM institutions

From Meta, a Wikimedia project coordination wiki
22:59, 5 July 2017 (UTC)
Duration:  2017-July — 2017-October

This page documents a research project in progress.
Information may be incomplete and change as the project progresses.

Wikimedia Commons is the world’s largest free-licensed media repository, with over 40 million image, audio, and video files. Commons media can be contributed by anyone and are curated by volunteers. The MediaWiki software platform that Commons is built on was designed to host text, not rich media. This creates challenges for everyone who uses Wikimedia Commons: media contributors, curators, and everyone who uses Commons media on Wikimedia projects and beyond. For example, the unstructured nature of Commons makes it difficult to capture important metadata for media files, to upload them, and to search for the specific files you want to use.

The Wikimedia Foundation has launched a multi-year program to build a structured data layer into Commons, based on the Wikibase platform. This page describes the research component of that project, which seeks to understand the impact of structured data integration on one set of important Commons stakeholders: GLAM institutions.


GLAM (“Galleries, Libraries, Archives, and Museums”) is an established Wikimedia program that seeks to elicit contributions of free-licensed content from public and private institutions around the world that curate cultural knowledge resources. One particularly valuable contribution that GLAM institutions make to the Wikimedia movement is uploading media from their collections to Wikimedia Commons. These media include historical photography, video, and audio recordings, as well as images of artifacts, artwork, and other cultural goods.

GLAM institutions have communicated that the current tools available for uploading media (often in batches of thousands of files and hundreds of thousands of gigabytes) are not well suited to their needs. For example, these media often have associated metadata (e.g. date of production, authorship, institutional source, and license information) that may be lost or mangled during the upload process. Curation is also a challenge: GLAM institutions often have a desire or an institutional mandate to track the usage of the files they donate. The lack of structure within the Commons repository and the lack of tools for tracking the usage of GLAM content within Commons make this difficult.

In subsequent phases of this research project, we will focus our investigation on another key stakeholder group for the Commons structured data program: volunteers who curate media on Wikimedia Commons. That work will follow up on existing research by Wikimedia Deutschland on heavy Commons users.


This research project will use semi-structured interviews with members of GLAM institutions and Wikimedia community members involved in the GLAM program to understand goals, motivations, and current workflows related to the contribution and curation of GLAM content. Interview data will inform the development of secondary research artifacts (personas, scenarios, and user requirements), which will in turn guide the design of tools and features to support structured data contributions by GLAM institutions to Wikimedia Commons.


  • July-August: develop interview protocol; conduct first set of interviews; present initial findings at Structured Data offsite (Montreal, August 15-16)
  • August-September: conduct second set of interviews
  • October: finalize personas, scenarios, and requirements; present findings

Policy, Ethics and Human Subjects Research

Interviews will be conducted and data stored in accordance with Wikimedia's guidelines for research consent, data access, and data retention.

