Grants:Programs/Wikimedia Research Fund/Images as data: Advancing WikiCommons for visual research

status: not funded
Images as data: Advancing WikiCommons for visual research
start and end dates: Start date: 1 June 2022; End date: 1 June 2023
budget (USD): 40,000-50,000 USD
applicant(s): Karin De Wild, Lozana Rossenova



Applicant's Wikimedia username. If one is not provided, then the applicant's name will be provided for community review.

Karin De Wild, Lozana Rossenova

Project title

Images as data: Advancing WikiCommons for visual research

Entity Receiving Funds

Provide the name of the individual or organization that would receive the funds.

TIB (Germany), Leiden University (The Netherlands), Iconclass (The Netherlands)

Research proposal


Description of the proposed project, including aims and approach. Be sure to clearly state the problem, why it is important, why previous approaches (if any) have been insufficient, and your methods to address it.

The use of digital methods to study art and images is increasingly important across a wide range of research fields, from art history and digital humanities to computer vision. Wiki Commons gives researchers access to large quantities of images, which they can use to answer innovative research questions by detecting patterns and trends. Our aim is to implement tools that further develop the potential of this image database for art history and image studies.

To achieve this, it is essential to use controlled vocabularies to describe what is depicted in artworks and images. This will make it easier for researchers to find images, create datasets and detect patterns through comparisons, data analysis and/or AI applications. The most widely accepted controlled vocabulary for describing the subjects represented in images is Iconclass (CC0 license).

We therefore propose the following research question: How can Iconclass be implemented to advance the description and indexing of images within Wiki Commons?

This research project will test three approaches:

1. Annotations in Wiki Commons

We will map the most frequently used Iconclass notations to existing Wikidata Q-IDs and create new items when appropriate, using OpenRefine alongside manual curation and translation. This will allow us to then test and develop standard workflows for adding Iconclass annotations to images in Wiki Commons.
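As a rough illustration of the first step in this approach, the sketch below ranks notations by how often they occur and separates those that already have a Wikidata match from those that still need manual curation. It is not the project's actual workflow: the sample notation strings, the `known_qids` lookup, the placeholder Q-ID and the `plan_mapping` helper are all hypothetical and exist only for this example.

```python
from collections import Counter

# Hypothetical sample input: one Iconclass-style notation per existing annotation.
annotations = [
    "25F23(LION)", "11H(JEROME)", "25F23(LION)",
    "73D6", "25F23(LION)", "11H(JEROME)",
]

# Hypothetical lookup from notation to Wikidata Q-ID.
# The value is a placeholder, not a real Wikidata identifier.
known_qids = {"25F23(LION)": "Q_PLACEHOLDER_1"}


def plan_mapping(notations, known, top_n):
    """Rank notations by frequency and split the top_n into
    already-mapped notations and notations needing manual curation."""
    ranked = Counter(notations).most_common(top_n)
    mapped = {n: known[n] for n, _ in ranked if n in known}
    needs_curation = [n for n, _ in ranked if n not in known]
    return mapped, needs_curation


mapped, needs_curation = plan_mapping(annotations, known_qids, top_n=2)
print("already mapped:", mapped)
print("needs curation:", needs_curation)
```

In practice the frequency counts would come from existing annotated collections and the lookup from reconciliation in OpenRefine; the split simply shows where manual curation effort would be concentrated.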

2. Bulk image upload with annotations

Iconclass is widely used by museums, art institutions and online databases around the world. This approach will make it possible to upload, individually or in bulk, images that are already annotated with Iconclass concepts to Wiki Commons. We will develop an Iconclass reconciliation endpoint for OpenRefine and build on the work already done by the OpenRefine team to enable Structured Data on Commons (SDC) upload functionalities.
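To give a sense of what a reconciliation endpoint does, the sketch below handles a batch of queries in the general JSON shape that OpenRefine sends to reconciliation services and returns scored candidates. It is a minimal sketch under stated assumptions, not the planned implementation: the in-memory `VOCAB`, its labels, the `example.org` URIs, the candidate type and the scoring rule are all hypothetical, and the exact field requirements should be checked against the version of the Reconciliation Service API being targeted.

```python
import json

# Hypothetical in-memory vocabulary: notation -> illustrative label.
# A real service would query the full Iconclass data instead.
VOCAB = {
    "25F23(LION)": "lion",
    "25F23(TIGER)": "tiger",
}

# Service manifest; the URIs are placeholders, not real Iconclass namespaces.
MANIFEST = {
    "name": "Iconclass (illustrative sketch)",
    "identifierSpace": "https://example.org/iconclass/",
    "schemaSpace": "https://example.org/iconclass/schema/",
}

# Hypothetical candidate type attached to every result.
CONCEPT_TYPE = [{"id": "concept", "name": "Iconclass concept"}]


def reconcile(queries_json):
    """Answer a batch like {"q0": {"query": "lion"}} with scored candidates."""
    queries = json.loads(queries_json)
    response = {}
    for key, query in queries.items():
        text = query["query"].strip().lower()
        candidates = []
        for notation, label in VOCAB.items():
            if not text:
                continue  # an empty query matches nothing
            if text in (notation.lower(), label.lower()):
                score, match = 100.0, True   # exact notation or label
            elif text in label.lower():
                score, match = 50.0, False   # partial label match
            else:
                continue
            candidates.append({
                "id": notation, "name": label, "type": CONCEPT_TYPE,
                "score": score, "match": match,
            })
        candidates.sort(key=lambda c: c["score"], reverse=True)
        response[key] = {"result": candidates[: query.get("limit", 3)]}
    return json.dumps(response)


batch = json.dumps({"q0": {"query": "Lion"}, "q1": {"query": "ig"}})
print(reconcile(batch))
```

Wrapping `reconcile` and `MANIFEST` in a small web service would expose them to OpenRefine; the matching logic here is deliberately naive and stands in for a proper search over notations and multilingual labels.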

3. Annotations via the Iconclass browser

We will develop an extension for the Iconclass Browser website that will allow users to upload images with added Iconclass annotations directly to WikiCommons. The Iconclass Browser is a search tool that helps users find Iconclass concepts, understand their meanings and interrelationships, and see annotated examples. Describing images manually is time-consuming, so the ambition is to also develop auto-suggestions to assist users. A proof of concept will be released in Spring 2021.

Using participatory methods, we will engage Wikimedia user groups to extensively test and refine all approaches.


Approximate amount requested in USD.


Budget Description

Briefly describe what you expect to spend money on (specific budgets and details are not necessary at this time).

The budget covers salaries for 3 researchers, at 0.2 FTE for 1 year. This study takes an interdisciplinary approach, in which each researcher brings their own field of expertise. We will jointly carry out activities related to researching data modeling in Commons and Wikidata, user annotation workflows, technical development of tool extensions, community outreach and user testing, event organising and reporting. 10% of the budget will be allocated as contingency and/or travel expenses.


Address the impact and relevance to the Wikimedia projects, including the degree to which the research will address the 2030 Wikimedia Strategic Direction and/or support the work of Wikimedia user groups, affiliates, and developer communities. If your work relates to knowledge gaps, please directly relate it to the knowledge gaps taxonomy.

Our proposal addresses several areas on the Knowledge Gaps Taxonomy: The Structured data gap for Content (by enriching Commons and Wikidata with a structured vocabulary for image annotation); Language and Tech skills gaps for Contributors (by developing workflows that lower barriers to contributing structured data); Language and Information Depth for Readers (by enriching Commons with a vocabulary that is multilingual, and widely used for art historical study and analysis). Thus, we also contribute to the strategic goal of Wikimedia knowledge as a service. Multilingual, structured image annotations will improve discovery within Commons and enable sophisticated scholarly analysis, enhancing the experience of general users and scholars alike.


Plans for dissemination.

Supporting arts and humanities researchers and museums in fully engaging with Wiki Commons is fundamental to the research. The user-testing workshops will also provide a forum for knowledge exchange. Research findings will be presented at one or more Digital Humanities conferences and published in Digital Humanities Quarterly. In addition, we will provide regular reports, documentation and tutorials via relevant Wikimedia channels (Meta pages, WikiProject pages and Community mailing lists).

Past Contributions

Prior contributions to related academic and/or research projects and/or the Wikimedia and free culture communities. If you do not have prior experience, please explain your planned contributions.

Lozana Rossenova (User:Loz.ross) contributes to data modelling in Wikidata and Wikibase. She is a member of the Steering Committee of OpenRefine and design researcher for the Wikimedia-funded extension to OpenRefine for SDC. She has published Wikiversity instructions for OpenRefine, Wikidata and Wikibase.

Etienne Posthumus is the developer of the Iconclass Browser and of Arkyves, Reference Tool for the History of Culture. The latter is a rich database for thematic searches across cultural heritage collections.

Karin de Wild is a board member of Iconclass and has published widely on (analyzing) digital heritage collections, including the book ‘Museums and Digital Confidence’ (Routledge) and the chapter ‘Digital Collections’ (Bloomsbury).

I agree to license the information I entered in this form, excluding the pronouns, countries of residence, and email addresses, under the terms of Creative Commons Attribution-ShareAlike 4.0. I understand that the decision to fund this Research Fund application, the application itself, and all the information entered by me in this form, excluding the pronouns, countries of residence, and email addresses of the personnel, will be published on Wikimedia Foundation Funds pages on Meta-Wiki and will be made available to the public in perpetuity. To make the results of my research actionable and reusable by the Wikimedia volunteer communities, affiliates and Foundation, I agree that any output of my research will comply with the WMF Open Access Policy. I also confirm that I have read the privacy statement and agree to abide by the WMF Friendly Space Policy and Universal Code of Conduct.