Wiki-M3L: Wikipedia and Multi-Modal & Multi-Lingual Research

Co-located with ICLR 2022

This is the website for the Wiki-M3L: Wikipedia and Multi-Modal & Multi-Lingual Research workshop, which will take place virtually on the 29th of April 2022 as part of the International Conference on Learning Representations (ICLR) 2022. Here you can find all important dates, the call for papers, the list of invited speakers, and the program of the workshop.

Motivation

A space for the Wikipedia community and the multimodal & multilingual research community to share and support each other.

In the broader AI research community, Wikipedia data has been used for many years as part of the training datasets for (multilingual) language models like BERT. However, its content is still a largely untapped resource for vision and multimodal learning systems. We believe that Wikipedia, as a huge source of multimodal and multilingual human knowledge on almost every subject (English Wikipedia contains more than 5M images, with captions and structured related text, while there exist 35 languages with more than 100K images), could help widen the scope and improve the accuracy and inclusiveness of multimodal models and their applications.

At the same time, the Wikimedia community of volunteers could greatly benefit from recent progress in multimodal AI technologies, especially in cases where resources are scarce, such as low-resource languages. Most research using Wikipedia data does not currently take into account the needs of the community that creates that data.

With this workshop we are trying to create a bridge between the Wikipedia communities and the ICLR research communities. This would not only ensure the reuse of research findings, but also create an incentive for Wikipedians to work more closely with researchers on data creation. Researchers in the ICLR disciplines can find in the Wikimedia community a source of inspiration for new challenging tasks and research problems that can support free knowledge and its dissemination. Learning from these often unheard voices would allow researchers to contribute back to Wikipedia by developing tools that help editors semi-automatically produce, curate, and modify content. Suggesting relevant images for articles in multiple languages, together with an appropriate caption, is one such example among many.

The Wikipedia Image/Caption Matching Competition

Besides invited talks and panel discussions, our workshop will feature a session on the top solutions of the Wikipedia Image/Caption Matching Competition, a large-scale challenge on multilingual, multimodal image-text retrieval that is currently ongoing. Using the publicly available Wikipedia-based Image Text (WIT) dataset (Srinivasan et al., 2021), which contains 37 million image-text sets across 108 languages, we will be setting up a benchmark task with a disaggregated set of performance, fairness, and efficiency metrics.

More information and data can be found in the corresponding Wikimedia blog post and Kaggle page.

Important Dates

Paper submission deadline: ~~25 February 2022~~ 16 March 2022

Accept/Reject Notification Date: 25 March 2022

Workshop: 29 April 2022 (virtual)

Organizers

Miriam Redi (Wikimedia Foundation)
Diane Larlus (NAVER LABS Europe)
Krishna Srinivasan (Google Research)
Yannis Kalantidis (NAVER LABS Europe)
Tiziano Piccardi (EPFL)
Lucie-Aimée Kaffee (University of Copenhagen)
Stéphane Clinchant (NAVER LABS Europe)
Yacine Jernite (Hugging Face)