Research:The Role of Images for Knowledge Understanding
In this project, we study the importance of images in knowledge understanding through Wikipedia. With a set of large-scale user studies, we will quantify and qualify how readers around the world rely on images when navigating knowledge in Wikipedia. The final goal of this project is to generate insights into how much readers from different demographics use images (intentionally or unintentionally) when reading Wikipedia, and to analyze which layouts and types of images are most useful for knowledge understanding in Wikipedia, across different article topics.
Wikimedia Research’s Knowledge Gaps white paper underlines the need for methods to quantify the global usage of multimedia content in Wikimedia spaces. As part of this strategic direction, we want to conduct research on the role of images in knowledge understanding and learning through Wikipedia. Understanding these aspects is crucial to enable future work in the area of machine learning for image search and classification. While both the Wikimedia Research team and the WMF organization are planning to deploy techniques to improve discoverability of images and pictorial representations of articles, we do not yet know the extent to which readers use images when navigating the site, or the role visual content plays in free knowledge diffusion around the world.
- Do images help readers learn or understand better what they are reading?
- For which kinds of tasks/articles are images more useful?
- For which cultures/countries are images useful, and does learning through images change if the reader is a native speaker of the language of the text?
- Which images are more/less useful for this task, and where do they appear in the text?
- Are images more/less useful for different demographics?
To answer these research questions, we will design a large-scale user study.
Task. In this study, participants will be asked to read a Wikipedia article and then answer a set of multiple-choice questions about its content. Questions could also be given to participants beforehand, in a “scavenger hunt”-type experiment. After each article, an “accuracy” score reflecting how well the participant understood the article will be calculated from the answers to the multiple-choice questions.
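The exact scoring rule is not specified here. As a minimal sketch, assuming the accuracy score is simply the fraction of questions answered correctly (with unanswered questions counted as wrong), it could be computed as follows. The function name `accuracy_score` and the question identifiers are hypothetical and used only for illustration.

```python
from typing import Mapping


def accuracy_score(answers: Mapping[str, str], answer_key: Mapping[str, str]) -> float:
    """Return the fraction of questions in answer_key answered correctly.

    Assumption (not from the study design): every question has equal weight,
    and a question the participant skipped counts as incorrect.
    """
    if not answer_key:
        raise ValueError("answer_key must contain at least one question")
    correct = sum(
        1
        for question_id, right_choice in answer_key.items()
        if answers.get(question_id) == right_choice
    )
    return correct / len(answer_key)


# Hypothetical example: four questions, one wrong and one skipped.
key = {"q1": "a", "q2": "c", "q3": "b", "q4": "d"}
given = {"q1": "a", "q2": "b", "q3": "b"}
score = accuracy_score(given, key)  # 2 correct out of 4
```

Other scoring rules (for example, weighting questions by difficulty or correcting for guessing) would fit the same interface.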
Interventions. Articles will be artificially manipulated to add/remove/displace images, and the effect of such interventions on the accuracy score will be recorded. Articles from different topics will be used, to understand how the importance of images varies across different areas of knowledge.
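One simple way to summarize the effect of an intervention, sketched here under assumptions that are not part of the study design, is to compare the mean accuracy score in each manipulated condition with the mean for the unmodified article. The condition labels (`"original"`, `"images_removed"`, and so on) and the function names are hypothetical, and a real analysis would likely also control for topic, language, and participant demographics.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, Tuple


def mean_accuracy_by_condition(records: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Group (condition, accuracy) records and average accuracy per condition."""
    grouped = defaultdict(list)
    for condition, score in records:
        grouped[condition].append(score)
    return {condition: mean(scores) for condition, scores in grouped.items()}


def effect_vs_baseline(
    records: Iterable[Tuple[str, float]], baseline: str = "original"
) -> Dict[str, float]:
    """Difference in mean accuracy between each condition and the baseline.

    A negative value means participants scored lower, on average, under that
    intervention than on the unmodified article.
    """
    means = mean_accuracy_by_condition(records)
    if baseline not in means:
        raise ValueError(f"no records for baseline condition {baseline!r}")
    return {c: m - means[baseline] for c, m in means.items() if c != baseline}


# Hypothetical records: (condition label, accuracy score for one participant).
records = [
    ("original", 0.75),
    ("original", 0.25),
    ("images_removed", 0.5),
    ("images_removed", 0.0),
]
effects = effect_vs_baseline(records)
```

A raw difference in means is only a descriptive summary; whether an observed difference is meaningful would need a statistical test suited to the final study design.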
Platform. We will run the study on the LabintheWild platform. LabintheWild seems like an ideal setting to test tasks on understanding/learning with images in Wikipedia, for the following reasons: (1) It has a diverse audience of people who speak many different languages and come from ~40 countries, and it allows localization and translation into multiple languages; (2) Its experiments are "gamified" by nature: contributors are not paid; instead, their reward is feedback on a skill and on how they compare to others; (3) It is open (people do not have to sign up or register).
Policy, Ethics and Human Subjects Research
An umbrella IRB approval is available for LabintheWild studies that are of a similar nature, collect similar demographics, and carry minimal risk to participants. For this study, we will need approval, as a modification to the umbrella IRB, for the informed consent form that participants see at the start of the study.