Research:The Role of Images for Knowledge Understanding

12:04, 20 March 2020 (UTC)
Katharina Reinecke
Duration:  2020-April – 2021-
This page documents a completed research project.

In this project, we study the importance of images in knowledge understanding through Wikipedia. With a set of large-scale user studies, we will quantify and qualify how readers around the world rely on images when navigating knowledge in Wikipedia. The final goal of this project is to generate insights regarding how much readers use images (intentionally or unintentionally) for learning when reading Wikipedia, and to analyze the layouts and the types of images which are more useful for knowledge understanding in Wikipedia, across different article topics.


Wikimedia Research’s Knowledge Gaps white paper underlines the need for methods to quantify the global usage of multimedia content in Wikimedia spaces. As part of this strategic direction, we want to conduct research on the role of images in knowledge understanding and learning through Wikipedia. Understanding these aspects is crucial to enable future work in the area of machine learning for image search and classification. While both the Wikimedia Research team and the WMF organization are planning to deploy techniques to improve discoverability of images and pictorial representations of articles, we actually do not know the extent to which readers use images when navigating the site, and the role visual content plays in free knowledge diffusion around the world.

Research Questions

  • Do images help readers learn or understand better what they are reading?
  • Which images are more/less useful for this task, and where do they appear in the text?


To answer these research questions, we will design a large-scale user study.

Task. In this study, participants are asked to read a Wikipedia article and then answer a set of multiple-choice questions about its content. After each article, an “accuracy” score reflecting how much of the article the participant understood is calculated from the answers to the multiple-choice questions.
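The page does not give the exact formula for the accuracy score. As a minimal sketch, assuming the score is simply the fraction of multiple-choice questions answered correctly (with unanswered questions counted as wrong), it could be computed as follows. The function name, question IDs, and answers are hypothetical and serve only as illustration.

```python
from typing import Mapping


def accuracy_score(answers: Mapping[str, str], answer_key: Mapping[str, str]) -> float:
    """Return the fraction of questions in answer_key answered correctly.

    Assumption: unanswered questions count as incorrect. The study's
    actual scoring rule is not specified on this page.
    """
    if not answer_key:
        raise ValueError("answer_key must contain at least one question")
    correct = sum(
        1 for question_id, right in answer_key.items()
        if answers.get(question_id) == right
    )
    return correct / len(answer_key)


# Hypothetical answer key and responses for one article.
answer_key = {"q1": "b", "q2": "a", "q3": "d", "q4": "c"}
answers = {"q1": "b", "q2": "c", "q3": "d"}  # q2 is wrong, q4 is unanswered

score = accuracy_score(answers, answer_key)
```

A per-article score of this kind lies between 0 and 1, so it can be compared across articles with different numbers of questions.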

Materials. We created a dataset of 470 reading comprehension questions that test the following aspects:

  1. Image Recognition Questions – to test whether readers learn to recognize a new image of a target concept (e.g., "Is this a tiger lily?").
  2. Visual Knowledge Questions – to test whether readers learn visual properties of the concept (e.g., "What color are tiger lilies?").
  3. General Knowledge Questions – to test whether readers learn general knowledge about the concept (e.g., "Where are tiger lilies found?").

Interventions. Articles are artificially manipulated to add or remove images, and the effect of these interventions on the accuracy score is recorded. Articles from different topics are used.
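One way to set up such an intervention is to decide at random, per article, whether a participant sees the version with or without the image. The sketch below assumes an independent 50/50 assignment for each article; the function name, article titles, and the optional seed are hypothetical, and the study's actual randomization procedure may differ.

```python
import random
from typing import Dict, Iterable, Optional


def assign_image_conditions(
    article_ids: Iterable[str], seed: Optional[int] = None
) -> Dict[str, bool]:
    """Map each article to True (show image) or False (hide image).

    Assumption: each article is assigned independently with probability 0.5.
    Passing a seed makes the assignment reproducible.
    """
    rng = random.Random(seed)
    return {article_id: rng.random() < 0.5 for article_id in article_ids}


# Hypothetical set of articles shown to one participant.
articles = ["Tiger lily", "Eiffel Tower", "Red panda"]
conditions = assign_image_conditions(articles, seed=42)
```

Recording the assigned condition alongside each accuracy score is what later allows scores to be compared between the with-image and without-image versions.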

Platform. We ran the study on the LabintheWild platform. LabintheWild seemed like an ideal setting for testing tasks on understanding and learning with images in Wikipedia for the following reasons: (1) it has a diverse audience of people who speak many different languages and come from ~40 countries, and it allows localization and translation into multiple languages; (2) its experiments are "gamified" by nature: contributors are not paid, and their reward is feedback on a skill and on how they compare to others; (3) it is open (people do not have to sign up or register).

Policy, Ethics and Human Subjects Research

An umbrella IRB approval is available for LabintheWild studies that are of similar nature, collect similar demographics, and carry minimal risk to participants. For this study, we will have to get approval for the informed consent form that participants see at the start of the study as a modification to the umbrella IRB.


Using data collected from the large-scale user study, we were able to identify factors that contribute to participant accuracy for specific question types. During the study, participants were asked image recognition, visual knowledge, and general knowledge questions about the content of the articles they saw. For each article, we randomly chose whether to show them the corresponding Wikipedia article's infographic image. Using R libraries, we ran mixed-effect models for each question type.

  • Overall, images do not help with general knowledge questions. For this type of question, the time participants spent viewing the article before answering, along with their education level and English proficiency, had a significant impact on their accuracy. Upon manual inspection, however, we found that images help in retaining some forms of general knowledge, especially when the questions were about artistic styles or epochs of paintings or monuments.
  • Surprisingly, images generally did not help with visual knowledge questions. For these questions, both article time and English proficiency had a significant impact. In our follow-up analysis, we found that when images are consistent with their visual knowledge questions, answers are significantly more accurate than when images are inconsistent with the question.
  • As expected, images made a significant difference for image recognition questions. For these types of questions, time and article time also came out to be significant.

Besides identifying which variables affect participants' ability to retain certain article knowledge, we also contributed the dataset of articles and questions used in this study.

On Phabricator

Follow this task.

Future Work and Research questions

  • For which kinds of tasks/articles are images more useful?
  • What types of images or visualizations (infographics, flowcharts, pictures, 3D images, moving images, graphs, etc.) are useful in this context, i.e., for which tasks, articles, or fields?
  • Are images more or less useful for different demographics?
  • What tasks or objectives do images and visualizations accomplish in information dissemination, attention, depth of knowledge gained, and the overall reader experience?