Research:Investigating Wikipedia's role as a gateway to medical content

This page documents a completed research project.

Key Personnel

  • Lauren Maggio
  • Ryan Steinberg
  • Daniel Mietchen
  • John Willinsky
  • Joe Wass
  • Todd Leroux

Project Summary

This project seeks to understand the extent to which external links in Wikipedia entries serve as a gateway for readers to more advanced forms of learning and inquiry. More specifically, it investigates Wikipedia's gateway function in relation to its medical content, as defined by WP:MED.


For this project we hold the following assumptions to be true:

  1. The placement of external links (e.g., to a journal article) in a Wikipedia article serves a number of purposes, including (a) indicating the article’s general verifiability, (b) validating specific information, (c) identifying source details and type, and (d) providing a gateway to source materials.
  2. Wikipedia can serve as a gateway to more advanced forms of learning and inquiry when readers “use” or click external links; this use may be associated with differences in the links’ source type and accessibility, placement, and composition; this use may also differ by WikiProject domain.
  3. The gateway effect is of particular relevance and value to readers with an interest in going more deeply into a topic, including those using Wikipedia within the scope of their professional practice (e.g., physicians, pharmacists, teachers).


In conceptualizing this project, we generated the following hypotheses:

Effect of article properties
H1: Usage of references by readers can be discriminated based on the topic of the article they are reading.

  • H1.1: Articles in WP:Med have more links per article than articles in Wikipedia as a whole.
  • H1.2: Readers click a greater proportion of the links (per time unit, reader, and article) in WP:Med articles than in Wikipedia as a whole.
  • H1.3: Readers click a greater proportion of the “reference” links (per time unit, reader, and article) in WP:Med articles than in Wikipedia as a whole.
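
The comparisons in H1.1 and H1.2 can be illustrated with a minimal sketch. The record layout (`ArticleStats`), the function names, and every number below are hypothetical and chosen only for illustration; they are not the project's data or its actual analysis code.

```python
from dataclasses import dataclass


@dataclass
class ArticleStats:
    """Hypothetical per-article record for one observation window."""
    title: str
    in_wp_med: bool
    external_links: int   # external links present in the article
    links_clicked: int    # distinct links clicked at least once


def mean_links_per_article(articles):
    """Average number of external links per article (relates to H1.1)."""
    return sum(a.external_links for a in articles) / len(articles)


def clicked_proportion(articles):
    """Share of all links in the group that were clicked (relates to H1.2)."""
    total_links = sum(a.external_links for a in articles)
    total_clicked = sum(a.links_clicked for a in articles)
    return total_clicked / total_links if total_links else 0.0


# Made-up sample: two WP:Med articles and two other articles.
sample = [
    ArticleStats("Hypothetical condition A", True, 40, 10),
    ArticleStats("Hypothetical drug B", True, 60, 10),
    ArticleStats("Hypothetical town C", False, 10, 1),
    ArticleStats("Hypothetical film D", False, 30, 3),
]
wp_med = [a for a in sample if a.in_wp_med]

print("links/article, WP:Med:", mean_links_per_article(wp_med))
print("links/article, all:   ", mean_links_per_article(sample))
print("clicked share, WP:Med:", clicked_proportion(wp_med))
print("clicked share, all:   ", clicked_proportion(sample))
```

The sketch pools links across a group before dividing. Averaging per-article proportions instead would weight short and long articles equally, which is a different design choice.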

H2: Readers tend to click references more on a WP:Med article when the overall quality of the article is low.

H3: Articles within the same topic that are spiking in popularity tend to see fewer clicks on external links than matching articles that are not trending.

H4: Readers tend to use WP:Med references more when an article is rated as “controversial”.

H5: Readers of WP:Med articles click references to patient-centered outcomes research more frequently than references that are not patient-centered outcomes research.

Effect of reader needs
H6: Usage of references is more strongly predicted by the readers’ information needs (e.g., quick fact lookup vs an assignment or intrinsic learning motivation) than by the topic of the article.

  • H6.1: Readers of WP:Med articles on medical conditions (which are articles that have Diagnosis and Treatment sections) click links more frequently in those Diagnosis and Treatment sections than in the other article sections.
  • H6.2: Readers of WP:Med articles on medical conditions click “references” links more frequently in Diagnosis and Treatment sections than in other article sections.

Effect of reference properties
H7: The perceived accessibility of a reference (e.g., a PDF link) predicts whether a link will be clicked more strongly than the category the source belongs to (scholarly vs. non-scholarly).

H8: The context or position in which a link occurs (the reference section, an inline reference, the infobox, the article lead, the depth in the page) is a weaker predictor of click-through behavior than the topic of the article.


Upon receiving the data from Wikimedia, our multidisciplinary team will review and clean the data in preparation for analysis. Once cleaned, we will analyze the data using descriptive statistics to determine the extent of the use of reference links at the page level and within the larger data set.
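
As a rough illustration of the clean-then-describe workflow outlined above, the following sketch drops incomplete click events and then summarizes clicks at the page level and across the data set. The event fields (`page`, `link`), the cleaning rule, and the sample values are assumptions made for this example; the format of the data supplied by Wikimedia is not described here.

```python
import statistics
from collections import Counter

# Hypothetical raw click events; real field names and values may differ.
raw_events = [
    {"page": "Page A", "link": "https://example.org/1"},
    {"page": "Page A", "link": "https://example.org/2"},
    {"page": "Page B", "link": "https://example.org/3"},
    {"page": "", "link": "https://example.org/4"},   # missing page: dropped
    {"page": "Page C", "link": None},                # missing link: dropped
    {"page": "Page A", "link": "https://example.org/1"},
]


def clean(events):
    """Keep only events that name both a page and a link (assumed rule)."""
    return [e for e in events if e.get("page") and e.get("link")]


def clicks_per_page(events):
    """Page-level use: number of reference-link clicks per page."""
    return Counter(e["page"] for e in events)


def summarize(per_page):
    """Descriptive statistics across all pages in the data set."""
    counts = list(per_page.values())
    return {
        "pages": len(counts),
        "total_clicks": sum(counts),
        "mean": statistics.mean(counts),
        "median": statistics.median(counts),
    }


per_page = clicks_per_page(clean(raw_events))
summary = summarize(per_page)
print(dict(per_page))
print(summary)
```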

Upon completing the analysis, we will consider our findings in light of existing research on reference use in summary resources and, if warranted, propose modifications to citation practices in Wikipedia informed by user interface design and instructional theory.


This work is now published as "Meta-Research: Reader engagement with medical content on Wikipedia" in eLife, an open access journal, under a CC BY license.

The code utilized to collect and analyze the data for this project is organized and made publicly available in a collated series of Jupyter notebooks at:

Wikimedia Policies, Ethics, and Human Subjects Protection

This research project has not been vetted by a university institutional review board, as it does not include human subjects.

Benefits for the Wikimedia community

We hope that our project will benefit the Wikimedia community in several ways. First, it should provide a measure of Wikipedia citation use that can serve as a baseline for future research and highlight the value of Wikipedia's references as a gateway to primary literature. Second, if warranted, our findings could inform educational initiatives to improve citation practices in Wikipedia. Finally, if Wikimedia allows the data generated to be shared, other researchers would have the opportunity to generate and explore their own queries related to citation use.


November 2018
Data collection

December 2018 - February 2019
Clean and begin analyzing data

February 2019
Begin writing a related manuscript

April 2019
Simultaneously, deposit a preprint of the article and submit a manuscript to a peer-reviewed, open access journal
Deposit all resulting data and code to an open repository
Post a summary of our findings and potential implications for Wikipedia citation practice to this wiki page

Throughout this timeframe we will post our progress to this wiki page.


This project is not supported by a grant or other form of institutional support.



See also