Grants:Programs/Wikimedia Research Fund/Wikipedia as a tool for understanding contemporary science and the growth of knowledge

status: not funded
Wikipedia as a tool for understanding contemporary science and the growth of knowledge
start and end dates: May 2022-December 2022
budget (USD): 40,000-50,000 USD
applicant(s): Omer Benjakob



Applicant's Wikimedia username. If one is not provided, then the applicant's name will be provided for community review.

Omer Benjakob

Project title

Wikipedia as a tool for understanding contemporary science and the growth of knowledge

Entity Receiving Funds

Provide the name of the individual or organization that would receive the funds.

Center for Research and Interdisciplinarity (CRI), INSERM U1284, Université de Paris

Research proposal


Description of the proposed project, including aims and approach. Be sure to clearly state the problem, why it is important, why previous approaches (if any) have been insufficient, and your methods to address it.

The scientific process lays out a method for the incremental accumulation of knowledge, with paradigmatic shifts occurring gradually. Meanwhile, Wikipedia is the key node of access to knowledge that is usually locked behind paywalls or jargon. Alongside its public role as a gateway to science, Wikipedia can also facilitate research into science and its history. Understanding how science matures and how research translates into the public domain is key to addressing disinformation.

Liam Wyatt has suggested that Wikipedia (WP) articles, their edit histories, talk pages, traffic and editors can serve future historians. Describing Wikipedia as an “endless palimpsest”, he argues that its ever-changing text has the potential to serve as a primary source in its own right. Some have attempted, with limited success, to harness it to this end, for example by trying to use WP to map the history of knowledge since the dawn of man, applying an algorithmic approach to prove the ideas of Thomas Kuhn.

This proposal suggests the creation of a rigorous method for using WP for historical research. Our research uses WP's edit history to understand, map and model the growth of contemporary scientific knowledge. The research asks: how do paradigmatic shifts in science manifest on Wikipedia, what are their different stages, and how do they spread outwards, reshaping other bodies of knowledge?

Our latest study, an examination of CRISPR, was conducted in recent months as part of a research fellowship at the CRI research center in Paris, a research unit affiliated with the French National Institute of Health and Medical Research (INSERM). It has yielded a detailed overview of the field and its maturation. The study examined the growth of articles related to CRISPR, textual changes in them, the migration of sections to new articles, and a bibliometric analysis of their sources. On the basis of this study and our previous work, we claim that Wikipedia offers fertile ground for understanding the growth of science as well as cross-pollination between different fields.
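As an illustration of the kind of measurement described above, the growth of a single article can be approximated from its edit history. The following is a minimal sketch, not the method used in the CRISPR study: it assumes the public MediaWiki Action API on English Wikipedia, and the function names `fetch_revisions` and `yearly_growth`, the User-Agent string, and the sample data are hypothetical choices made for this example.

```python
import json
import urllib.parse
import urllib.request

# Assumption: English Wikipedia's public Action API endpoint.
API_URL = "https://en.wikipedia.org/w/api.php"


def fetch_revisions(title, batch_size=500):
    """Fetch timestamp and byte size for every revision of one article, oldest first."""
    params = {
        "action": "query",
        "format": "json",
        "formatversion": "2",
        "prop": "revisions",
        "titles": title,
        "rvprop": "timestamp|size",
        "rvdir": "newer",  # oldest revisions first
        "rvlimit": str(batch_size),
    }
    revisions = []
    while True:
        url = API_URL + "?" + urllib.parse.urlencode(params)
        # Hypothetical User-Agent; a real tool should identify itself properly.
        request = urllib.request.Request(
            url, headers={"User-Agent": "wiki-history-sketch/0.1"}
        )
        with urllib.request.urlopen(request) as response:
            data = json.load(response)
        page = data["query"]["pages"][0]
        revisions.extend(page.get("revisions", []))
        if "continue" not in data:
            return revisions
        # The API returns the parameters needed to request the next batch.
        params.update(data["continue"])


def yearly_growth(revisions):
    """Map each year to (number of edits, article size in bytes after the year's last edit).

    Expects revisions sorted oldest first, each with "timestamp" (ISO 8601) and "size".
    """
    growth = {}
    for rev in revisions:
        year = int(rev["timestamp"][:4])
        edit_count, _ = growth.get(year, (0, 0))
        growth[year] = (edit_count + 1, rev["size"])
    return growth


# Made-up sample data, used here so the example runs without network access.
sample = [
    {"timestamp": "2005-06-30T10:00:00Z", "size": 1200},
    {"timestamp": "2005-11-02T08:30:00Z", "size": 2500},
    {"timestamp": "2012-08-17T14:00:00Z", "size": 30000},
]
growth_by_year = yearly_growth(sample)
```

With network access, `yearly_growth(fetch_revisions("CRISPR"))` would produce the same kind of summary for a live article. Textual changes, section migration and bibliometric analysis would need the revision content itself and are outside the scope of this sketch.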

Our proposal suggests conducting a number of additional studies and developing research tools to facilitate wide-scale research of this type. The findings will provide rich data on how scientific knowledge accumulates and translates to the general public. As such, the project's findings and processes could aid scientists and policy makers and even facilitate education, all while increasing access to and knowledge of Wikipedia among researchers, students and even citizen scientists.


Approximate amount requested in USD.


Budget Description

Briefly describe what you expect to spend money on (specific budgets and details are not necessary at this time).

The budget will focus on creating the infrastructure needed to achieve the aims described above:
  • Research tools - development of a number of computational tools to support scaled research into knowledge growth (harnessing existing APIs and creating new ones)
  • A Wiki History Lab workshop bringing together experts from different domains (Wikipedians, historians, sociologists of knowledge, librarians/data scientists) for an intensive cross-discipline one-week workshop to create additional case studies


Address the impact and relevance to the Wikimedia projects, including the degree to which the research will address the 2030 Wikimedia Strategic Direction and/or support the work of Wikimedia user groups, affiliates, and developer communities. If your work relates to knowledge gaps, please directly relate it to the knowledge gaps taxonomy.

This project will impact multiple fields and support at least three of the 2030 goals: 1) Evaluate, Iterate, and Adapt; 2) Innovate in Free Knowledge; and 3) Manage Internal Knowledge. As a research project, it will create a large pool of academic case studies on Wikipedia while also facilitating access to and knowledge of both Wikipedia and science. It will allow students to learn about Wikipedia, be exposed to its processes, and create research that has merit for scientists, academics, educators and policy makers.

The educational aspect of this project will allow students and others to serve as citizen scientists, both learning about Wikipedia and creating research about knowledge growth on it.


Plans for dissemination.

Our Wiki History Lab, hosted by the CRI, will bring together researchers as well as other experts and stakeholders. The CRI will also facilitate the educational and citizen science aspect of this project, and its students will be the first to test these tools by using them to conduct research into Wikipedia. By working with the tools and methods created by the lab, the students will serve the dual goal of creating research while also expanding access to Wikipedia.

Past Contributions

Prior contributions to related academic and/or research projects and/or the Wikimedia and free culture communities. If you do not have prior experience, please explain your planned contributions.

Over the past years, we have conducted research of this type. We showed in a 2018 study published in the Journal of Biological Rhythms that Wikipedia’s articles managed to document the growth of knowledge related to circadian clocks, the field that won the 2017 Nobel Prize in Physiology or Medicine, and even to document the reformulation of its main paradigm over 15 years. Our more recent study showed how Wikipedia’s articles on COVID-19 during the first wave managed to fend off disinformation by relying on top-quality academic sources, and that articles’ reference lists can therefore be used for bibliometric analysis. The goal of this project is to consolidate these methods into a rigorous and coherent system, making use of other research methods and existing tools.

I agree to license the information I entered in this form excluding the pronouns, countries of residence, and email addresses under the terms of Creative Commons Attribution-ShareAlike 4.0. I understand that the decision to fund this Research Fund application, the application itself along with all the information entered by me in this form excluding the pronouns, countries of residence, and email addresses of the personnel will be published on Wikimedia Foundation Funds pages on Meta-Wiki and will be made available to the public in perpetuity. To make the results of your research actionable and reusable by the Wikimedia volunteer communities, affiliates and Foundation, I agree that any output of my research will comply with the WMF Open Access Policy. I also confirm that I have read the privacy statement and agree to abide by the WMF Friendly Space Policy and Universal Code of Conduct.