WikiIndaba conference 2018/Submissions/Growing Wikipedia Across Languages via Recommendation Systems

Submission no. 005
Title of the submission

Growing Wikipedia Across Languages via Recommendation Systems

Type of submission (lecture, panel, tutorial/workshop, roundtable discussion, lightning talk, birds of a feather discussion)


Note: I find that the most effective format for this kind of presentation is an interactive lecture: I present the state of the research for 20-25 minutes, then open the floor for discussion of how we can use and expand current technologies and research to serve the specific needs of Wikipedia/Wikimedia in African languages and of Africa-related content in all languages.

Author of the submission

User:LZia (WMF)

This research is joint work with Robert West (EPFL), Tiziano Piccardi (EPFL), Michele Catasta (Stanford University), Diego Saez-Trumper (WMF) and Jure Leskovec (Stanford University).

Language of presentation


E-mail address


Country of origin
United States
Affiliation, if any (organisation, company etc.)
Wikimedia Foundation
Personal homepage or blog
User:LZia (WMF)
Abstract (up to 300 words to describe your proposal)

Millions of articles are missing from Wikipedia across its more than 160 actively edited languages, and many of the articles that already exist have significant content gaps. In English Wikipedia alone, only 1% of the more than 5 million articles are rated Good or better, and 37% are stubs.

  • I will present the state of research on identifying missing Wikipedia articles across languages. I will show how we used data mining approaches to find which articles are missing in a given Wikipedia language, prioritize them, and build systems that recommend these missing articles to editors interested in contributing to them. I will present the results of an experiment showing that we can triple the article creation rate in French Wikipedia using the existing editor community and without compromising article quality.[1] I will also present GapFinder,[2] a tool designed to help editors find missing articles in their language based on their interests.
  • I will present the more recent work in this research, which was partly motivated by our conversations with The Africa Destubathon organizers in 2016. In this second part, I will show how we can use the content and structure of a given Wikipedia language, as well as of other Wikipedia languages, to find sections missing from an existing Wikipedia article and recommend them to editors for article expansion.[3]
  • Lastly, I will discuss what we know about the needs of Wikipedia readers from northern Africa and Sub-Saharan Africa (ongoing documentation), what we do not yet know and should set out to learn, and some of the issues around bias in content and how we can overcome them on our way to knowledge equity.
What will attendees take away from this session?
  • Learn about the research and technology that can help editors improve content in African languages and/or about African content.
  • Learn more about Wikipedia readers from Africa and what we know about their needs.
  • Take away at least one lesson from the session that they can put into practice to (help) improve Wikipedia. :)
Theme of presentation
For workshops and discussions, what level is the intended audience?
Length of session (if other than 25 minutes, specify how long)
25 minutes

I would like to request a 55-minute session, because closing knowledge gaps is a big topic that benefits from a more interactive presentation in which both the audience and the presenter can learn from each other. This is not intended to be a one-way broadcast. :)

Will you attend WikiIndaba if your submission is not accepted?


Slides or further information (optional)

I provided these in the text of the abstract.

Special requests
Is this Submission a Draft or Final?


Interested attendees

If you are interested in attending this session, please sign with your username below. This will help reviewers to decide which sessions are of high interest. Sign with a hash and four tildes. (# ~~~~).

  1. Anthere (talk) 00:10, 16 January 2018 (UTC)
  2. Lionel Scheepmans Contact French native speaker, sorry for my dysorthography 12:58, 31 January 2018 (UTC)
  3. Blossom Ozurumba Talk 15:11, 5 February 2018 (UTC)

Committee decision



  1. Wulczyn, Ellery; West, Robert; Zia, Leila; Leskovec, Jure (2016-04-11). "Growing Wikipedia Across Languages via Recommendation". arXiv:1604.03235 [cs]. 
  2. "GapFinder". Retrieved 2018-01-06. 
  3. "Research:Expanding Wikipedia articles across languages - Meta". Retrieved 2018-01-06.