Kiwix/Wikipedia on demand

From Meta, a Wikimedia project coordination wiki

Project idea

Wikipedia can be a popular tool for teaching and is used in many settings, including in Switzerland. Some of these deployments are done offline so that unlimited internet access does not distract students. In other cases (particularly in prisons), the absence of connectivity is simply a precondition for access. In these cases, the default tool is Kiwix, the offline reader.

There can be, however, some reluctance to give access to the full Wikipedia within an educational context, as much of its content is not, in fact, relevant to education but covers other topics such as sports or pop culture. There is also the question of size: simply put, too much content can be impractical to download and store when resources are limited.

Alternative distributions have emerged over time: Wikipedia for Schools, initially sponsored by SOS Children’s Villages, has been popular and was recently updated through a partnership with Arizona State University. It is also possible to download Medicine, Maths, Physics and Chemistry subsets that deliver only content pertaining to these topics. Generating these, however, depends on the willingness of Kiwix (or its partners) to engage in curation, which is tedious and might not answer all needs appropriately.

We want to innovate and improve this concept so as to expand the choice of project-based selections. This novel approach would allow users to automatically generate bespoke Wikipedia article subsets that can cover any topic of their choice, in any language.

Tech stack

The Wikipedia on Demand service would use Wikidata to generate article lists (and article compilations) related to any given subject, provided that subject exists as a Wikidata item. One could therefore create a ZIM file containing all articles related to Climate change (Q125928) or the History of Switzerland (Q208761). The possible combinations are practically endless, so everyone could have their own bespoke Wikipedia selection to use for all kinds of educational or entertainment purposes.
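As a rough illustration of how an article list could be derived from a Wikidata item, the following Python sketch builds a SPARQL query for the public Wikidata Query Service and extracts article titles from its JSON response. It is not the project's implementation: the function names are hypothetical, and the choice of properties used to decide that an item is "related to" a topic (P361 "part of", P921 "main subject", P279 "subclass of") is an assumption made for the example.

```python
import json
import urllib.parse
import urllib.request

# Public Wikidata Query Service endpoint.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def build_query(topic_qid: str, lang: str = "en", limit: int = 500) -> str:
    """Build a SPARQL query listing Wikipedia articles linked to a topic.

    Assumption: an item counts as related when it points at the topic via
    P361 (part of), P921 (main subject) or P279 (subclass of).
    """
    if not (topic_qid.startswith("Q") and topic_qid[1:].isdigit()):
        raise ValueError(f"not a Wikidata item ID: {topic_qid!r}")
    if not lang.replace("-", "").isalnum():
        raise ValueError(f"not a language code: {lang!r}")
    return f"""
SELECT DISTINCT ?article ?title WHERE {{
  ?item wdt:P361|wdt:P921|wdt:P279 wd:{topic_qid} .
  ?article schema:about ?item ;
           schema:isPartOf <https://{lang}.wikipedia.org/> ;
           schema:name ?title .
}}
LIMIT {int(limit)}
""".strip()


def parse_titles(response: dict) -> list[str]:
    """Extract a sorted, de-duplicated title list from a SPARQL JSON result."""
    bindings = response["results"]["bindings"]
    return sorted({row["title"]["value"] for row in bindings})


def fetch_titles(topic_qid: str, lang: str = "en") -> list[str]:
    """Run the query against the live endpoint (requires network access)."""
    params = urllib.parse.urlencode(
        {"query": build_query(topic_qid, lang), "format": "json"}
    )
    request = urllib.request.Request(
        f"{WDQS_ENDPOINT}?{params}",
        headers={
            "Accept": "application/sparql-results+json",
            "User-Agent": "wikipedia-on-demand-sketch/0.1 (example)",
        },
    )
    with urllib.request.urlopen(request) as reply:
        return parse_titles(json.load(reply))
```

Calling `fetch_titles("Q125928", "fr")` would, under these assumptions, return French Wikipedia titles for items linked to the climate change item. A selection that crosses two topics could then be obtained by intersecting two such lists.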

The corresponding ZIM file would be readable via Kiwix, the offline reader. Part of the technology already exists in the WP1 evaluation tool, which Wikimedia projects use to automatically assess articles. This, in turn, helps them coordinate their efforts (more information is available on the Wikimedia Foundation tech blog).

Impact

  • The project is aimed at Wikipedia editors: it would strengthen the user experience of the WP1 assessment bot. Not only would WikiProjects be able to see which articles need work, but the new tool would also let them define new targets across or within projects (e.g. History × Switzerland, as in the example above).
  • The tool could also be used by other projects for their own purposes: education programs, for instance (see, e.g., Wikipedia for Schools), but also GLAMs, which would be able to automatically generate article selections based on their collections. The Makumbusho App4Museums project from the World Heritage User Group aims to give chapters and user groups a new engagement tool for GLAMs. It would demonstrate the value of participating in common projects by providing partner organisations with a bespoke Wikipedia selection that they could share with their visitors.

As with anything that is made easier to access and more digestible, this should help increase the distribution and re-use of Wikipedia content in educational settings.

Deliverables and Milestones

  • Article selection (Python, 12 days)
    • Web platform at https://wp1.openzim.org
    • Authentication system based on Wikimedia SLO
    • REST API
    • Operational running of WP1 instance
  • ZIM generation (Python, 10 days)
    • Content editor
    • REST API
    • Automatically create dedicated recipes
    • S3 upload
    • Interconnection WP1/Zimfarm
    • Operational running of Zimfarm instance to generate the .zim files
  • Deployment / Testing / QA (3 days)

Total: 25 days @ CHF 500.- : CHF 12’500.-

Project Management (20%) : CHF 2’500.-

Wikipedia on Demand estimated cost: CHF 15’000.-
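The figures above can be cross-checked with a few lines of Python. This is only a sanity check on the arithmetic stated in this section; the variable names are illustrative.

```python
# Day counts and rates as stated in the deliverables list above.
DAY_RATE_CHF = 500
PROJECT_MANAGEMENT_PERCENT = 20

days = {
    "Article selection": 12,
    "ZIM generation": 10,
    "Deployment / Testing / QA": 3,
}

total_days = sum(days.values())
development_chf = total_days * DAY_RATE_CHF
management_chf = development_chf * PROJECT_MANAGEMENT_PERCENT // 100
total_chf = development_chf + management_chf

print(f"{total_days} days, CHF {development_chf} + CHF {management_chf} = CHF {total_chf}")
```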

All code is free and open-source (GFDL-3) and publicly available in a dedicated GitHub repository.

Deployment : around Q3/2022