From Meta, a Wikimedia project coordination wiki

Welcome back from Wikimania 2018!





Option 3: New Creation: What was one useful outcome that was created at the event for the Wikimedia movement?

I primarily worked on developing tools for Wikisource (with a focus on Bengali) during the hackathon. Most of the work was done with inputs from or following requirements given by User:Bodhisattwa.

  • British Library EAP scripts: The British Library Endangered Archives Programme (EAP) hosts several public-domain (PD) books that can be uploaded to Commons. I wrote a script to download books from the EAP at Wikimania last year, but it stopped working after the website changed the way it exposes images. With input from User:Bodhisattwa and other members of the Bengali Wikisource community, I wrote two scripts, both available on GitHub and described below. In general, the scripts read a set of input "collections", scrape the JPG files at the highest resolution available, and stitch them into a PDF.
    • The first script downloads individual books and uploads them to Wikimedia Commons (example). A number of configuration options are available, including resolution and rotation. Note that it has two Bengali-specific settings (lines 171, 172 and 177), although these are minor (related to the page description and category) and can easily be removed.
    • The second script mass-downloads books from the EAP, given a set of collections. This is useful when the copyright status of the books is in question, and it is much faster than downloading and verifying books individually. It has no language-specific settings and can probably be useful for other Indic-language Wikisources, given the PD content the EAP has.
  • ASI co-ordinates: Since the West Bengal Wikimedians User Group decided to organize Wiki Loves Monuments 2018 in India, there were some tasks related to the list of monuments, such as updating their coordinates on Wikidata. Using data from "Bhuvan", the geo-platform of ISRO, I wrote a script that extracts the data from an available KML file into a CSV. Because of inconsistencies in numbering conventions, the data needs a little more work before it can be ported to Wikidata, but this should be possible.
  • Bangla Academy downloader: Similar to the EAP mass downloader, but for Bangla Academy. It currently works only for files that already have a PDF on the website; I will add a JPG-to-PDF converter soon. This is available on GitHub.
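
The KML-to-CSV step mentioned under ASI co-ordinates could look roughly like the sketch below. This is not the script written at the hackathon, only a minimal illustration using the Python standard library. It assumes each monument is stored as a KML Placemark with a name and a Point; the actual layout of the Bhuvan file may differ. The function names, CSV column names and sample data are all made up for the example. One detail worth noting is that KML stores coordinates in longitude, latitude order, so they are swapped when writing the CSV.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Standard KML 2.2 namespace; the real Bhuvan file may declare a different one.
KML_NS = {"kml": "http://www.opengis.net/kml/2.2"}

# Made-up sample data, only to make the sketch runnable.
SAMPLE_KML = """<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
    <Placemark>
      <name>Example monument A</name>
      <Point><coordinates>88.3426,22.5448,0</coordinates></Point>
    </Placemark>
    <Placemark>
      <name>Example monument B</name>
      <Point><coordinates>87.3215,23.0720</coordinates></Point>
    </Placemark>
    <Placemark>
      <name>Placemark without a point</name>
    </Placemark>
  </Document>
</kml>"""


def kml_to_rows(kml_text):
    """Return (name, latitude, longitude) tuples for every point Placemark."""
    root = ET.fromstring(kml_text)
    rows = []
    for placemark in root.iterfind(".//kml:Placemark", KML_NS):
        name = placemark.findtext("kml:name", default="", namespaces=KML_NS).strip()
        coords = placemark.findtext(
            ".//kml:Point/kml:coordinates", default="", namespaces=KML_NS
        ).strip()
        if not coords:
            # Skip placemarks that carry no point geometry.
            continue
        # KML order is longitude,latitude[,altitude].
        lon, lat = coords.split(",")[:2]
        rows.append((name, float(lat), float(lon)))
    return rows


def rows_to_csv(rows):
    """Serialize the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["name", "latitude", "longitude"])
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(rows_to_csv(kml_to_rows(SAMPLE_KML)), end="")
```

Matching the resulting rows to existing Wikidata items would still need the numbering inconsistencies mentioned above to be resolved first; the sketch does not attempt that.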


I met a lot of people at Wikimania, and I'm only listing some of them here.

Anything else