CIS-A2K/Events/Indic Wikisource Community Consultation 2018/Report

From Meta, a Wikimedia project coordination wiki


Indic language computing had a long-standing need for Optical Character Recognition (OCR). No OCR of adequate quality was available for Indic languages before 2015. Most of the Indic Wikisource subdomains were created between 2007 and 2011, but because no OCR was available, the Indic Wikisource community used to type out whole books or import Unicode text from other, unreliable sources. In 2015, the release of Google Drive OCR relieved the Indic community of the typing era.

Later, Shrinivasan T developed the OCR4wikisource script to use Google Drive OCR as a bot. Since the OCR was implemented, Indic Wikisource has made a lot of progress. But we realized that there should be a common platform where we can share our knowledge. After one month of planning, we organized the Indic Wikisource Community Consultation 2018 in Kolkata. This is the first such consultation at this scale, convened by the CIS-A2K team.

[Figure: Indic Wikisource status over the last 2 years (proofread statistics)]

The meeting was attended by volunteers representing the Assamese, Bangla, English, Gujarati, Hindi, Kannada, Malayalam, Marathi, Odia, Punjabi, Telugu, and Sanskrit language Wikisource communities.


  • Ananth Subray (Kannada)
  • Bodhisattwa (Bengali)
  • Hrishikesh Sen (English)
  • Gurlal Maan (Punjabi)
  • Gitartha Bordoloi (Assamese)
  • Pooja Jadhav (Marathi)
  • Pankajmala Sarangi (Odia)
  • Shubha (Sanskrit)
  • Sushant Savla (Gujarati)
  • Ranjith Siji (Malayalam)
  • Ajit Kumar Tiwari (Hindi)
  • Ramesam54 (Telugu)
  • Jayprakash (Indic Tech team)
  • Chinmayee Mishra (Odia)
  • Tito Dutta
  • Tanveer Hasan
  • Subodh Kulkarni
  • Jayanta Nath


We began the discussion on day zero by setting out the main aims of the consultation and asking what the participants wanted from the programme. The discussion started at 6 PM and ended at 10 PM. Afterwards, we summarized the discussion and set the agenda for the two days, which came from the participants themselves. The CIS-A2K team arranged travel and accommodation for all participants, including a night stay between day zero and the second day, to ensure that the programme started on time.

Day one started with an introduction to Wikisource by me, covering the Wikisource workflow: adding text, finding the source, basic copyright checking, creating Index pages, running OCR on the pages, proofreading, layout and typography, validation, transclusion, and finishing touches. Later, Hrishikes Sen demonstrated each segment in detail. Bodhisattwa (Bengali) demonstrated Wikisource tools such as IA-UPLOAD, Vicuna Uploader, URL2COMMONS, and the Fill index Gadget, and all participants tried them hands-on. Bodhisattwa also showed the Bengali Wikisource promotional videos.

Day two started with a session on using Google Drive OCR without a bot, a solution developed by Jayprakash (Indic Tech team). This was followed by presentations on the OTRS process by Jayanta Nath, the Wikisource roadmap by Tanveer Hasan, institutional partnerships by Subodh Kulkarni, and transclusion in Wikisource by Sushant Savla. The biggest achievement of the meeting came on the second day, when Jayprakash took the lead in clearing the Wikisource technical backlog.

Some ideas also came up during Tanveer's session, including awareness, outreach, follow-ups, and evaluation. A report on the meeting was published in Asomiya Pratidin. Some feedback from the participants can be found here.

Project goals

The Indic Wikisource Community Consultation 2018 is a meeting of the Indic language Wikisource communities. The current state of affairs in supporting the growth, health, and motivation of Wikisource volunteers in India leaves much to be desired. The meeting is an attempt to reach agreement on a roadmap for a future in which our resources are better utilized, our volunteers are better served, and progress on our mission is more steadily attained.

Lessons learned

  • What worked well?
    • The meeting and consultation brought together Indic Wikimedians in Kolkata.
    • Awareness about Wikisource and its importance.
    • Engaging activities for attendees.
    • A boost to Wikisource skills.
  • What did not work so well?
    • The meeting and consultation focused on introducing Wikisource, its activities, guidelines, and workflow, and on sharing the experience gathered so far. We held a session to help attendees understand how they can edit, but an edit-a-thon for new editors was not conducted. We will do that in our future activities.
    • We faced a few issues in arranging internet access.
    • We realized that some mechanism should be adopted next time to keep new editors engaged in activities.

  • What would you do differently next time?
    • We would like to organize edit-a-thons and workshops to improve content.
    • Next time we will try to organize Wikisource activities in schools and colleges. We would also like to hold workshops, edit-a-thons, and other activities to bridge the gender gap in Wikisource.