Wikimedia Blog/Drafts/How the largest Unicode text based Odia library was born

From Meta, a Wikimedia project coordination wiki

POSTED ON JANUARY 14, 2015

Title ideas

  • Odia Wikisource digitizes classic books to create large Unicode text library
  • How the largest Unicode text based Odia library was born
  • Odia Wikisource organized its first public gathering after going live!
  • ...

Body

Group photo of the faculty and students of KISS Bhubaneswar who took part in the campus program to digitize books.
Group photo by Subhashish Panigrahi, freely licensed under CC-by-SA 4.0

In January 2013, some active Wikimedians from the Odia Wikipedia community submitted a request for approval of Odia Wikisource. Odia is one of the six Indian classical languages and is spoken by more than 40 million people worldwide. After nearly two years of persistent effort by the community, the project finally went live on October 20, 2014. This online library aims to archive text from early literature and from old books that are now out of print, under a license that allows reproduction, even for commercial use. Odia Wikisource stands out from conventional archives because it is lightweight, completely text-based and searchable, and accessible on both computers and mobile devices. Texts from books are retyped so that they appear in search engines. With thousands of books printed so far in this language, Odia Wikisource opens up a whole new world to readers and book lovers.

The incubation

Like other new Wikimedia projects, Odia Wikisource was first created as an incubator project. No community existed to digitize books for it, so existing Odia Wikipedians doubled the time they spent on the wiki to keep the project growing. Even for someone like Mrutyunjaya Kar, a veteran editor on many Wikimedia projects in four languages, it was never easy to devote so much time while balancing life and work.

Our language and literature are rich, and I think the Internet is the best place to open them to the entire world. Those who are in need of Odia books often don't get to access the books of their choice. The Odia Wikisource could be a platform for making the valuable texts available to people of all age groups.

— Mrutyunjaya Kar

Open Access To Oriya Books (OAOB), a book digitization project launched by the Odisha-based non-profit Srujanika in collaboration with the National Institute of Technology, Rourkela, and the literary organization Pragati Utkal Sangha, became even more valuable after Odia Wikisource took off. Currently, OAOB houses more than 200 books, a majority of which are in the public domain. A few of these books, which were old and not in a condition to be put through OCR (Optical Character Recognition, a technique used to create text from images of typed or written text), were retyped in Unicode on Odia Wikisource.

The author has been privileged to be part of this journey, which took a new shape when the Centre for Internet and Society's Access To Knowledge program (CIS-A2K) began an initiative to relicense copyrighted books under a Creative Commons Attribution-ShareAlike (CC-BY-SA) license. In the first phase, thirteen books by three authors were relicensed under CC-BY-SA 4.0. Later, 67 more books by seven different authors were relicensed under the same license. Mrutyunjaya played a significant role in acquiring permission from two of these authors. This is the highest number of resources ever relicensed under a Creative Commons license to advance the open access movement in the Odia language.

The documentary "Odia: Silalekharu mobile" captures the views of notable authors, linguists and copyright holders on the digitization of books and on Odia Wikisource. It was screened during "Odia Wikisource Sabha 2014", an e-publishing seminar in Bhubaneswar, and an Odia Wikisource workshop in New Delhi.
"Odia-Silalekharu mobile (Documentary)" Produced by Subhashish Panigrahi, freely licensed under CC-by-SA 3.0.



Digitizing the classic Odia book Bhagabata

Odia Bhagabata is one of the early writings that have reached millions of readers over the centuries, since the beginning of the Bhagabata Tungi culture in Odisha. Authored by Jagannatha Dasa in the 14th century, this twelve-volume work had never before been available in Unicode on the Internet. Bhagabata has gone beyond being just a book; people even read the text aloud to a dying person. A version typed in several legacy fonts was available on the portal Odia.org, which came in handy when looking for a digital version. Many people followed the digitization of the book with emotional interest. New encoding converters were built, and old converters were modified, to cater to the needs of this voluminous work. After encoding conversion, proofreading and formatting by at least eight new Wikisourcers, the classic work was digitized.

An "Odia Wikisource Handbook" for new contributors, which gives a brief introduction to enabling typing in Odia, input methods and digitizing books on Odia Wikisource.
by Subhashish Panigrahi, freely licensed under CC-by-SA 4.0.

Odia Wikisource@campus, Kalinga Institute of Social Sciences, Bhubaneswar, Odisha, India

To engage with the students of the Kalinga Institute of Social Sciences (KISS), an institution in Bhubaneswar, the capital of the Indian state of Odisha, and to enrich Wikimedia projects in South Asian languages, CIS-A2K signed a Memorandum of Understanding in January 2014. This materialized in September, when a three-month campus program began. Nine faculty members, working under a coordinator, were trained to digitize books on Odia Wikisource. The faculty then formed nine teams, each with four to five students from undergraduate and master's classes. Most of the students and some of the faculty had never typed in Odia before taking part in the program. Despite holidays and examinations, these nine teams digitized about four books by the Odia author Dr Jagannath Mohanty. All of the students speak various aboriginal languages as their native tongues, and Odia is a link language for them; because it is the official language of the state, they also learn Odia and are educated in it. Learning to type in Odia should improve their job opportunities, for instance in state government offices.

Contributing to Odia Wikisource was really helpful for us. This will also help us to document more about our own communities. Stories of our linguistic and cultural heritage have never been told to the world.

— Susanta Majhi, student and Wikisourcer, KISS

Public gathering "Odia Wikisource Sabha 2014"
Odia Wikimedians with invited guests during "Odia Wikisource Sabha 2014"
by Saroj Kumar Behera, freely licensed under CC-by-SA 4.0.

To educate more people about the Odia Wikisource project, the Odia Wikimedia community from Odisha organized a public gathering, "Odia Wikisource Sabha 2014", on November 28, 2014. Speaking during the event, poet and thinker Haraprasad Das suggested being selective in accepting books for relicensing and digitization, rather than making a blanket move to accept all books. Das also emphasized creating a team of language experts to help with curation, and having computers in every literary center to teach Odia typing and Wikipedia/Wikisource editing. "Being part of this historical moment of seeing so many aboriginals contributing to Odia language is my good luck," Das said. Soumya Ranjan Patnaik, founder and editor of the Odia daily The Sambad, who joined as the chief speaker, announced a collaborative project starting in the new year: a competition among school students, who will receive awards based on their Odia Wikipedia article writing skills. "Language should never be a barrier for anyone. Odia Wikisource is a democratic library, unlike the conventional libraries set up by the government," Patnaik told the audience.

Subhashish Panigrahi, Wikimedian, and Programme Officer, Access To Knowledge.

Summary

Volunteers from the Odia language Wikipedia joined forces to archive text from early literature and old books now out of print. Odia Wikisource is lightweight, completely text-based and searchable — and accessible on computer and mobile devices. With thousands of books printed so far in this language, Odia Wikisource opens up a whole new world to readers and book lovers.

Notes

Ideas for social media messages promoting the published post:

Twitter (@wikimedia/@wikipedia):


* Odia Wikisource digitizes classic books to create large Unicode text library (link)

* Odia Wikisource reaches masses celebrating over 50 Wikisourcers and over 100 books after going live in Oct'14

Facebook/Google+

  • Volunteers from the Odia language Wikipedia joined forces to archive text from early literature and old books now out of print. Odia Wikisource is lightweight, completely text-based and searchable — and accessible on computer and mobile devices. With thousands of books printed so far in this language, Odia Wikisource opens up a whole new world to readers and book lovers. (link)
  • More Odia-language readers are slowly getting on the Internet, and the limited availability of content has always been a roadblock for them. Where Odia Wikipedia was the best place for accessing encyclopedic information, Odia Wikisource has opened a wider way to read books. Books from notable authors are now being relicensed under Creative Commons licenses to make them available on Wikisource. Over 40 students from aboriginal communities learned Odia typing and digitized books during a campus program. The three-month-old project has reached a wider audience after going live.