Wikimedia Blog/Drafts/Vachana Sanchaya: 11th century Kannada literature to enrich WikiSource/kn


This was a draft for a blog post that has since been published at


Vachana Sanchaya: 11th century Kannada literature to enrich WikiSource


Vachana sahitya is a form of rhythmic writing in Kannada that evolved in the 11th century C.E. and flourished in the 12th century as part of the Lingayatha movement. More than 259 Vachanakaras (Vachana writers) have compiled over 11,000 vachanas. 21,000 of these verses, published in the 15-volume "Samagra Vachana Samputa" by the Government of Karnataka, have been digitized. Two Wikimedians, along with the Kannada linguist and author O. L. Nagabhushana Swamy, are involved in the Unicode conversion, the corrections, and writing prefaces for these verses. The entire work is now available as a standalone project called "Vachana Sanchaya" and is ready to enrich Kannada WikiSource.

Vachana Sanchaya Website Screenshot

This project started a year ago, when Kannada Wikimedian Omshivaprakash was trying to help Professor O. L. Nagabhushana Swamy and the Kannada author and publisher Vasudhendra get easy access to the vachanas (verses) that now make up Vachana Sanchaya. Swamy had difficulty using the publicly available vachana content because the data was in a legacy ASCII encoding, which made searching the text a huge problem. I started helping by gathering information about the vachanas and converting them to Unicode, writing scripts for a few open source tools. Further discussions raised the requirement of getting thousands of vachanas into a database so that they would be easily searchable through an index. This meant building a platform that would support all these activities and help linguistic researchers and students, as well as members of the public interested in reading and studying Vachana literature.

With this idea, Omshivaprakash started designing the model and his colleague Devaraju started building it. In the meantime, I ran various scripts to fix errors in the conversion of ASCII text to Unicode and confirmed that the data was ready to be consumed by the modules developed for concordance. We spent weekends and holidays working on the project from home and synced up once in a while. With constant feedback and guidance from Swamy and Vasudhendra, we learned how researchers use a concordance of a text, which pointers help them, and what would make research on Vachana sahitya easier.

Omshivaprakash worked on the architecture of the platform, decided the infrastructure requirements and the Free and Open Source Software technologies used to keep the platform running, and managed the entire project. I provided critical hacks for the digitization and gave input through suggestions, feedback and QA.
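The ASCII-to-Unicode conversion described above can be pictured as a table lookup that always tries the longest legacy code first and records anything it cannot map, so that leftover errors can be fixed by hand. The sketch below is only an illustration of that idea, not the scripts the project actually used. The table `TOY_TABLE`, its romanized keys and the function `convert` are all assumptions made up for this example; a real legacy Kannada font would need its own, much larger mapping.

```python
import unicodedata

# Placeholder table: romanized keys stand in for legacy glyph codes.
# These pairs are illustrative assumptions, not any real font's mapping.
TOY_TABLE = {
    "ka": "\u0c95",        # KANNADA LETTER KA
    "r": "\u0cb0\u0ccd",   # KANNADA LETTER RA + VIRAMA
    "ma": "\u0cae",        # KANNADA LETTER MA
}


def convert(text, table=TOY_TABLE):
    """Convert legacy-coded text to Unicode, longest code first.

    Returns the converted string and a list of (position, character)
    pairs that had no mapping, so they can be reviewed by hand.
    """
    keys = sorted(table, key=len, reverse=True)
    out, unmapped, i = [], [], 0
    while i < len(text):
        for key in keys:
            if text.startswith(key, i):
                out.append(table[key])
                i += len(key)
                break
        else:
            # No code matched: keep the character and, unless it is
            # whitespace, record it for manual review.
            if not text[i].isspace():
                unmapped.append((i, text[i]))
            out.append(text[i])
            i += 1
    # Normalize so that equivalent sequences compare equal when searching.
    return unicodedata.normalize("NFC", "".join(out)), unmapped
```

Calling `convert("karma x")` with the toy table returns the Kannada spelling of "karma" followed by the untouched `x`, plus a one-item review list pointing at that `x`. Keeping the review list separate from the output is one way to rerun the conversion repeatedly while the table is being corrected.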

The working system

At present, the system has around 200,000 unique words in its repository. Solving a real-world problem in our free time was an extensive learning experience and food for thought; moreover, it was work on the Kannada language that needed prompt attention. Vachana Sanchaya is meant for research rather than being just a repository of text on the web. When you search for a word, you can see who has used it and in which vachanas. To make the results easier to read, we highlight the searched text in each vachana that is displayed. To repeat the search for a specific Vachanakara, you just need to click on the name in the graph on the results page. We used the MediaWiki jquery-ime input tool architecture, which let us provide a way to enter Kannada text in Unicode directly for search. So type directly and find results.
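The search behaviour described above (who used a word, in which vachanas, with the word highlighted and a per-author breakdown) is essentially a concordance built on an inverted index. The following is a minimal sketch under stated assumptions, not the platform's actual code: the records, the author labels, the bracket-style highlighting and the function names are all hypothetical, and the sample lines are placeholders rather than real vachanas.

```python
import string
from collections import Counter, defaultdict

# Placeholder records (id, author, text); not actual vachanas or authors.
RECORDS = [
    (1, "Author A", "ಸತ್ಯ ನದಿ"),
    (2, "Author B", "ಕರ್ಮ, ಸತ್ಯ"),
    (3, "Author A", "ಸತ್ಯ ಕರ್ಮ ಸತ್ಯ"),
]


def tokenize(text):
    # Split on whitespace and trim ASCII punctuation. A pattern such as \w+
    # is avoided: Python's \w does not match Kannada vowel signs or the
    # virama, so it would cut words into pieces.
    words = (w.strip(string.punctuation) for w in text.split())
    return [w for w in words if w]


def build_index(records):
    """Map each unique word to the set of record ids that contain it."""
    index = defaultdict(set)
    for vid, _author, text in records:
        for word in tokenize(text):
            index[word].add(vid)
    return index


def search(word, records, index):
    """Return highlighted hits and a per-author count of matching records."""
    hits = index.get(word, set())
    results, per_author = [], Counter()
    for vid, author, text in records:
        if vid in hits:
            # Wrap each matching token in brackets as a stand-in for
            # whatever highlighting the result page applies.
            marked = " ".join(
                f"[{t}]" if t.strip(string.punctuation) == word else t
                for t in text.split()
            )
            results.append((vid, author, marked))
            per_author[author] += 1
    return results, per_author
```

With `index = build_index(RECORDS)`, a call such as `search("ಸತ್ಯ", RECORDS, index)` returns the matching records with the word marked, and the `per_author` counter holds the kind of numbers a per-Vachanakara graph could be drawn from. `len(index)` gives the unique-word count for the toy data.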

Public response

We are glad to see people accessing vachanas through our Facebook, Twitter and Google+ channels; thousands read them every day, and it has become part of their daily routine. There have been more than 50,000 page views and 500,000 pageviews to our site in the first few months since the platform's public launch. Interestingly, commonly searched Kannada words such as "ಕರ್ಮ" (karma: work, deed), "ಸತ್ಯ" (sathya: truthfulness) and "ನದಿ" (nadi: river) return quick and easy results.

ಆಂಗೀರಸ, ಪುಲಸ್ತ್ಯ, ಪುಲಹ, ಶಾಂತ,

ದಕ್ಷ, ವಸಿಷ್ಠ, ವಾಮದೇವ, ನವಬ್ರಹ್ಮ, ಕೌಶಿಕ, ಶೌನಕ, ಸ್ವಯಂಭು, ಸ್ವಾರೋಚಿಷ, ಉತ್ತಮ, ತಾಮಸ, ರೈವತ, ಚಾಕ್ಷಷ, ವೈವಸ್ವತ, ಸೂರ್ಯಸಾವರ್ಣಿ, ಚಂದ್ರಸಾವರ್ಣಿ, ಬ್ರಹ್ಮಸಾವರ್ಣಿ, ಇಂದ್ರ ಸಾವರ್ಣಿ ಇವರು ಇಪ್ಪತ್ತು ಮಂದಿ ಪ್ರಪಂಚ ನಿರ್ಮಾಣ ಸಹಾಯ[ದ]ವರು. ಹತ್ತೊಂಬತ್ತು ಎಂದರೆ ಪುಣ್ಯನದಿಗಳು. ಅದು ಎಂತೆಂದಡೆ: ಗ್ರಂಥ

Future plans

Our system is extensible with respect to adding new features, and we have a review desk where researchers can help us review the content. Later we will also add references to vachanas from the various research works that have been done on this literature. The content is available to the public through an OpenData API and will be distributed in the public domain through WikiSource once the review work is complete. This will open up the system to students, developers, researchers and anyone interested in building linguistic tools for Kannada and other Indic languages. The system can evolve to host other literary works, rather than re-inventing the wheel for each such project. Vachana sahitya will further help us initiate Natural Language Processing (NLP) projects if more researchers get together to tag words, build glossaries and so on in the coming days. We can also meet the need for language tools such as spell checkers and grammar checkers by crowd-sourcing their development. The forthcoming projects under "Kannada Sanchaya" are Sarvagnana Vachanagalu and Dāsa Sanchaya, which are in the pipeline in their initial phase of work. Our idea is to extend this platform from Vyasa to Muddanna, and possibly to contemporary literature available in the public domain.

Palm leaf from the 11th and 12th centuries with vachanas



by Pavithra Hanchagaiah and Omshivaprakash HI. Edited by Subhashish Panigrahi