CIS-A2K/Konkani Wikipedia/Issues & Solutions

From Meta, a Wikimedia project coordination wiki

This page was created to discuss possible solutions to the problem of multiple scripts faced by Konkani Wikipedia. The community needs to reach consensus on how to deal with this issue and select an optimal solution. Based on this discussion, CIS-A2K will alter its Konkani Wikipedia Work Plan for the next quarter.

Issues faced by Konkani Wikipedia

Konkani is a language spoken primarily by people living in Goa and in the neighboring states on the western coast of India (also known as the Konkan belt), including some pockets in Maharashtra, Karnataka and Kerala. These neighboring states are all places to which speakers from Goa may have migrated over the past five centuries (Garry, 2001). Each region has a different dialect, pronunciation style, vocabulary and tone, and sometimes significant differences in grammar.

Multiple Scripts: Konkani is one of the few Indian languages that is written in multiple scripts. These are Devanagari, Roman, Kannada, Malayalam and Perso-Arabic. The state of Goa uses Devanagari as the official script, and thus Konkani written in Devanagari is widely used in education and administration.

Religious angle to the script usage: The multiple script issue ties in with some religious aspects as well. The Goan Hindus use the Devanagari script in their writings while the Goan Christians use the Roman script. The Saraswats of Karnataka use the Devanagari script in North Kanara district and the Kannada script in Udupi and South Kanara. Malayalam script is used in Kerala, but now there is a move to use the Devanagari script there.

Splintered Population: The total number of Konkani speakers seems to have remained remarkably stable for over a century. This is borne out by the census reports over the years. The 2001 Census of India put the number of Konkani speakers in India at 2,489,015. Of these, around 6 lakh were in Goa, 7 lakh in Karnataka, 3 lakh in Maharashtra and 6 lakh in Kerala, and the rest live outside of India, either as expatriates or as citizens of other countries (NRIs).[4]

Other than these, the following issues often come up in discussions about the Konkani language.

  • Fragmentation of Konkani into various, sometimes mutually unintelligible dialects.
  • Strong bilingualism of Konkani Hindus in Goa and coastal Maharashtra with Marathi.
  • Inadequate opportunities and venues to formally study Konkani in schools and colleges, especially for the Konkani community that lives outside of Goa. This is seen as a major cause of the decline of the Konkani language among new-generation speakers and the expatriate Konkani community.

Suggesting Solutions to Multiple Script Issue

This section was documented/written on 8 January 2014

A similar problem was faced by Kashmiri Wikipedia, which used the Perso-Arabic, Sharada and Devanagari scripts. Punjabi has the Gurmukhi and Shahmukhi scripts, of which the former is used in India and the latter in Pakistan. The Chinese language has two major writing systems: simplified and traditional Chinese. Other Wikipedias that have faced a similar challenge include Uyghur, Azerbaijani and Korean. Now, let's look at how some of these language Wikipedias have tackled this in the past:

Possible Solution 1: Automatic conversion system.

  • A plug-in could be built into the server end of the language Wikipedia to automatically transliterate content from one script into another.
  • Other wikis using this solution are the Chinese, Serbian, Kazakh and Kurdish Wikipedias. To give an example, an automatic conversion system has been running successfully on Chinese Wikipedia since 2004 and has been well received by the community. In addition to Chinese Wikipedia, Chinese Wiktionary, Wikiquote and Wikibooks also have conversion systems.
  • CIS-A2K's stance for Konkani Wikipedia: Automatic transliteration from one script to another might not work for Konkani Wikipedia, as there are differences in dialect and there is no ready tool available for converting one script into another (for example, transliterating from Roman to Devanagari or from Roman to Kannada).

Possible Solution 2: Partial automatic conversion system.

  • A plug-in could be used that transliterates between at least two of the writing systems in use, even if it does not cover all of them.
  • Other wikis using this solution are the Tajik, Uzbek and Gan Wikipedias. To give an example, Tajik Wikipedia currently has an automatic conversion system for two of its writing systems (Cyrillic and Latin) but not for Perso-Arabic.
  • CIS-A2K's stance for Konkani Wikipedia: This could be a possible solution for Konkani Wikipedia if the community decides that they’d like to have transliteration tools installed at least for the Indian scripts.
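
To make the idea of partial conversion between Indian scripts concrete, here is a minimal sketch in Python. It relies on the fact that the Unicode blocks for Devanagari and Kannada follow the same ISCII-derived layout, so most characters sit at the same position within their block. The function name devanagari_to_kannada and the constants are hypothetical and chosen for illustration. This is not how the conversion systems on the wikis mentioned above are implemented, and it ignores the dialect and spelling differences between Konkani as written in each script.

```python
import unicodedata

# Unicode block boundaries (from the Unicode Standard).
DEVANAGARI_START, DEVANAGARI_END = 0x0900, 0x097F
KANNADA_START = 0x0C80

# Both blocks follow the ISCII layout, so a fixed offset maps most characters.
OFFSET = KANNADA_START - DEVANAGARI_START


def devanagari_to_kannada(text: str) -> str:
    """Naively shift Devanagari code points into the Kannada block.

    Hypothetical helper for illustration only: characters with no assigned
    Kannada counterpart (for example the danda) are left unchanged.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if DEVANAGARI_START <= cp <= DEVANAGARI_END:
            candidate = chr(cp + OFFSET)
            # Only substitute when the shifted position is an assigned character.
            if unicodedata.name(candidate, None) is not None:
                ch = candidate
        out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    # The word "Konkani" written in Devanagari.
    print(devanagari_to_kannada("कोंकणी"))
```

A character-by-character shift like this preserves the Devanagari spelling exactly, including vowel length, which may differ from how the same word is conventionally spelled in Kannada script. It also does nothing for Roman script, where the mapping is not one-to-one. Any real tool would need rules agreed on by the community.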

Possible Solution 3: Multiple writing system.

  • Have multiple articles in different scripts about the same topic. For example, have multiple articles about India in Konkani Wikipedia - one in Devanagari script, another in Roman script and yet another in Kannada script.
  • Some of the other wikis considering adopting a multiple writing system in the near future are the Korean and Javanese Wikipedias.
  • CIS-A2K's stance for Konkani Wikipedia: This could be the short-term solution for Konkani Wikipedia. It is the approach currently being used in Konkani Wikipedia in incubation.

Possible Solution 4: Create separate wikis for each script.

  • Create separate wikis for each script, at least those which prove to be active.
  • Separate wikis were created for Punjabi-Gurmukhi and Punjabi-Shahmukhi.
  • CIS-A2K's stance for Konkani Wikipedia: This could potentially be the long-term solution for Konkani Wikipedia, i.e. to have a different wiki for each active writing system: Konkani-Roman, Konkani-Devanagari and Konkani-Kannada. If there is an interested, active community to create content in a particular script, we could move that content to a new project in due course. As things stand, the Devanagari script has been the most active in the recent past, followed by Romi and Kannada, in that order.