CIS-A2K/Indic Languages/Numerals in Indic Languages & Indic language Wikipedias

From Meta, a Wikimedia project coordination wiki

As all of you know, most Indic languages use unique scripts for writing the language while some share a common script, such as Devanagari (for Hindi, Marathi, Sanskrit, Nepali and Bhojpuri).

A very interesting phenomenon in Indic languages is the usage of numerals. Every Indic language uses its own script for text, but the situation is very different when someone needs to write numerals. Even though most Indic scripts have their own glyphs for numerals (see the following table), many speakers use Arabic numerals (or Indo-Arabic numerals) instead of the language's own numeral glyphs. (For those who do not know, the usual name for the 0, 1, 2, 3 … 9 that we use in our daily lives is Arabic numerals! They go by many names: Arabic numerals, West Arabic numerals, Hindu numerals, Indo-Arabic numerals, and Hindu-Arabic numerals, to name a few :), but many of us refer to them as English, Roman, or international numerals.)

Here are the numerals of the most widely used Indic numeral systems.


Arabic/Indo-Arabic/Hindu Arabic 0 1 2 3 4 5 6 7 8 9
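Asomiya (Assamese) ০ ১ ২ ৩ ৪ ৫ ৬ ৭ ৮ ৯
Bengali ০ ১ ২ ৩ ৪ ৫ ৬ ৭ ৮ ৯
Devanagari ० १ २ ३ ४ ५ ६ ७ ८ ९
Gujarati ૦ ૧ ૨ ૩ ૪ ૫ ૬ ૭ ૮ ૯
Gurmukhi ੦ ੧ ੨ ੩ ੪ ੫ ੬ ੭ ੮ ੯
Kannada ೦ ೧ ೨ ೩ ೪ ೫ ೬ ೭ ೮ ೯
Malayalam ൦ ൧ ൨ ൩ ൪ ൫ ൬ ൭ ൮ ൯
Oriya ୦ ୧ ୨ ୩ ୪ ୫ ୬ ୭ ୮ ୯
Tamil ௦ ௧ ௨ ௩ ௪ ௫ ௬ ௭ ௮ ௯
Telugu ౦ ౧ ౨ ౩ ౪ ౫ ౬ ౭ ౮ ౯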
Urdu ۰ ۱ ۲ ۳ ۴ ۵ ۶ ۷ ۸ ۹

NOTE: Ancient Tamil did not have a zero in its numeral system (just like Roman numerals), and numbers were written in a manner similar to Roman numerals. Zero was added to the Tamil numerals when printing began and people started using place-value notation.

Those of you familiar with any one of these languages will quickly realize that many of us do not use these numerals in our daily life, as Arabic numerals are now the norm in many languages. There are some exceptions though. More details are given in the next few sections.

Current status of numerals in Indic languages

Non-Devanagari languages

All Dravidian languages except Kannada (to some extent) have transitioned to Arabic numerals, with the media, the printing industry, and entertainment using them almost entirely. Even school textbooks have gone down this road. Tamil, Telugu, and Malayalam now use Arabic numerals almost exclusively for everything. The majority of the current generation of speakers of these languages cannot even identify the numerals of their own mother tongue. The Wikipedias in these languages (Tamil, Telugu, and Malayalam) have also moved completely to Arabic numerals.

Kannada is an exception in this regard. Having lived in Bangalore for many years, I have seen Kannada numerals used for bus numbers and elsewhere. (It is actually thanks to this use of Kannada along with English in public spaces that I learned to read and write Kannada quickly, numerals included!) Some Kannada textbooks, such as the 10th standard textbooks, also use Kannada numerals. (Incidentally, isn't it wonderful that you can download these for free from an official website!) However, this is not the case in the media (print, online, or TV), where most outlets use Arabic numerals.

So while the Malayalam, Tamil, and Telugu Wikipedias use Arabic numerals, Kannada Wikipedia uses its own numerals. However, I have observed that new editors coming to Kannada Wikipedia mostly tend to start off with Arabic numerals, as it takes a bit of time for them to realise that the preferred numerals on Kannada Wikipedia are the Kannada ones. But it is good to note that most Kannadigas are familiar with Kannada numerals, which is not the case for speakers of the other Dravidian languages. On Kannada Wikipedia, a few community members are of the opinion that Arabic numerals should be used for articles related to science, mathematics, and technology.

Interestingly, Bengali and Assamese speakers also use their own numerals everywhere, and their Wikipedias use them too. I got the opportunity to see Assamese numerals in use in newspapers, books, and elsewhere when I visited Assam for the Assamese Wikipedia workshops. Wikipedians from these two languages are making major efforts to make sure that all complex Wikimedia templates also support their numerals. This is even more laudable when one considers that support for non-Arabic numerals in complex programs is currently very limited. The work they are doing will benefit all languages that use non-Arabic numerals.
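
As a rough illustration of what numeral support in a template or program involves, here is a minimal Python sketch. It is not MediaWiki code, and the names `to_script_digits` and `add_years` are hypothetical. It assumes the Bengali/Assamese digits occupy U+09E6 to U+09EF in Unicode, and it relies on the fact that Python's `int()` accepts any Unicode decimal digit.

```python
# Hypothetical helpers for illustration only; not part of MediaWiki.
ASCII_DIGITS = "0123456789"
# Bengali digits (also used for Assamese), built from code points U+09E6..U+09EF.
BENGALI_DIGITS = "".join(chr(0x09E6 + i) for i in range(10))


def to_script_digits(text: str, script_digits: str) -> str:
    """Replace each ASCII digit in text with the matching script digit."""
    return text.translate(str.maketrans(ASCII_DIGITS, script_digits))


def add_years(year_text: str, delta: int, script_digits: str) -> str:
    """Do arithmetic on a year written in script digits, answer in script digits."""
    year = int(year_text)  # int() parses any Unicode decimal digit
    return to_script_digits(str(year + delta), script_digits)


year = to_script_digits("2013", BENGALI_DIGITS)
print(year, "->", add_years(year, 1, BENGALI_DIGITS))
```

The point of the sketch is that a program has to convert at both ends: parse the local digits into a number before calculating, then render the result back in the local digits. A template that skips either step works only with Arabic numerals.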

The case is much the same with Gujarati, Odia, and Punjabi, whose speakers use their respective numerals in most places, even though TV channels and newspapers in some cases use Arabic numerals.


Devanagari Languages

By Devanagari languages I mean the languages that use the Devanagari script. Some major languages that use Devanagari are Hindi, Marathi, Nepali, Sanskrit, Bhojpuri, and so on. The majority of urban speakers of Devanagari languages prefer Arabic numerals over Devanagari numerals when they write the language. This is widely prevalent in movies, newspapers, books, online, and so on.

Devanāgarī numerals


However, in the Wikimedia world, the Marathi, Nepali, and Sanskrit communities have decided to stick to Devanagari numerals. So, except for Hindi, the Wikipedias in these languages follow Devanagari numerals. On Hindi Wikipedia, the community uses both kinds of numerals simultaneously, which is complicating the situation.

The Debate in Hindi

The situation in the Hindi wiki community is a bit complex. The community is divided over which numerals to use: Devanagari or Arabic.

Some say that since we use Arabic numerals everywhere else, the same should be followed on Wikipedia too. They quote official communication from the Government of India which suggests that Arabic numerals should be used (though it refers to them as the "international form of Indian numerals"), and refer to a Government notification in this regard. They also talk about a proposal from the Government further reinforcing this. They also ask: if South Indian languages can use Arabic numerals, why can't Hindi? There have also been a few Government decisions in favor of Arabic numerals (and the Romanization of Hindi).

Apart from this, the Hindi film industry has almost completely moved to Roman letters for most film publicity. For example, I can't remember the last time I saw a Hindi movie poster in Devanagari. A search for Hindi film posters shows posters only in the Roman alphabet, and Hindi film credits are also now in English.

So the main argument is that most people who have access to media and the internet prefer Arabic over Devanagari numerals, and so the former should be adopted as the standard.

An Indian Railways bedsheet with Devanagari numerals printed on it

The counterpoint from Wikipedians who argue for Devanagari numerals is that, in spite of the official stance, neither the people nor indeed Government authorities have completely abandoned Devanagari numerals, which are still commonly used across many Hindi-speaking areas of North India. For instance, official Government bodies like Indian Railways or Delhi Metro, and even some book publishers, use Devanagari numerals. According to them, unlike urban populations, the majority of the Hindi-speaking rural population in UP and Bihar prefers Devanagari numerals when writing Hindi. They further point out that a language or script is not owned by the Government but by the speakers of that language. To that extent, they suggest that even if the Government has come up with an order that affects the growth of a language, it is the duty of the speakers of the language to stand up and defend their language and script.

Moving forward

While both arguments are solid, using both kinds of numerals simultaneously creates complications that are unique to the Wikimedia world. When both are used on the same project, such as Hindi Wikipedia (for content, article titles, and so on), hyperlinking and search are adversely affected, because the same title or number can be written in two different ways. There are many other complications arising out of the simultaneous use of both kinds of numerals. A decision based on community consensus is urgently needed to resolve what could potentially spiral into a much larger issue, given that Hindi Wikipedia is already a 1 lakh (100,000) article project.
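
To make the search and hyperlinking problem concrete, here is a small Python sketch. The function name `digit_key` is hypothetical, and this is not a description of how MediaWiki search or linking actually works. It uses the standard library's `unicodedata.decimal` to fold every Unicode decimal digit in a title to 0-9, so that two spellings of the same title produce the same lookup key.

```python
import unicodedata


def digit_key(title: str) -> str:
    """Return a lookup key with every Unicode decimal digit folded to 0-9.

    Illustrative only: a real project would still have to decide which
    form to display and which form to store.
    """
    folded = []
    for ch in title:
        value = unicodedata.decimal(ch, None)  # None for non-digit characters
        folded.append(str(value) if value is not None else ch)
    return "".join(folded)


devanagari_title = "२०१३ क्रिकेट"
arabic_title = "2013 क्रिकेट"

# As plain strings the two titles differ, so a link to one misses the other.
print(devanagari_title == arabic_title)
# After folding the digits, both titles map to the same key.
print(digit_key(devanagari_title) == digit_key(arabic_title))
```

Without a step like this, a reader who searches for a year in one numeral system will not find an article whose title uses the other, which is why settling on a single convention (or adding redirects for the alternate form) matters.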

The Bengali, Kannada, Assamese, Sanskrit, and a few other Wikipedias are showing that it is perfectly fine to use a language's own numerals everywhere. But a few other Indic language Wikipedias, such as Tamil, Telugu, and Malayalam, have gone with Arabic numerals. So technically both options are possible in the Wikimedia world. But the community needs to reach consensus and stick to one type of numeral.

A lot of discussion on this has taken place among Hindi Wikipedians, both on-wiki and off-wiki. One of the on-wiki discussions is at the following link.

http://hi.wikipedia.org/wiki/विकिपीडिया:देवनागरी_अंक

In the past, the community has not been able to reach consensus, but it is important that the Hindi community urgently agrees on one numeral system and moves forward.


Shiju Alex
Consultant, Indic language Initiatives, India Programs of WMF