Research:Oral Citations

This page documents a completed research project.

The Oral Citations Project is a strategic research project funded by a Wikimedia Foundation grant to help overcome a lack of published material in emerging languages on Wikipedia. It was undertaken by Wikimedia Foundation Advisory Board member Achal Prabhala as a short-term fellowship.


Imagine a world in which every single person on the planet is given free access to the sum of all human knowledge.

To many within the Wikimedia movement, this idea is the guiding ambition that drives us.

The problem with the sum of human knowledge, however, is that it is far greater than the sum of printed knowledge.

The idea behind this project, People are Knowledge, is a simple one. Wikipedia privileges printed knowledge (books, journals, magazines, newspapers and more) as authentic sources of citable material. This is understandably so, for a lot of time and care goes into producing this kind of printed material, and restricting citation sources makes the enterprise workable. But books – and printed words generally – are closely correlated to rich economies: Europe, North America, and a small section of Asia.

In India and South Africa, for instance (to take just two countries in the rest of the world), the number of books produced per year is nowhere close to, say, the number of books produced in the UK. What this means for indigenous language Wikipedias from India and South Africa is that there is very little citable, printed material to rely on in those languages; in turn, it means that it is very difficult for any of those languages to grow on Wikipedia. (There is a related problem: writing this local knowledge on English Wikipedia is a task similarly hampered by a lack of good printed sources).

It is undoubtedly true that the sum of human knowledge is greater than the sum of printed knowledge even in Europe, a continent with a tradition of printing books that stretches back 550 years. However, the sum of human knowledge is far greater than the sum of printed knowledge in societies like India and South Africa. There are significant media markets for Indian languages within and outside the country and yet there is little scholarly publishing in any language other than English. Most South African languages, with the exception of English and Afrikaans, have had a primarily oral existence, and a relatively recent – and nascent – publishing tradition.

Consider the total production of books in all languages from the UK, South Africa and India in 2005:

UK: 161,000 books / 60 million people
South Africa: 6,100 books / 48 million people
India: 97,000 books / 1,100 million people

If we measure the number of people per book produced in 2005 in each country, the comparison is starker:

UK: 1 book per 372 people
South Africa: 1 book per 7,869 people
India: 1 book per 11,371 people

(Sources: for the UK, figures aggregated from Nielsen BookScan and The Publishers' Association UK; for South Africa, the Dept. of Arts & Culture report on Factors Affecting the Cost of Books in South Africa; for India, estimates made by the Federation of Indian Publishers. Population figures from various sources via Wikipedia.)
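
The per-person comparison is simple division, and a minimal sketch of it is given here using only the rounded book and population figures quoted above. The function name `people_per_book` and the `FIGURES` table are illustrative and not part of the original project. Because the inputs are rounded, the results need not match the quoted ratios exactly: the India figure in particular comes out slightly lower, presumably because the original calculation used an unrounded population estimate.

```python
# Books published in 2005 and approximate population, as quoted in the text.
# The populations are the rounded values given above, not exact census figures.
FIGURES = {
    "UK": (161_000, 60_000_000),
    "South Africa": (6_100, 48_000_000),
    "India": (97_000, 1_100_000_000),
}


def people_per_book(books: int, population: int) -> float:
    """Return how many people there are for each book published."""
    return population / books


for country, (books, population) in FIGURES.items():
    ratio = people_per_book(books, population)
    print(f"{country}: 1 book per {ratio:,.0f} people")

# Prints:
# UK: 1 book per 373 people
# South Africa: 1 book per 7,869 people
# India: 1 book per 11,340 people
```

On these rounded inputs the UK produced roughly 21 times as many books per person as South Africa and roughly 30 times as many as India.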

As a result of this disparity, everyday, common knowledge – things that are known, observed and performed by millions of people – cannot enter Wikipedia as units of fact because it hasn't been written down in a reliably published source. This means that small-language Wikipedias in countries like India and South Africa lose out on opportunities for growth, and the Wikimedia movement as a whole loses out on the potential expansion of scope in every language.


The project was planned with three Wikipedia languages in mind: Malayalam, Hindi and Sepedi (which is formally listed under the language category Northern Sotho/Sesotho sa Leboa).

These languages were chosen partly because of the interest of individual Wikipedians collaborating on the project, and partly to work through a different situation with each language. Hindi (with ~250 million speakers), Malayalam (~40 million speakers) and Sepedi/Northern Sotho (~5 million speakers) represented an opportunity to study three different language contexts on Wikipedia: one, an official language of India with enormous global media markets; two, a regional language with a strong diaspora and robust media market; and three, a regional language with a very small media market.

Collaborators were finalised: Shiju Alex, Mayur, Mohau Monaledi and Achal Prabhala, with additional help from Vijayakumar Blathur. We had decided early on to film sections of the project research, with Priya Sen directing the film and handling India-related camera work, and Zen Marie handling the film in South Africa.

In January 2011, research/travel schedules were drawn up, and in February 2011, project work began with a meeting in Delhi.


Delhi (India)

Participants at the first Hindi Wikipedia meetup

On February 12, 2011, Shiju Alex, User Mayur, Ravikant and Achal Prabhala met at the Centre for the Study of Developing Societies (CSDS) in Delhi. We used the meeting to hold the first ever Hindi Wikipedia meetup, which went off very successfully. We were able to discuss the project for several hours, as well as hold an in-depth interview with Ravikant, who has been working on and thinking about the Hindi language as part of his academic career for some time.

Urvashi Butalia in conversation
Ravikant, from CSDS
User Mayur, conducting an audio interview for Hindi Wikipedia

From April 20 to April 25, Mayur and Achal Prabhala worked together on creating potential oral citations for articles in Hindi around two games played in North India: Sur and Gullidanda (also spelled Gillidanda). Mayur assembled sources he wanted to interview, both in person and on the phone. We also met and interviewed Urvashi Butalia, a pioneering feminist publisher and writer, whose book The Other Side of Silence constitutes a landmark in oral history in the Indian subcontinent. We were also able to spend a useful day at the CSDS library, discussing their collection of Hindi scholarly material, especially in comparison to their English collection – the objective being to see how Hindi scholarly publishing stacks up against English at a major academic institution in the heart of Hindi-speaking India.

One really interesting point that came up during the discussion with Urvashi Butalia, which you can see at 38:35 in the film, is her experience from feminist publishing, which is that often, women who held knowledge that was important didn't themselves think it was all that important. This provided an interesting angle to what we experienced ourselves, the lack of confidence – or uncertainty – among some people as to whether the orally transmitted knowledge they held was noteworthy.

Johannesburg (South Africa)

Isabel Hofmeyr, author of "We Spend Our Years As A Tale That Is Told"

Between February 26 and March 20, Achal Prabhala spent time in Johannesburg, primarily discussing the project with Mohau Monaledi. In the course of work, we were lucky to spend a day with Isabel Hofmeyr from the University of the Witwatersrand, whose book We Spend Our Years As A Tale That Is Told, published in 1994, continues to be a key text in understanding how an oral culture travels over time. At the University of the Witwatersrand, we spent time with Margaret Northey, Senior Librarian at the Cullen Library, where the university's 'Africana' holdings are kept.

Wits University
Nhlanhla Mabaso from the Wikimedia Foundation Advisory Board

Later in the month, we spent time with Nhlanhla Mabaso, from the senior management at the University of the Witwatersrand as well as the Wikimedia Foundation Advisory Board, and Charlene Foster, from the Wikimedia South Africa chapter, to gain further perspective on Wikipedia and the Wikimedia movement in South Africa. Jon Soske, a scholar of Indo-African history with a special interest in online archives through his work with the South African History Online project, provided several useful ways of understanding open access interviews as a radical form of scholarship.

Charlene Foster from the Wikimedia South Africa board
Jon Soske speaking on open access interview data

Isabel Hofmeyr made a very interesting point, which you can see at 42:16 in the film. The manner in which oral culture was being recorded in print in South Africa in the 19th century – and thereby cleansed and changed as well – was analogous to a very similar movement in Europe. As the Wikipedia article on Little Red Riding Hood notes, prior to the Brothers Grimm version of the story – the version we know today, which was first recorded in print in the 19th century – the story of Red Riding Hood was considerably more bloody and bawdy. In that sense, the idea of oral culture as being something exclusively valuable to societies with smaller publishing volumes, such as India and South Africa, is disproved: it would seem that there is a rich, global tradition of oral culture that is worth thinking about, not all of which has survived the advent of print culture.

This is also a key aspect of our project. While the lack of published material in the south is one symptom of the problem with citing knowledge to print, the solution – alternative means of citation, whatever they may be – would be poorer if also restricted solely to the south. Audio archives are a well-established tradition for capturing elements of our world not adequately documented in print. Outside of the institutional space too, there may be an oral culture worth noting. Obviously, not everything noteworthy in the world (even in the Anglo-European world) is always available in published form.

Limpopo (South Africa)

Mohau Monaledi, Northern Sotho Wikipedia editor and board member of Wikimedia South Africa

In March 2011, we visited a group of people convened by Mohau Monaledi in Ga-Sebotlane in Limpopo, a small village deep in the rural heartland, and about a six-hour drive from Johannesburg. The purpose of the trip was to conduct audio interviews around potential articles that Mohau Monaledi was interested in assembling, on Morula, a country liquor made from fruit of the same name, and on two games played in the province, Kgati and Tshere-tshere.

Morula being made from fermented fruit
Kgati, a game that involves skipping through a long rope
Tshere-tshere, a game that is not unlike Hopscotch

The interviews were wide-ranging, and we were grateful for the excellent translation of Ashley Motlatjo Mabeba, a student of the University of the Witwatersrand School of Art, from Sepedi to English and vice versa. One of the more interesting things that happened at Ga-Sebotlane was that we were able to conduct enough audio interviews to record differences in perspective. For instance, when talking about one of the games (Tshere-tshere), one correspondent, Elizabeth Morokhu, said very clearly that young people were not playing this game any more. (You can see her comments in the film at 40:38.) Subsequently, another correspondent, Sandra Moremi, clarified that young people were indeed still playing the game, with slightly amended rules. (You can see these comments in the film at 41:03.) This was very encouraging, as it was clear that the audio interviews we had assembled in Ga-Sebotlane spanned a range of opinions, thus promising a more comprehensive article that could accumulate real differences of opinion, much as print sources would for a good article.

A note on the nomenclature employed: the language you will hear in most of the interviews conducted in Limpopo and Johannesburg is Sepedi. Sepedi is one of a group of related languages classified as Northern Sotho, which is one of the official languages of South Africa. Within the languages of the Northern Sotho group, the name of the group is Sesotho sa Leboa. Wikipedia helpfully classifies Sesotho sa Leboa as an autoglottonym, which is an awesome word that none of us previously knew the meaning of. In the course of this project, the terms Sepedi, Northern Sotho and Sesotho sa Leboa will be used, and they should all be treated as roughly interchangeable with each other.

Bangalore (India)

Nishant Shah, from the Centre for Internet and Society
The Centre for Internet and Society, Bangalore

Through May and June 2011, we spent time in Bangalore. We met with Nishant Shah at the Centre for Internet and Society. CIS has been a long-time supporter of the Wikimedia movement and Wikipedian activity; the CIS office is where the fortnightly meetups in Bangalore are held, and it is also the legal address of the Wikimedia India chapter. (CIS also provided additional funding to this project, so that the film that came out of it could be completed.) Nishant spoke from the perspective of cultural studies, and from recent experience compiling and editing CPOV: A Wikipedia Reader.

We had a very interesting interview with Geetha Narayanan, founder/director of the Srishti School of Art, Design and Technology. Geetha's simple and forceful point – "Coming from a culture where so little is written down, do we then say we know nothing?" – can be seen at 02:00 in the film.

Shiju Alex conducting an audio interview from his home
Geetha Narayanan, Srishti

Shiju Alex and Achal Prabhala also spent time together recording audio interviews in Kerala, from Bangalore, using basic Internet telephony.

Nishant's central point, which you can see at 09:51 in the film, is that Wikipedia doesn't necessarily take advantage of what he calls "internet objects", by which he means things that circulate within established systems of trust on the internet. This opens up the possibility that sources in general – across Wikipedia, and across Wikimedia projects – could perhaps be re-investigated, and a central question that might motivate this exercise would be:

What elements of the world's knowledge are we missing out on, given the status quo?

Kannur (India)

School children playing Dappa Kali in Kannur
Kannur Railway Station
Neeliyar Bhagavathi Theyyam
Shiju Alex recording an audio interview with the priest at the site where the Theyyam was performed
Vijayakumar Blathur, Malayalam Wikipedian from Kannur
At the Kannur University Central Library

In May 2011, Shiju Alex and Achal Prabhala travelled to Kannur in North Kerala to meet with Vijayakumar Blathur and others of his acquaintance who were interested in the oral citations project. Our time in Kannur proved to be invaluable.

The first set of interviews we recorded (in Malayalam) was around a potential article on a Theyyam called Neeliyar Bhagavathi. Theyyam is an interesting temple ritual: while the form itself is quite well-known and documented, in actuality it consists of hundreds of individual Theyyams, each of which has its own name, costume, performance and history. The particular Theyyam that we were interested in – Neeliyar Bhagavathi – is performed in a small ritual location in a forest near Kannur, and we were lucky to be present when one was about to be performed. In the course of the audio interviews, we met and interviewed an onlooker of the performance from the village, the priest at the site of the performance, and a scholar of folklore from the area.

Again, an interesting range of perspectives crept in. The priest suggested an elaborate story that is the basis for the location where the Theyyam is performed, while the folklore scholar suggested that in fact the myths around the location were just that – myth – and not connected to the Theyyam itself, whose words do not signify anything about the location at all. You can view these differences of opinion in the film at 34:20 and at 36:02.

We visited a schoolteacher who is passionate about traditional games, and runs an annual tournament of folk games. He arranged a group of students to show us how they play Dappa Kali, a game whose objective is to rearrange a set of broken tiles while escaping a ball that is being thrown around. (You can see Dappa Kali in action at 06:31 in the film). The ball used in the game, interestingly, is made entirely out of the leaves of coconut trees.

We also travelled to the central library of Kannur University which is located outside their campus, in the centre of town. The librarian in charge showed us the collection of Malayalam books (about 20% of the university's collection as a whole), echoing a pattern similarly observed in Delhi and Johannesburg, where scholarly books in native languages (be it Malayalam, Hindi or Sepedi) form a tiny fraction of a library's collection as a whole, the majority of the works being in English.

Audio & Images

Image and audio files captured during this exercise are archived on Commons under the category Oral Citations.

The images uploaded to Commons are screenshots of video footage, and are not always of the highest quality.

The audio files recorded are all interviews, and they are of three kinds.

The first category consists of interviews with people whose intellectual input we thought was valuable to the project.
The second category consists of interviews that were filmed on location, at site, and often accompanied by a demonstration of the subject itself – the bulk of these interviews are represented throughout the film. These are audio interviews that will eventually be used as citations.
The third category also consists of interviews which will be used as citations, but were conducted by the individual Wikipedians involved (Shiju Alex, Mayur, Mohau Monaledi) from their desks at home. The point of the third category of interviews is this: not everyone can afford to take time off and travel to far-flung locations to record audio interviews with people on-site, merely to use as a citation in a potential Wikipedia article. We were clear that it was important to be able to construct a credible audio interview with a reliable source over the phone, taking advantage of the widespread use of mobile phones in India and South Africa, and the relative ease with which a conversation can be recorded, either on a mobile phone, or through software if using Internet telephony.

Oral citations & multilingual transcripts

All audio files used as oral citations come with complete transcripts, both in the language recorded as well as translations in English for easy interoperability between language projects on Wikipedia (except for the interviews conducted in South Africa, as we had simultaneous translation in the audio between Sepedi and English, so the transcripts are only in English).

A full list of audio files used as citations, along with their transcripts, is included below:





The film was primarily directed and edited by Priya Sen, with additional assistance from Zen Marie. A final version of the film was completed in July 2011.

(The English subtitle track for the film on Commons covers the full film, i.e. it extends to all speech, whether in English or not. This subtitle track is a distinct .srt file, and can therefore be edited and translated.)

(The English subtitle track for the film on Vimeo is partial, i.e. it extends only to non-English speech. The subtitles here are also burned in to the film, and therefore not editable.)

If you would like to download or use a DVD or higher resolution file, or want anything else that isn't listed here, please write to this address explaining what you need.

Meetings & Events

Please help: if you organised a screening of the film or a discussion of the project that isn't included here, just add it in.

July 2011

  • A film screening and discussion around the oral citations project was held at the 35th Bangalore Meetup in July, 2011.

August 2011

  • People are Knowledge was screened and discussed @ Wikimania 2011 in Haifa at 12 pm, Aug 5 (Friday) @ the Cinematheque. Here's the schedule. Watch the video of the presentation and Q&A in Haifa, courtesy Wikimedia Israel, here.
  • A film screening and discussion of authorities of knowledge @ SECT:REWIRED – a critical theory workshop sponsored by the University of California Humanities Research Institute – August 8, Monday, East-West Centre, University of Hawai'i, Manoa.
  • A discussion around the oral citations project was held at the 12th Mumbai Meetup on August 20, 2011.
  • The video on Oral Citations was shown and a discussion was held at the 16th WikiPuneri meetup (Pune, Maharashtra, India), especially bringing out the relevance of such citation methods to the recording of rural heritage. A point emerged that Wikipedia would need to partially compromise on the principle of No Original Research in innovative ways with due consensus. AshLin 07:24, 31 August 2011 (UTC)

September 2011

  • People are Knowledge was screened on Wednesday, 21 September at Social Media Week, Berlin. The event was hosted by MTV Networks and more information is available here.

Articles & Discussions


Discussion regarding Oral Citations in Malayalam Wikipedia

Articles created

Sesotho sa Leboa

Articles created


Discussion regarding Oral Citations in Hindi Wikipedia

Articles created


Discussion regarding Oral Citations in English Wikipedia

Articles created (based on the Hindi/Malayalam/Sepedi articles)

In the news

"People are Knowledge", the film that arose from the project, has been viewed thousands of times on Vimeo and Commons, and there have been innumerable conversations on social media, thanks in large part to an article that appeared in the New York Times (below). Some of those social media conversations are archived here.

The oral citations project received vast attention from all corners of the globe. Here are some highlights of articles and discussions generated:

(Please help: if we missed something that ought to be here, just add it in.)

Print media

Not just the written word – Bangalore Mirror, July 24, 2011

Citing challenges for Wikipedia contributors – The Hindu, August 6, 2011

When Knowledge Isn't Written, Does It Still Count? – New York Times, August 7, 2011

Oral citations to be part of Wikipedia entries – The Economic Times/ The Times of India, August 25, 2011 (caution: misleading title)

Blogs & other Internet media

Signpost: Unsourced in India

CIS, India (a co-sponsor of the project): People are Knowledge

Slaw, Canada: Oral Citations, a Wikimedia project

Wall Street Journal, India

Gerard's blog: Wikimedia and original research

Dror's blog: First Wikimania report

Wikimedia Czech Republic: Zavede Wikipedie ústní citace?

ICT4D: Learning cycling and the persistent illusion that all knowledge can be accessed online

Black Studies Blog: When black poetry isn't written and published, does it still count?

Cawa, France: The flow of eliteracy

Wikilovesbieb, Netherlands: Mensen zijn kennis: Wikipedia experimenteert met orale bronnen

The Listener, New Zealand: Wikipedia and the Cool Problem

Bintulu, Indonesia: People are Knowledge

Publishing Perspectives Blog: When Google runs out of data to exploit, what's next?

Metafilter: Wikipedia Oral Citations

Peasant Muse: Wikipedia, Twitter and Mobility

IT Espresso, France: Wikipedia perd des contributeurs, mais se bat pour rebondir

Simply Coffee Blog: People are Knowledge

The Knowledge Lens (Northwestern U): Where is your knowledge?

Cottage Labs: Can knowledge extend beyond the written?

Proto-Knowledge: Changing the epistemology of Wikipedia

BackReaction: Was there a man on the moon? Are you sure?