Africa Growth Pilot/Online self-paced course/Module 2/Recording pronunciations

From Meta, a Wikimedia project coordination wiki

One more thing I want to demonstrate before we move to more specialized things is the possibility of contributing audio. So we talked about texts, we talked about photos and videos, but there is a whole other under-populated area of Commons, which is pronunciations, especially, by the way, for smaller languages.

It is extremely useful to systematically collect pronunciations of words in all the languages of the world. Why is that useful? Because pronunciation is tricky for learners, for non-native speakers, for translators. Say I'm translating a book by a South African author, and it mentions a name in the Xhosa language, which has a click that I don't know how to produce, and I want to somehow figure out how to transliterate this name into my native alphabet.

We don't have clicks, but somehow I have to figure out a way to write that in another writing system. If I have a recording of what that sounds like, that can really help, right? That was just a quick example, but actually we can use such recordings also for learning apps, things like Duolingo, if you know that app, or other kinds of apps that teach you a language; for games; for text-to-speech software that helps people with disabilities or that just synthesizes spoken bits of text. These days, all kinds of generative AIs can use them as well.

And even more fundamentally, these recordings literally preserve the sounds of the language. As you know, some languages, and especially some dialects, are at risk of extinction. Some of them are only spoken by older people, and if we don't get them recorded, we may not have access to what they sounded like in 20 or 30 or 50 years.

So all of these are excellent reasons for systematically recording words in all the languages of the world. And the way we do that in Wikimedia is actually a very convenient workflow that involves generating a list of words using the Wikidata Query Service. You don't have to already know the Wikidata Query Service. There is an example query linked here that you can use, and all you would need to do is change the language code. You take the list of words that this query gave you, and you feed that to a tool called Lingua Libre that is linked here.
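To make the "change the language code" step concrete, here is a minimal sketch in Python of what such a word-list query could look like. It is not the query linked from the course. It assumes the language is selected by its ISO 639-1 code (Wikidata property P218), and the function names and the default limit of 200 are made up for this illustration. The code only builds the query text and a request URL; it does not contact the query service.

```python
from urllib.parse import urlencode

# Public SPARQL endpoint of the Wikidata Query Service.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def build_word_list_query(language_code: str, limit: int = 200) -> str:
    """Return a SPARQL query listing lemmas of lexemes in one language.

    Assumption: the language is matched through its ISO 639-1 code (P218),
    for example "xh" for Xhosa. The query linked from the course may
    select the language differently.
    """
    # Reject anything that is not a plain alphabetic code, so that the
    # value can be placed inside the query text safely.
    if not language_code.isalpha():
        raise ValueError(f"unexpected language code: {language_code!r}")
    return f"""
SELECT ?lexeme ?lemma WHERE {{
  ?lexeme dct:language ?language ;
          wikibase:lemma ?lemma .
  ?language wdt:P218 "{language_code.lower()}" .
}}
LIMIT {int(limit)}
""".strip()


def build_query_url(language_code: str, limit: int = 200) -> str:
    """Return a GET URL that asks the query service for JSON results."""
    params = {
        "query": build_word_list_query(language_code, limit),
        "format": "json",
    }
    return f"{WDQS_ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    # Changing "xh" to another code is the only edit needed per language.
    print(build_word_list_query("xh"))
```

The printed query can be pasted into the Wikidata Query Service in the browser, and the lemmas it returns are the kind of word list you would then feed to Lingua Libre.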

Lingua Libre is a little site that runs in your browser and is like a recording studio. It shows you the word on the screen, and all you need to do is sit back and relax and literally just pronounce the word that you are shown. You are shown a word, and you, as a native speaker of this language, just speak it. And then it shows you another word and another word and another word. And you can sit and record, say, 200 words in ten minutes easily. And the Lingua Libre software will save each word in a file and convert it to the right format and upload it on your behalf to Commons. Each such recording lives on Wikimedia Commons, and Lingua Libre will link that recording to a Wikidata entity for that word. And I'm going to demonstrate what that looks like by going on Wikidata.

Don't worry if you don't know much about Wikidata yet. There are tutorials available if that's interesting for you. I myself recorded some tutorials on how to use Wikidata and how to use the Wikidata Query Service, and we can share links to them. Let's go to Wikidata and look at a word, the English word "let". It is documented on Wikidata in something called a lexeme. A lexeme is an entity type on Wikidata that describes a lexical unit, a unit of language. And this entity page here has a lot of properties and values. This is an example, by the way, of structured data: Wikidata knows that this is an irregular verb, and it has some usage examples for this particular word, the word "let", and it knows that it's derived from Middle English and all kinds of things like that.

But among other things it has pronunciation audio. Here, you see this? Pronunciation audio for this form of the word. I hope that was audible. That was a recording of an American woman saying the word "let". And it even says here that this is American English, right? Because it would sound different in South African English. So you can see that this is a file name. There's a file name here: 'en-us-let.ogg'. If we click on that, we are actually taken away from Wikidata onto the Wikimedia Commons, where you can see that this is indeed a sound file that's less than a second long. And it's just someone pronouncing this word.
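As a rough sketch of how the pieces connect, the Python below pulls pronunciation audio file names out of lexeme data and turns each one into Commons links. The sample dictionary is hand-written and heavily trimmed for illustration; it only mimics the general shape of a lexeme's JSON (forms, each with claims keyed by property) and is not a copy of the real entry for "let". The function names are also made up for this example. P443 is the Wikidata property for pronunciation audio, and Special:FilePath is the Commons page that redirects to the media file itself.

```python
from urllib.parse import quote

COMMONS_BASE = "https://commons.wikimedia.org/wiki/"
PRONUNCIATION_AUDIO = "P443"  # Wikidata property: pronunciation audio

# Hand-written, trimmed sample in the shape of lexeme JSON (illustrative only).
SAMPLE_LEXEME = {
    "forms": [
        {
            "representations": {"en": {"language": "en", "value": "let"}},
            "claims": {
                "P443": [
                    {
                        "mainsnak": {
                            "snaktype": "value",
                            "property": "P443",
                            "datavalue": {"value": "en-us-let.ogg", "type": "string"},
                        }
                    }
                ]
            },
        }
    ]
}


def pronunciation_files(lexeme: dict) -> list[str]:
    """Collect the audio file names attached to any form of the lexeme."""
    files = []
    for form in lexeme.get("forms", []):
        for statement in form.get("claims", {}).get(PRONUNCIATION_AUDIO, []):
            snak = statement.get("mainsnak", {})
            # Skip "no value" / "unknown value" statements, which carry no file.
            if snak.get("snaktype") == "value":
                files.append(snak["datavalue"]["value"])
    return files


def commons_file_page(file_name: str) -> str:
    """URL of the description page on Commons (license, author, date)."""
    return COMMONS_BASE + "File:" + quote(file_name.replace(" ", "_"))


def commons_file_url(file_name: str) -> str:
    """URL that redirects to the audio file itself."""
    return COMMONS_BASE + "Special:FilePath/" + quote(file_name.replace(" ", "_"))


if __name__ == "__main__":
    for name in pronunciation_files(SAMPLE_LEXEME):
        print(commons_file_page(name))
        print(commons_file_url(name))
```

Clicking the file name on the lexeme page does the same thing by hand: it takes you from the Wikidata statement to the file's page on Commons.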

But the amazing thing is you don't have to go and record hundreds of tiny little files and then upload them one by one and specify the license and the date and everything I mentioned earlier. You don't have to do that one by one.

There is an automated workflow that I'm not going to demonstrate in full today, because I have already recorded a video tutorial demonstrating the whole thing for you. It is linked here at the bottom of this slide: a tutorial on how to record pronunciations.