WikiProject Language samples
Wikipedia exists in many languages, and each edition has articles about languages. We already have many high-quality Wikipedia articles about languages, where you can find information about the number of speakers and the vocabulary, stem and grammar of the language. But one very natural question about a language is often left unanswered: "How does this language sound?"
This project seeks to change this. Its goal is to add a short sample of text spoken by a native speaker to every article about a language. The first article of the Universal Declaration of Human Rights (UDHR) is an appropriate choice for this, since it has been translated into "all" languages of the world, is in the public domain, and is the right length to be included in a Wikipedia article.
This is an example of how this could look and sound for the Japanese language:
The project benefited from a LibriVox project that also aimed to create recordings of the Universal Declaration of Human Rights in over 50 languages. MichaelSchoenitzer imported those to Commons, edited the files (see below) and extracted the first article. Through a project on the German Wikipedia they were included in the articles there.
Now it is time to make this an international community project: Wikipedians all around the globe can record the first article (or even the whole document) in their mother tongue, and the Wikipedia communities can add the recordings to their language articles.
How to create a recording
Do you want to read the first article of the UDHR in your mother tongue? Or could you convince someone else to do so? Awesome.
First: get a microphone. Cheap microphones of course have lower sound quality, but if you follow the instructions below, even a cheap microphone will give reasonable results. If you live in a country with a local chapter, you can ask whether they can help you get a microphone. In Germany, for example, you can borrow a microphone from Wikimedia Deutschland.
You can get the translation of the UDHR in your language at OHCHR.org. Find the first article, copy it into an editor or word processor, and format it so that you can read it comfortably. Before starting the recording, read it aloud two or three times and drink some water. If you misread a word, simply read the word or group of words again and cut out the wrong version later.
Very important: when recording, make sure you also record at least 5 seconds of silence at the beginning or end of the recording. This is needed for editing. If you have never made a recording before, we recommend the free software Audacity. If you have a passive microphone (without a power supply), activate the microphone boost and set the volume control to maximum. For an active microphone, make sure the input level is not so high that the signal reaches the maximum level while recording.
After the recording, select the part with the silence you recorded, go to Effect -> Noise Reduction and click the Get Noise Profile button. Then select the whole recording (Edit -> Select -> All or the hotkey CTRL + A), go to Effect -> Noise Reduction again and click the OK button. After that you can remove the silence and any misread sections by selecting them and pressing ⌦ Del. Then apply the filters Compressor, Leveller and Normalizer from the Effect menu, in this order. The default settings should be fine. When you are done, go to File -> Export audio, choose Ogg Vorbis as the format and save the file.
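Two of the points above, leaving enough silence and keeping the input level below the maximum, can also be checked with a small script before editing. The following is only a sketch in Python and not part of the project's workflow. It assumes the raw recording was additionally exported as a 16-bit mono WAV file (the Python standard library cannot read Ogg Vorbis). The function names and the quietness threshold of 500 are made up for illustration and may need adjusting for your microphone.

```python
import array
import sys
import wave

FULL_SCALE = 32767  # largest positive value of a 16-bit PCM sample


def read_mono_16bit(path):
    """Return (samples, sample_rate) for a 16-bit mono WAV file."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2 or wav.getnchannels() != 1:
            raise ValueError("this sketch only handles 16-bit mono WAV files")
        samples = array.array("h")
        samples.frombytes(wav.readframes(wav.getnframes()))
        if sys.byteorder == "big":
            samples.byteswap()  # WAV data is little-endian
        return samples, wav.getframerate()


def is_clipped(samples, limit=FULL_SCALE):
    """True if any sample reaches full scale, i.e. the input level was too high."""
    return any(abs(s) >= limit for s in samples)


def silence_seconds(samples, sample_rate, threshold=500):
    """Length in seconds of the quiet run at the start or end, whichever is longer.

    threshold is an assumed amplitude below which a sample counts as silence.
    """
    def quiet_run(seq):
        count = 0
        for s in seq:
            if abs(s) > threshold:
                break
            count += 1
        return count

    longest = max(quiet_run(samples), quiet_run(reversed(samples)))
    return longest / sample_rate
```

With a hypothetical file name, `samples, rate = read_mono_16bit("udhr_article1.wav")` loads the recording. The recording is usable for the noise reduction step if `silence_seconds(samples, rate)` is at least 5 and `is_clipped(samples)` is false. Otherwise it is usually easier to record again than to repair the file.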
Upload your recording to Wikimedia Commons, put it in the category Audiorecordings of Article 1 of the Universal Declaration of Human Rights and add it to the list below.
More tips for high-quality audio samples can be found in: A short guide to the recording of high-quality audio samples for Wiktionary
So far we have recordings in the following languages:
|Language||Full recording||Recording of Article 1||de-Wikipedia||your Wikipedia…|
|Chinese (which!?)||Done, 2 Versions||Done, 2 Versions|
|English||Done, 2 Versions||Done, 2 Versions|
|French||Done, 3 Versions||Done, 3 Versions||Done|
|Greek / Modern Greek ?||Done, 2 Versions||Done|
|Hebrew||Done, 2 Versions||Done, 2 Versions||Done|
|Indonesian||Done, 2 Versions||Done, 2 Versions||no|
|Italian||Done, 2 Versions||Done, 2 Versions||Done|
|Javanese (Semarang)||Done||ToDo||no article|
|Latin||Done, 2 Versions||ToDo||no|
|Dutch||Done, 2 Versions||Done, 2 Versions||Done|
|Polish||Done, 2 Versions||Done, 2 Versions||Done|
|Portuguese||Done, 2 Versions||Done, 2 Versions||Done|
|Romanian||Done, very poor quality||ToDo|
|Swedish||Done, 2 Versions||Done||Done|
|Sesotho / South Sotho||-||Done||Done|
|Spanish||Done, 2 Versions||Done, 2 Versions||Done|
|Tamil||Done, 2 Versions||Done, 2 Versions||no|
Open Questions and Tasks
- Should we also make recordings in different dialects?
- How do we link the audio files on Wikidata?
- How do we reach native speakers of small languages?
- Design a logo for this project