Wikispeech 2019

A tool to help people contribute open speech data

Start:	2019-09-01
End:	2021-03-31
Team:	Wikimedia Sverige, STTS – speech technology services and KTH Royal Institute of Technology – Speech, Music and Hearing (part of School of Computer Science and Communication)
Management:	André Costa (Project Manager), John Andersson, Sebastian Berlin

In Wikispeech 2019 we will develop a tool that lets you contribute speech data to improve the Wikispeech text-to-speech (TTS) system and other speech technology applications. The data may simply be recordings of your voice, which can be used to create new TTS voices or to improve automatic speech recognition (ASR) so that it works for a wider variety of people. You will also be able to enhance the recordings by adding further information that can be helpful in development or research.

Many speech technology applications require speech data, and usually lots of it. While many commercial actors have their own data, it is so costly to collect that sharing it is not viable for them. In the same vein, data is mostly collected for a few languages spoken in parts of the world where people can afford the end product. This means that many people never get access to these kinds of resources, and some of them may be the ones who need them most.

About

The tool will be web-based and easy to use, so that as many people as possible can contribute. The easiest way to contribute will be to record your own voice. Different types of speech are useful for different applications. This means that you can contribute even if you don't have special recording equipment or if a certain language isn't your native tongue. You will also be able to validate recordings made by other contributors. This will help filter out recordings that are not suitable, such as spam.

The tool will also support enhancing recorded speech with additional data, such as phonetic transcriptions. This will make the recordings useful for a wide range of applications. There will also be an option to generate manuscripts of recording prompts, so that you can collect data for a specific task efficiently.

We have been in contact with Mozilla Common Voice and Lingua Libre, and hope to find ways to cooperate with them during the project.

Media

Wikispeech presentation at Wikimania 2019.
Slides for the presentation on Wikispeech given at Wikimania 2019.