WikiIndaba conference 2021/Program/Wikimedia Lexicography: From Word Recordings to Lexemes and Beyond
|ID : lexicography||Wikimedia Lexicography: From Word Recordings to Lexemes and Beyond|
|Speakers : Mahir Morshed, Mohammed Sadat Abdulai, Tochi Precious, Mohammed Awal Alhassan, Abdul-Rasheed Yussif, Sadik Shahadu, Mohammed Mustapha Aliyu||Time block: Saturday 6 November 2021||Start : 12:15|
|Location : Room C||Duration : 85 minutes|
Since 2018, Wikidata has also stored a new type of data: words, phrases and sentences, in many languages, described in many languages. This information is stored in new types of entities called Lexemes (L), Forms (F) and Senses (S). Lexicographical data will serve as the basis for Abstract Wikipedia's natural language generation capabilities. A few languages, including Dagbani, Hausa and Igbo, were selected as focus languages for the development of this new project. This workshop session will walk participants through editing the Lexicographical data namespace in Dagbani, Hausa and Igbo. Participants will also learn how to use the Spell4Wiki app to produce audio recordings of words in their language. We will walk them through installing the app, adding their language, and recording audio files for Wiktionary and Wikimedia Commons. Towards the end, we will cover how these lexemes can be prepared to make text generation for Abstract Wikipedia easier.
|Themes : Technology & Infrastructure|
|Tags : Abstract Wikipedia, Technology, Language, African Community|
|Notes : #WikiIndaba2021_lexicography|