Celtic Knot Conference 2020/Submissions/Search Support for Minority Languages
- Search Support for Minority Languages
- Type of submission
- short presentation + Q&A session
- Author of the submission
- Trey Jones / TJones (WMF)
- Language of presentation
- E-mail address
- Country of origin
- Wikimedia Foundation Search Platform Team
- Personal homepage or blog
- My posts on Wikimedia Foundation blogs—new blog, old blog, older blog
- A brief overview of language-specific processing in search (tokenization, normalization, stemming, and stop words), using primarily English and Irish as examples. Additional examples come from languages that pose their own challenges: segmentation in Chinese, transliteration in Serbian, complex orthography in Khmer, and a lack of software support for Mirandese.
- What will attendees take away from this session?
- Attendees will have a better understanding of the language-specific processing that goes into search, hopefully with an eye towards collaboration with the WMF Search Platform Team to improve search in the languages they use.
- Theme of session
- Language technology
- Slides or further information
- Slides and notes are on Commons.
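The processing steps named in the abstract above (tokenization, normalization, stop words, and stemming) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the stop word lists, the `toy_stem` suffix stripper, and the `analyze` function are hypothetical and are not the analysis chain used by the WMF Search Platform Team. Accent folding in particular is a simplification, since in Irish the fada distinguishes otherwise identical words (for example "fear" and "féar").

```python
import re
import unicodedata

# Toy stop word lists, for illustration only (hypothetical, not production lists).
STOP_WORDS = {
    "en": {"the", "of", "and", "a", "in"},
    "ga": {"an", "na", "agus", "i", "ar"},
}


def tokenize(text):
    """Split text into word tokens (runs of letters, digits, or underscores)."""
    return re.findall(r"\w+", text)


def normalize(token):
    """Lowercase the token and strip combining diacritics (accent folding)."""
    decomposed = unicodedata.normalize("NFD", token.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def toy_stem(token):
    """Strip a few English suffixes; a crude stand-in for a real stemmer."""
    for suffix in ("ing", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def analyze(text, lang):
    """Tokenize, normalize, drop stop words, then stem (English only here)."""
    tokens = [normalize(t) for t in tokenize(text)]
    tokens = [t for t in tokens if t not in STOP_WORDS.get(lang, set())]
    if lang == "en":
        # No Irish stemmer is sketched here; Irish tokens pass through unstemmed.
        tokens = [toy_stem(t) for t in tokens]
    return tokens


print(analyze("Searching the languages of Ireland", "en"))
# ['search', 'languag', 'ireland']
print(analyze("Tír na nÓg", "ga"))
# ['tir', 'nog']
```

The same function would be applied to both indexed text and queries, so that "Searching" and "searches" reduce to the same token. Languages such as Chinese or Khmer would need a different `tokenize` step, because splitting on non-word characters does not segment their text into words.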
If you are interested in attending this session, please sign with your username below. This will help reviewers decide which sessions are of high interest. Sign with a hash and four tildes (# ~~~~).
- —M@sssly✉ 13:41, 25 June 2020 (UTC)
- --Psubhashish (talk) 04:30, 3 July 2020 (UTC)
- ----Maria zaos (talk) 11:01, 10 July 2020 (UTC)
- VIGNERON * discut. 12:46, 10 July 2020 (UTC)
Friendly space: Because we want to provide a great experience for all participants and foster collaboration, please keep these few guidelines in mind: let’s be respectful to each other, encourage participation and a positive atmosphere, be mindful of how our actions impact others, and feel free to ask for help at any time. See the Friendly Space Policy in full.