- Submission no. 054
- Title of the submission
- Wiki resources in semantic technologies: Tunisian experience
- Type of submission (lecture, panel, tutorial/workshop, roundtable discussion, lightning talk, birds of a feather discussion)
- Author of the submission
- Mohamed Ben Aouicha and Mohamed Ali Hadj Taieb
- Language of presentation
- E-mail address
- Country of origin
- Affiliation, if any (organisation, company etc.)
- Faculty of Sciences of Sfax, University of Sfax
- Personal homepage or blog
- Abstract (up to 300 words to describe your proposal)
- Semantic Relatedness (SR) quantifies any type of relationship between two concepts or two words. This quantification relies on semantic information extracted from large corpora or from structured resources such as knowledge bases. Corpora can be used to find concepts or words that co-occur, since co-occurrence can indicate a relationship between them. This information is very useful for determining the SR between concepts.
Wikipedia is frequently used as a resource for extracting co-occurrences between words. In our proposed approach, we keep only the words of this encyclopedia whose part of speech is noun, verb, adverb or adjective. We also use Wiktionary to determine which words belong to these categories.
The process first filters the articles from Wikipedia; we then design and develop an application that provides a set of services offering statistics on co-occurring words. The first part of the presentation is a preliminary study that introduces the two resources used, Wikipedia and Wiktionary. The second part is dedicated to the project design, which presents the functional and technical requirements of the developed system.
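As an illustration only, the filtering and counting idea described in this abstract could be sketched in Python roughly as follows. This is not the authors' system: the small `POS_LEXICON` dictionary is a hypothetical stand-in for part-of-speech information looked up in Wiktionary, the function names are invented for this example, and counting pairs per sentence is an assumed choice of co-occurrence window.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical stand-in for part-of-speech data derived from Wiktionary:
# each known word maps to the set of parts of speech it can take.
POS_LEXICON = {
    "wikipedia": {"noun"},
    "encyclopedia": {"noun"},
    "free": {"adjective", "verb"},
    "edit": {"verb", "noun"},
    "openly": {"adverb"},
}

# The four categories named in the abstract.
KEPT_POS = {"noun", "verb", "adverb", "adjective"}


def content_words(text):
    """Return the lowercase tokens of `text` whose part of speech is kept."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if POS_LEXICON.get(t, set()) & KEPT_POS]


def cooccurrence_counts(sentences):
    """Count, for each unordered word pair, the sentences containing both."""
    counts = Counter()
    for sentence in sentences:
        # Sorting makes each pair a canonical (alphabetical) tuple.
        words = sorted(set(content_words(sentence)))
        counts.update(combinations(words, 2))
    return counts


if __name__ == "__main__":
    sample = [
        "Wikipedia is a free encyclopedia.",
        "Anyone can edit Wikipedia openly.",
    ]
    for pair, n in sorted(cooccurrence_counts(sample).items()):
        print(pair, n)
```

Words missing from the lexicon, such as "is" or "anyone" in the sample, are simply dropped, which mirrors the filtering step. The resulting counts are raw statistics; turning them into an SR score would require a relatedness measure that the abstract does not specify.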
- What will attendees take away from this session?
Attendees will learn that Wikipedia and its sister projects are useful for more than finding scientific information. They can also help advance natural language processing for the languages supported by WMF wikis.
- Theme of presentation
- Wikimedia Research
- Technology, Interface & Infrastructure
- For workshops and discussions, what level is the intended audience?
- Length of session (if other than 25 minutes, specify how long)
- 30 minutes
- Will you attend WikiIndaba if your submission is not accepted?
- Not sure
- Slides or further information (optional)
- Special requests
- Is this Submission a Draft or Final? Final
If you are interested in attending this session, please sign with your username below. This will help reviewers to decide which sessions are of high interest. Sign with a hash and four tildes. (# ~~~~).