
Celtic Knot Conference 2020/Submissions/FAIR linguistic data thanks to norm data – Wikidata as part of the research project VerbaAlpina

From Meta, a Wikimedia project coordination wiki
Submission no.
FAIR linguistic data thanks to norm data – Wikidata as part of the research project VerbaAlpina
Christina Mutter
Language of presentation
E-mail address
Country of origin
Munich University

The Digital Humanities project VerbaAlpina at Munich University (https://www.verba-alpina.gwi.uni-muenchen.de/) investigates the Alpine region in its cultural and linguistic unity. The project analyzes data derived mainly from traditional linguistic atlases and dictionaries of the past hundred years. In addition to this historical linguistic material, current linguistic data are collected via crowdsourcing. To make the project data Findable, Accessible, Interoperable and Re-usable, and thus compliant with the FAIR principles proposed by Wilkinson et al. as guiding principles for scientific data management, the linguistic data are linked to norm data that allow individual data items to be referenced precisely.

In addition to norm data generated by VerbaAlpina itself, the project also links its data to identifiers of external institutions. Besides GND identifiers (Gemeinsame Normdatei, 'Integrated Authority File') of the German National Library, GeoNames identifiers for localities, ISO codes for languages and the identifiers of various reference dictionaries, these include the Q- and L-IDs of Wikidata. The Q-IDs are used to identify concepts and the L-IDs to identify so-called morpho-lexical types. These identifiers make it possible to link the data from VerbaAlpina with external datasets. In this way, the data can be mutually enriched with additional information and unambiguously referenced. Compared to other norm data integrated in VerbaAlpina, the Q- and L-IDs of Wikidata also allow a language-independent linking of data.
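The role of the two Wikidata identifier types can be illustrated with a minimal Python sketch. It is not VerbaAlpina's own code: the record layout, the field names and the ID values are placeholders invented for illustration. The only external fact it relies on is that Wikidata entities, items (Q) and lexemes (L) alike, are addressed by URIs of the form http://www.wikidata.org/entity/<ID>.

```python
import re

# Wikidata's canonical entity URI prefix (shared by items and lexemes).
WIKIDATA_ENTITY_BASE = "http://www.wikidata.org/entity/"

# A Q-ID or L-ID is the letter followed by a positive integer without leading zeros.
_ID_PATTERN = re.compile(r"^([QL])([1-9][0-9]*)$")


def entity_kind(identifier: str) -> str:
    """Return 'item' for a Q-ID and 'lexeme' for an L-ID; raise on anything else."""
    match = _ID_PATTERN.match(identifier)
    if match is None:
        raise ValueError(f"not a Wikidata Q- or L-ID: {identifier!r}")
    return "item" if match.group(1) == "Q" else "lexeme"


def entity_uri(identifier: str) -> str:
    """Build the language-independent entity URI for a validated Q- or L-ID."""
    entity_kind(identifier)  # validation only; raises ValueError if malformed
    return WIKIDATA_ENTITY_BASE + identifier


# Hypothetical project record: field names and ID values are placeholders,
# not actual VerbaAlpina data.
record = {"concept_qid": "Q1001", "morph_type_lid": "L2002"}

links = {
    "concept": entity_uri(record["concept_qid"]),
    "morpho_lexical_type": entity_uri(record["morph_type_lid"]),
}
```

Because the resulting URIs carry no language tag, the same link can be used regardless of the language in which a concept or type is labeled, which is the property the paragraph above attributes to Q- and L-IDs.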

This presentation will illustrate in more detail how VerbaAlpina generates its own norm data and integrates norm data from other institutions, especially Wikidata, in order to label linguistic data according to the FAIR principles. In particular, we will discuss how the various norm data are linked to the linguistic attestations on the project's interactive map.

What will attendees take away from this session?
Attendees will learn how and why the Q- and L-IDs of Wikidata and other norm data are integrated in the research project VerbaAlpina of Munich University.
Theme of session
Slides or further information (optional)



Interested attendees

If you are interested in attending this session, please sign with your username below. This will help reviewers decide which sessions are of high interest. Sign with a hash and four tildes (# ~~~~).

Friendly space: Because we want to provide a great experience for all participants and foster collaboration, please keep these few guidelines in mind: let’s be respectful to each other, encourage participation and a positive atmosphere, be mindful of how our actions impact others, and feel free to ask for help at any time. Friendly Space Policy in full.