How to move forward
- What was this session about?
How GLAMs can work with Wikidata, and how a data donation process could work.
- What are the next steps to be taken?
Jens will keep the community up-to-date, early donors can serve as a model
- Who is the person to reach out to?
Jens Ohlig (WMDE) can help and support. You can also get in contact with Lydia Pintscher (WMDE).
see the Commons category
This session documentation was approved by the speaker.
- Original Description
- Data partnerships for Wikidata: How can we help institutions to get their data into Wikidata? What's the process from the initial contact to a script that uploads new pieces of data treasure?
- Session Format
- Desired Outcome
- Process is understood and can be applied by interested people in the movement
- Jens Ohlig (WMDE)
- Summary of the session
Jens started the session with a short introduction to Wikidata. He explained what items, properties, and statements are; most participants had used Wikidata mainly for adding interwiki links between Wikipedia language versions.
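As an illustration of the three terms, here is a minimal sketch in Python. The class and method names (`Item`, `Statement`, `add`, `values_of`) are made up for this example and are not part of any Wikidata software; the identifiers Q42 (Douglas Adams), P31 (instance of) and Q5 (human) are real Wikidata IDs.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Statement:
    """One claim about an item: a property paired with a value."""
    property_id: str  # e.g. "P31" (instance of)
    value: str        # another item's ID, or a literal such as a string


@dataclass
class Item:
    """A simplified item: an ID, a label, and a list of statements."""
    item_id: str
    label: str
    statements: list = field(default_factory=list)

    def add(self, property_id, value):
        """Attach a new statement to this item."""
        self.statements.append(Statement(property_id, value))

    def values_of(self, property_id):
        """Return all values this item has for the given property."""
        return [s.value for s in self.statements if s.property_id == property_id]


douglas_adams = Item("Q42", "Douglas Adams")
douglas_adams.add("P31", "Q5")  # instance of: human
print(douglas_adams.values_of("P31"))
```

Real Wikidata statements can also carry qualifiers and references, which this sketch leaves out.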
Until now, Jens explained, a query service for Wikidata had been the missing “puzzle piece”. The query service (query.wikidata.org) allows people to find related information, such as “largest cities in the world with a woman mayor”, without manually going through the whole database that connects the knowledge of the world. The query service uses the SPARQL query language.
One participant expressed concern that Wikidata’s data might not always be complete or correct. For this participant, Wikidata is, at the moment, not an alternative to the scientific databases they have been using so far. Someone else responded that Wikidata was guaranteed to be incomplete, but that in this it followed the Wikipedia way. The audience agreed.
After introducing the key elements of Wikidata, Jens turned to the question of why (GLAM) institutions should add data to Wikidata. He explained that the quantity of digitized content was growing, so the importance of adding metadata to digitized content was growing as well. Jens gave the following reasons why donating data to Wikidata is worthwhile:
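A query like the one Jens mentioned might look roughly as follows. This is a sketch and not the query shown in the session: it assumes the public endpoint at https://query.wikidata.org/sparql and the properties P31 (instance of), P279 (subclass of), P6 (head of government), P21 (sex or gender), P582 (end time) and P1082 (population). The helper names `build_query_url` and `run_query` and the User-Agent string are made up for this example.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "https://query.wikidata.org/sparql"

# Cities whose current head of government is female, largest first.
QUERY = """
SELECT DISTINCT ?city ?cityLabel ?mayorLabel ?population WHERE {
  ?city wdt:P31/wdt:P279* wd:Q515 ;
        p:P6 ?term ;
        wdt:P1082 ?population .
  ?term ps:P6 ?mayor .
  ?mayor wdt:P21 wd:Q6581072 .
  FILTER NOT EXISTS { ?term pq:P582 ?end }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
ORDER BY DESC(?population)
LIMIT 10
"""


def build_query_url(query):
    """Return the GET URL that asks the endpoint for JSON results."""
    params = urllib.parse.urlencode({"query": query, "format": "json"})
    return ENDPOINT + "?" + params


def run_query(query):
    """Send the query over the network and return the list of result rows."""
    request = urllib.request.Request(
        build_query_url(query),
        headers={"User-Agent": "glam-session-example/0.1 (illustrative)"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["results"]["bindings"]
```

Calling `run_query(QUERY)` needs network access; the same query text can also be pasted into the web form at query.wikidata.org.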
- It helps more people to see your (the GLAM’s) information
- It improves open knowledge
- It increases traffic to your (the GLAM’s) website
- It makes your (the GLAM’s) data more useful for yourself and others.
Jens explained that metadata is needed to build wonderful applications and get knowledge out of the data. Connecting very different kinds of data sets in Wikidata creates new knowledge, so data is only useful if you build something out of it.
Jens highlighted that Wikidata is a volunteer-driven project. So for adding (masses of) data it was necessary, he said, to:
- contact the Wikidata community, describing what data you have and would like to include
- decide with the community what data is suitable for import
- work with the community to import the data.
Jens also made clear that Wikidata shouldn’t be treated as a dumping ground for data, as the community wouldn’t sort out millions of data entries. So GLAMs (or their contacts in the Wikimedia organizations) shouldn’t simply upload the data, but also curate it. Jens also mentioned that if someone had a great dataset, it would be possible to find a bot coder to help with it.
After his presentation, the stage was opened for questions and discussions with the audience.
One participant asked what format data donations should have. Jens replied that people should ask the institution for data in XML, XLS or CSV, although any structured data format would do. He added that there are coders at the GLAMs who can help answer questions directly on the project and support it; they know the right vocabulary for talking to customers and colleagues.
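To illustrate why the exact format matters little as long as the data is structured, here is a small sketch that reads the same two records from a CSV export and from an XML export and arrives at identical rows. The column names, the sample records and the function names are all invented for this example; a real export will look different.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Made-up sample exports of the same two records.
CSV_EXPORT = """inventory_no,title,creator
A-001,Harbour at Dawn,Anna Example
A-002,Untitled Study,
"""

XML_EXPORT = """<records>
  <record><inventory_no>A-001</inventory_no><title>Harbour at Dawn</title><creator>Anna Example</creator></record>
  <record><inventory_no>A-002</inventory_no><title>Untitled Study</title><creator/></record>
</records>"""


def rows_from_csv(text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


def rows_from_xml(text):
    """Parse <record> elements into dicts keyed by child tag name."""
    root = ET.fromstring(text)
    return [
        {child.tag: (child.text or "") for child in record}
        for record in root.iter("record")
    ]


def incomplete(rows, required=("inventory_no", "title", "creator")):
    """Return rows missing a required value, so they can be curated first."""
    return [
        row for row in rows
        if any(not row.get(name, "").strip() for name in required)
    ]


rows = rows_from_csv(CSV_EXPORT)
print(rows == rows_from_xml(XML_EXPORT))
print([row["inventory_no"] for row in incomplete(rows)])
```

The `incomplete` check is one simple way to find records that need curation before anything is uploaded.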
One participant was worried that importing data to Wikidata could lead to a loss of information, as Wikidata is different from Wikipedia, where everything is less structured. The participant suggested having a working group work with the institution to convert its data (which may be in a specific format) into a format acceptable to Wikidata.
Some participants discussed the value of data donations for GLAM institutions. One participant highlighted that the multilingualism of Wikidata was a huge asset. If the data is improved enough, institutions can take it “back”. The Bundesarchiv donation to Wikimedia Commons is a good example: many editors improved the metadata, and the Bundesarchiv imported the corrected metadata into its own database.
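A conversion step of this kind could be sketched as follows. The sketch assumes the tab-separated V1 command syntax of the community tool QuickStatements (`CREATE`, then `LAST` lines for the label and for each statement) and the Wikidata property P217 (inventory number); the record, the column names and the function names are invented. It also returns the columns it could not map, so that a loss of information becomes visible instead of happening silently.

```python
def quote(text):
    """Wrap a string value in double quotes, as string values require."""
    return '"' + text.replace('"', "'") + '"'


def convert(record, mapping, language="en"):
    """Turn one record into commands plus a list of unmapped columns.

    mapping: column name -> property ID, for string-valued properties only.
    """
    commands = [
        "CREATE",
        "\t".join(("LAST", "L" + language, quote(record["title"]))),
    ]
    unmapped = []
    for column, value in record.items():
        if column == "title" or not value.strip():
            continue  # the title became the label; empty values are skipped
        if column in mapping:
            commands.append("\t".join(("LAST", mapping[column], quote(value))))
        else:
            unmapped.append(column)  # would be lost without a decision
    return commands, unmapped


# Hypothetical record from an institution's export.
record = {
    "inventory_no": "A-001",
    "title": "Harbour at Dawn",
    "technique_note": "oil on canvas",
}
commands, unmapped = convert(record, {"inventory_no": "P217"})
print("\n".join(commands))
print("not mapped:", unmapped)
```

Values that should point to other items, such as a creator, would first need to be matched to existing Wikidata items, which is the kind of work such a working group could take on.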
In the end, Jens concluded the session and made clear that early adopters (“early donors”) could be good models for others.