The information we collect will be made freely available through Wikidata – a free and open database of structured data and one of Wikipedia's sister projects. Wikidata will enable users to integrate the information into Wikipedia in any of its 280+ languages. This can then act as a starting point for the world-spanning community of volunteers who are active on Wikipedia and Wikidata to further enrich the material.
By making the information more widely available, we also believe that local communities, as well as the general public, can use it in ways that allow them both to learn about their cultural heritage institutions and to help document them. Since the enriched information will be freely available to everyone, the resulting data can in turn be re-used, be it for tourism, as a complement to the official data, or for any other imaginable purpose.
The information will be clearly referenced so that it is obvious that it comes from official data. Any information added later will be clearly labeled, so that it is equally obvious that it is not official and that the agency in question is not responsible for it.
What sort of data are we looking for?
We are looking for datasets of cultural heritage institutions (museums, galleries, libraries and archives). For example, lists of museums in a certain country, state or other administrative unit.
How detailed is the data supposed to be?
The more detailed, of course, the better. But any data is better than nothing! Once it's uploaded to Wikidata, the data will become accessible to volunteers all around the world, who will then be able to edit and enrich it. There is a lot of interest in cultural heritage institutions on Wikimedia projects, so there is a good chance that the data will be noticed, edited and re-used.
At the very least, we need this information to create a Wikidata item for a cultural institution:
- The name of the institution (in one or more languages).
- The type of the institution – is it a museum, a library, an archive?
- The location of the institution. It can be general, such as a state or province, but more detailed data (city, street address, coordinates…) is better.
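As a rough illustration of the minimum above, the sketch below checks whether each record in a dataset carries a name, a type and a location. The field names (`name`, `type`, `location`) and the sample records are assumptions made for this example; a real dataset will use its own column names.

```python
# Hypothetical field names; the list above only says that a name, a type
# and a location are needed, not what the columns should be called.
REQUIRED_FIELDS = ("name", "type", "location")


def missing_fields(record: dict) -> list[str]:
    """Return the required fields that are absent or blank in a record."""
    return [f for f in REQUIRED_FIELDS if not str(record.get(f) or "").strip()]


# Made-up sample records, for illustration only.
records = [
    {"name": "Example City Museum", "type": "museum", "location": "Example Province"},
    {"name": "Example Archive", "type": "", "location": "Example City"},
]

for record in records:
    gaps = missing_fields(record)
    status = "ok" if not gaps else "missing: " + ", ".join(gaps)
    print(f"{record['name']}: {status}")
```

A check like this can be run before sharing a dataset to see which entries would need completing before an item can be created.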
Once we're moving beyond the basics, there's a lot of additional information that can be converted to Wikidata properties:
- Year of establishment
- Number of visitors in a particular year
- Collection size
- Who is the director of the institution
- Whom the institution is named after
- Official website
- Social media accounts
…and many more. For real examples, see Library of Congress (Q131454) or Hermitage Museum (Q132783).
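To give an idea of how such details might be turned into Wikidata statements, here is a minimal sketch that maps dataset columns to property IDs. The column names and the sample row are hypothetical, only a few of the properties listed above are included, and the property IDs shown should be verified on Wikidata before use.

```python
# Column names are hypothetical. The property IDs are given to the best of
# our knowledge and should be double-checked on Wikidata before any upload.
COLUMN_TO_PROPERTY = {
    "established": "P571",          # inception
    "visitors_per_year": "P1174",   # visitors per year
    "director": "P1037",            # director / manager
    "named_after": "P138",          # named after
    "website": "P856",              # official website
}


def to_statements(row: dict) -> list[tuple[str, str]]:
    """Turn the non-empty, recognized columns of a row into (property, value) pairs."""
    return [
        (COLUMN_TO_PROPERTY[col], value.strip())
        for col, value in row.items()
        if col in COLUMN_TO_PROPERTY and value and value.strip()
    ]


# Made-up sample row, for illustration only.
row = {
    "name": "Example City Museum",
    "established": "1901",
    "website": "https://museum.example",
    "director": "",
}
print(to_statements(row))
```

Columns that are empty or not in the mapping are simply skipped, so a partially filled dataset can still be processed.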
What format is the data supposed to be in?
Since we will be working with datasets of hundreds or thousands of items, the data should be in a machine-readable format. As a rule, structured data is better than unstructured text, and open formats are better than proprietary ones. The data may be accessible either as downloadable files or via an API. Examples of good formats are CSV/TSV and JSON.
On Wikidata:Open data publishing you can see an overview of different data formats and how they rate in terms of data openness.
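To show why these formats are convenient, the sketch below reads a small delimited table using only Python's standard library and converts it to JSON. The inline sample and its column names are assumptions for the example; a real dataset would be read from a downloaded file or an API response instead.

```python
import csv
import io
import json

# Inline sample standing in for a downloaded file; the columns are hypothetical.
SAMPLE_CSV = """name,type,location
Example City Museum,museum,Example Province
Example Public Library,library,Example City
"""


def csv_to_records(text: str, delimiter: str = ",") -> list[dict]:
    """Parse delimited text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


records = csv_to_records(SAMPLE_CSV)

# ensure_ascii=False keeps names in non-Latin scripts readable in the output.
as_json = json.dumps(records, ensure_ascii=False, indent=2)
print(as_json)
```

For tab-separated data, the same function can be called with `delimiter="\t"`.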
Ideally, the dataset would also contain appropriate metadata, such as when and by whom it was created.
What about the copyright?
The data on Wikidata is available under the Creative Commons CC0 License. This license allows people to use the data without restrictions; no attribution is required. This is different from Wikipedia, which applies the Creative Commons Attribution-ShareAlike license. CC0 is effectively a dedication of the data to the public domain.
This means that in order for your data to be uploaded to Wikidata, it has to have been released under an open license. You can read more about copyright licensing on Wikidata on Wikidata:Licensing and about the benefits of open data publishing on Wikidata:Open data publishing.