Ideas for the interaction with http://datahub.io (formerly CKAN):
- datahub contains descriptions of data sets, and information on how to get records from these data sets (downloads, APIs, etc.)
- datahub could provide mappings from parts of these data sets, such as columns in tables, to Wikidata properties
- this requires more fine-grained information than is currently typical on datahub. For describing APIs, something like WSDL would be nice.
- the requirements of mapping to Wikidata would impose a common vocabulary and loose standard on the descriptions of data sets and APIs.
- this mapping could then be used to import records from arbitrary data sources directly into Wikidata on demand
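The mapping idea in the list above can be sketched roughly as follows. This is only an illustration, not an existing datahub or Wikidata format: the `COLUMN_MAPPING` structure, the column names, and the `map_record` helper are all assumptions made up for this example, and the property IDs are used purely as examples.

```python
# Hypothetical mapping from columns of a tabular data set to Wikidata
# property IDs. The structure and the column names are assumptions for
# illustration; datahub does not define such a format.
COLUMN_MAPPING = {
    "population": "P1082",  # population
    "country": "P17",       # country
}


def map_record(record, mapping):
    """Translate one source record into {property_id: value} pairs.

    Columns without a mapping are skipped, as are empty values.
    """
    statements = {}
    for column, value in record.items():
        property_id = mapping.get(column)
        if property_id is None or value in ("", None):
            continue
        statements[property_id] = value
    return statements


# A made-up record, as it might come out of a CSV download.
row = {"name": "Exampletown", "population": "12345",
       "country": "Exampleland", "notes": ""}
print(map_record(row, COLUMN_MAPPING))
```

Keeping the mapping as plain data, separate from the code that applies it, is what would let it be stored and shared next to the data set description.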
The possibility of seamless on-demand import from a large variety of data sources would allow us to avoid bulk imports into Wikidata if the community does not want them. Of course, if desired, bulk imports would still be possible using the same kind of mapping.
Note: http://mes-semantics.com/ is planning to implement an import wizard for Wikidata that could use the mapping info provided by datahub (and/or could share user-provided mappings on datahub).
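As a rough sketch of how one mapping could serve both modes: the example below reads a small inline CSV (a stand-in for a real download) and either picks out a single record on demand or converts all of them. The names `SOURCE_CSV`, `MAPPING`, `iter_mapped`, `import_on_demand`, and `import_bulk` are hypothetical, and nothing here talks to datahub or Wikidata.

```python
import csv
import io

# Stand-in for a downloadable data set; in practice the text would come
# from the download URL or API described on datahub (not modeled here).
SOURCE_CSV = """id,population
a1,100
b2,250
c3,75
"""

# Hypothetical column-to-property mapping (illustrative format only).
MAPPING = {"population": "P1082"}


def iter_mapped(source_text, mapping):
    """Lazily yield (record_id, statements) for each row of the source."""
    for row in csv.DictReader(io.StringIO(source_text)):
        statements = {mapping[c]: v for c, v in row.items() if c in mapping}
        yield row["id"], statements


def import_on_demand(record_id, source_text, mapping):
    """Return the statements for a single record, or None if not found."""
    for rid, statements in iter_mapped(source_text, mapping):
        if rid == record_id:
            return statements
    return None


def import_bulk(source_text, mapping):
    """Return the statements for every record in the source."""
    return dict(iter_mapped(source_text, mapping))


print(import_on_demand("b2", SOURCE_CSV, MAPPING))
print(len(import_bulk(SOURCE_CSV, MAPPING)))
```

Both functions share `iter_mapped`, so the choice between on-demand and bulk import stays a policy decision rather than a technical one.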
- Rufus Pollock of OKFN/CKAN/datahub
- Christian Becker of mes|semantics