Community Wishlist Survey 2021/Multimedia and Commons/Add wikitext description pages for Commons tabular data files

From Meta, a Wikimedia project coordination wiki

Add wikitext description pages for Commons tabular data files

  • Problem: Commons .tab and .map files cannot have categories, nor be described in wikitext, nor be described in structured data.
    Without descriptions or categories, the files aren't readily discoverable. It is currently not even possible to document what particular rows, columns, or cells represent. Nor is it possible to describe properly where the data has come from, or any issues with it.
  • Who would benefit: The Commons tabular-data space can store tabular-data files up to 2 MB in size. Such files might represent, e.g., the raw data shown on a graph (but in a reusable, examinable form), or important data series about the real world. But at the moment usability is crippled, because the files aren't describable or discoverable, so the potential goes unused. In addition, people sometimes try to write large time series into Wikidata items instead, for which those items are utterly unsuited, causing difficulties and unfortunate item bloat on Wikidata. If our aim is to make the sum of human knowledge available for every single human being to share, at the moment we are utterly failing to do that for tabular data.
  • Proposed solution: Attach a regular wikitext page to each tabular-data file, in the way we do for image files, to allow wikitext descriptions and categorisations of the files. Ideally also include a structured data slot, so that the file metadata can be described and queried using structured statements.
  • More comments: Ideally the description pages would act just like regular Commons pages. As a second-best option, it has also been suggested to add description pages as subpages (cf. the '/doc' subpages used for templates), if that would be easier.
  • Phabricator tickets: T155290, T249896, T242596, T235332, T250919
  • Proposer: Jheald (talk) 19:22, 17 November 2020 (UTC)

Discussion

  • Huh, I didn't realize these were in a separate "Data" namespace, rather than under "File". I realize data files don't display readily as images; is that the reason for the distinction? In any case, aside from display, I can't think of any good reason for treating data files so differently from image files; they should have most of the same metadata fields, for example. ArthurPSmith (talk) 16:20, 18 November 2020 (UTC)
  • "Nor is it possible to describe properly where the data has come from, or any issues with it." There is a description, a license and a sourcing field in the format for each. I agree it isn't easy to use, but when people read the help pages (which they need to do anyway, to even begin to understand how to use these spaces), these fields are documented. I'd say we should not confuse the ability to document things with people's motivation or desire to actually do so in practice, which I find just as likely, if not more likely, to be the problem. --TheDJ (talk • contribs) 15:15, 2 December 2020 (UTC)
  • I'd love for c:Category:Tabular data of COVID-19 cases to contain actual data tables instead of their talk pages. Even better if those data tables' action=view presentations could include visualizations via /doc pages, similar to what we're currently using c:Data talk:COVID-19 cases in Santa Clara County, California.tab and c:COVID-19 pandemic in the San Francisco Bay Area#Santa Clara County for. – Minh Nguyễn 💬 08:14, 9 December 2020 (UTC)

Voting