Technical categories in Wikipedia

From Meta, a Wikimedia project coordination wiki

(old discussion)

It would be nice to have "technical" categories assigned to Wikipedia articles, so that the software could tell them apart. This could serve several purposes. For example, if every biography article were marked as such, a list of all biographical entries in Wikipedia could be generated on the fly. Or search queries could be limited to biographies only. Or the search results could indicate "this is a biography". Or...

Note: This is not about categories like "biology"! We had that discussion before and decided against it.
Sorry, I do not understand the difference. --Nichtich 16:11, 12 Dec 2003 (UTC)

My (Magnus Manske) proposal is to use the "language link" mechanism, with a new "virtual" namespace. I would suggest "type:". So, a biography would contain "[[type:biography]]" in the text. This pseudo-link would not be visible in the text.
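To make the proposal concrete, here is a minimal sketch of how software might pick "[[type:...]]" pseudo-links out of an article and hide them from the displayed text. It is an illustration only, not the MediaWiki implementation: the pattern `TYPE_LINK` and the function `split_type_links` are hypothetical names, and the syntax is assumed to be exactly as proposed above.

```python
import re

# Hypothetical pattern for the proposed pseudo-link, e.g. [[type:biography]].
TYPE_LINK = re.compile(r"\[\[type:([^\]|]+)\]\]")


def split_type_links(wikitext):
    """Return (visible_text, types) for a page's raw wikitext.

    The pseudo-links are collected as lowercase type names and removed
    from the text, so they would not be visible to the reader.
    """
    types = [name.strip().lower() for name in TYPE_LINK.findall(wikitext)]
    visible = TYPE_LINK.sub("", wikitext)
    return visible, types


text = "Michelangelo was an Italian sculptor.[[type:biography]]"
visible, types = split_type_links(text)
print(visible)
print(types)
```

The same approach would work unchanged for a different prefix (such as "schema:") by adjusting the pattern.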

I would suggest [[schema:biography]] instead, since the referenced object in fact defines a schema of metadata for tagging the current object. See "Semantic web" under Field-value pairs. -- Chato 24 Feb 2004

A preliminary list of categories (to be extended):

  • biography
  • lifeform (maybe divided into others):
    • animal
    • plant
  • geographic locations:
    • city
    • country
    • river
    • mountain
  • date (day/year?)

Since nobody is able, or has the right, to decide what is a category and what is not, the categories should also be articles. It has to be discussed whether:

  • Every article can be used as a category
I think that each article can indeed potentially define a schema, and I don't think there is a need to approve a schema. In the future a schema could carry more information, such as ranges (e.g. biography/date-born is of type date, river/length is of type number). At the same time, we would need a way of extracting all instances of a given schema (e.g. listing all pages that are biographies). -- Chato 24 Feb 2004
  • There is a special "category" namespace
    • connected to normal articles with the same name (like the Talk-namespace)
    • independent of normal articles (like the MediaWiki-namespace)
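
The comment above about schemas carrying range information (biography/date-born is a date, river/length is a number) could look roughly like the following sketch. Everything here is an assumption made for illustration: the `SCHEMAS` table, the `invalid_fields` helper, and the choice to treat "number" as a Python int or float are not part of the proposal.

```python
from datetime import date

# Hypothetical schema table: schema name -> field name -> allowed Python types.
# The two entries mirror the examples given in the discussion.
SCHEMAS = {
    "biography": {"date-born": (date,)},
    "river": {"length": (int, float)},
}


def invalid_fields(schema_name, fields):
    """Return the names of fields whose values do not match the schema.

    Fields that the schema does not mention are ignored, since nothing
    in the discussion says a schema has to be exhaustive.
    """
    schema = SCHEMAS[schema_name]
    return [
        name
        for name, value in fields.items()
        if name in schema and not isinstance(value, schema[name])
    ]


print(invalid_fields("river", {"length": 6650}))
print(invalid_fields("river", {"length": "very long"}))
print(invalid_fields("biography", {"date-born": date(1475, 3, 6)}))
```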

There could be overlapping/sub-categories. For example, w:Michelangelo could have both "[[type:biography]]" and "[[type:artist]]" in it, so the article would show up on a "biographies" and an "artists" list.
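
As a rough sketch of how such lists could be generated, the snippet below builds an index from type name to page titles, so that a page carrying two type links shows up on both lists. The page contents, the `TYPE_LINK` pattern, and `build_type_index` are hypothetical and only assume the "[[type:...]]" syntax proposed above.

```python
import re
from collections import defaultdict

# Hypothetical pattern for the proposed pseudo-link, e.g. [[type:artist]].
TYPE_LINK = re.compile(r"\[\[type:([^\]|]+)\]\]")


def build_type_index(pages):
    """Map each type name to the set of page titles that carry it."""
    index = defaultdict(set)
    for title, wikitext in pages.items():
        for name in TYPE_LINK.findall(wikitext):
            index[name.strip().lower()].add(title)
    return index


# Made-up sample pages, for illustration only.
pages = {
    "Michelangelo": "... [[type:biography]] [[type:artist]]",
    "Marie Curie": "... [[type:biography]]",
    "Nile": "... [[type:river]]",
}

index = build_type_index(pages)
print(sorted(index["biography"]))
print(sorted(index["artist"]))
```

Limiting a search to biographies would then amount to intersecting the search hits with `index["biography"]`.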


You could also add

  • field of study (like, er, biology...)

I'm not sure this is a type; it's probably more like "instance-of".

Perhaps we should have the whole gamut of [[is-a:x]] [[has-a:y]] as per AI.

An idea: should we be able to put these annotations on ordinary links, with the meta-stuff not displayed? We could then add something like # at the front of a link to make it invisible, e.g.

[[is-a:artist]] displays as "artist"
[[#subfield-of:biology]] is not displayed.
  • The # idea is not needed if the software knows about the meaning of the prefix, because it can treat the link properly.
  • It is pointless otherwise, because we don't need hidden links that do nothing.

--Matusz
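
For illustration, the two behaviours discussed above (a relation link that displays its target, and a #-prefixed one that is hidden) could be sketched as follows. The set `KNOWN_RELATIONS`, the pattern, and the `render` function are assumptions made for this example; links with unknown prefixes are left untouched, in line with the point that the software has to know what a prefix means.

```python
import re

# Hypothetical link syntax: [[relation:target]] or [[#relation:target]].
RELATION_LINK = re.compile(r"\[\[(#?)([a-z-]+):([^\]|]+)\]\]")

# Prefixes the software is assumed to understand; anything else is left alone.
KNOWN_RELATIONS = {"is-a", "has-a", "subfield-of"}


def render(wikitext):
    """Return (visible_text, relations) for a page's raw wikitext."""
    relations = []

    def replace(match):
        hidden = match.group(1) == "#"
        relation = match.group(2)
        target = match.group(3).strip()
        if relation not in KNOWN_RELATIONS:
            return match.group(0)  # unknown prefix: keep the link as written
        relations.append((relation, target))
        return "" if hidden else target

    visible = RELATION_LINK.sub(replace, wikitext)
    return visible, relations


visible, relations = render("He was an [[is-a:artist]].[[#subfield-of:biology]]")
print(visible)
print(relations)
```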