Semantic MediaWiki/Related work

From Meta, a Wikimedia project coordination wiki

The aforementioned limitations of plain text data have been spotted in many wiki-related contexts, and there are already various ideas that have the potential to solve part of the problem. General methods of semantic annotation have been investigated in research on the Semantic Web, but here we focus on proposals with a clear relationship to MediaWiki and other wikis.

We give here a collection of all relevant approaches that we are aware of, classified according to their basic annotation mechanism. Anyone who knows of other projects that should be mentioned here is invited to add them.

Categorization of articles

Classifying articles by categories has already been introduced into Wikipedia, quite successfully, but there is room for improvement. The full power of this data has not been tapped yet: one can only list the categories, and cannot even search for articles that belong to subcategories or to several categories at once. In any case, categorization is a huge step forward from the original practice of creating lists to collect articles on all kinds of topics (i.e. not just in cases where a list is useful).
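
The two missing searches mentioned above (articles in subcategories, and articles in several categories at once) can be sketched as follows. This is only an illustration under stated assumptions: the category tree, the article names, and the helper functions `articles_in` and `articles_in_all` are hypothetical and are not part of MediaWiki.

```python
from collections import deque

# Hypothetical sample data: category -> direct subcategories.
SUBCATEGORIES = {
    "Geography": ["Rivers", "Cities"],
    "Rivers": ["Rivers of England"],
}

# Hypothetical sample data: category -> articles filed directly in it.
MEMBERS = {
    "Rivers": ["Danube"],
    "Rivers of England": ["River Thames", "River Ouzel"],
    "Cities": ["Berlin"],
    "Buckinghamshire": ["River Ouzel", "Aylesbury"],
}


def articles_in(category):
    """Return all articles in a category or in any of its subcategories."""
    seen = {category}
    queue = deque([category])
    found = set()
    while queue:
        current = queue.popleft()
        found.update(MEMBERS.get(current, []))
        for child in SUBCATEGORIES.get(current, []):
            if child not in seen:  # guard against cycles in the category graph
                seen.add(child)
                queue.append(child)
    return found


def articles_in_all(*categories):
    """Return articles that belong to every given category (subcategories included)."""
    sets = [articles_in(c) for c in categories]
    return set.intersection(*sets) if sets else set()


print(sorted(articles_in("Rivers")))
print(sorted(articles_in_all("Rivers", "Buckinghamshire")))
```

The breadth-first walk with a `seen` set is one possible design choice; it keeps the search finite even if categories contain each other.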

Annotating content with atomic data

"Atomic data" refers to information in the form of single data values with a distinguished data type. For example, one can annotate articles about cities with the population number of the city, or one can annotate persons with their dates of birth. The power of this approach derives from the customized treatment of different types of data. For example, one might search for cities with more than 1 million inhabitants, or for persons that were born on Friday the 13th.
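
The two example queries can be sketched in a few lines once each value carries a data type. The annotations below are hypothetical and purely illustrative (the population figures are rough placeholders and the persons are invented); the function names are assumptions made for this sketch, not part of any of the projects listed here.

```python
from datetime import date

# Hypothetical annotations: article title -> typed value.
CITY_POPULATION = {"Berlin": 3_400_000, "Karlsruhe": 280_000}
BIRTH_DATE = {
    "Person A": date(1970, 2, 13),
    "Person B": date(1980, 5, 13),
}


def cities_larger_than(threshold):
    """Numeric type: values can be compared, not just matched as text."""
    return sorted(city for city, pop in CITY_POPULATION.items() if pop > threshold)


def born_on_friday_13th():
    """Date type: the weekday can be derived from the stored value."""
    # date.weekday() counts from Monday = 0, so Friday is 4.
    return sorted(
        person
        for person, born in BIRTH_DATE.items()
        if born.day == 13 and born.weekday() == 4
    )


print(cities_larger_than(1_000_000))
print(born_on_friday_13th())
```

Neither query would be possible with a plain text search, since the weekday of a birth date is never written in the article itself.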

Projects of this type are:

  • Wikidata: a general framework for annotating articles with data.
  • Personendaten: a project in the German Wikipedia to annotate articles about persons.
  • Image/multimedia annotation: a proposal to annotate non-textual data (e.g. images), in order to obtain a searchable multimedia database.
  • Flexible Fields for MediaWiki: a proposed syntax and data model for adding field-value pairs, including nested collections of field-value pairs, in articles.

Typing of links

Links between articles contain a lot of information, which is easily understood by humans through the context in which a link appears. To make it machine-readable, it was proposed to tag links with predefined types, i.e. to introduce "categories for links".

For example, the article about Germany contains a link to Berlin. One can give a type like "has capital" to this link, providing an appropriate semantic classification. Details of this idea are described in the paper "Wikipedia and the Semantic Web – The missing links". This idea is not far away from well-known semantic technologies; the Meta-Wiki already contains another short article on semantic links with further related works.
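
As a rough sketch of what typed links make possible, each link can be stored as a (source, type, target) record and then queried in either direction. The "has capital" link between Germany and Berlin comes from the example above; the other records, the "borders" type, and the names `TypedLink`, `targets` and `sources` are assumptions made for illustration only.

```python
from typing import NamedTuple


class TypedLink(NamedTuple):
    source: str     # article that contains the link
    link_type: str  # the "category" of the link
    target: str     # article the link points to


# Only the first record is taken from the text; the rest are hypothetical.
LINKS = [
    TypedLink("Germany", "has capital", "Berlin"),
    TypedLink("France", "has capital", "Paris"),
    TypedLink("Germany", "borders", "France"),
]


def targets(source, link_type):
    """Follow links of one type forwards, e.g. the capital of Germany."""
    return [l.target for l in LINKS if l.source == source and l.link_type == link_type]


def sources(link_type, target):
    """Follow links of one type backwards, e.g. the country whose capital is Paris."""
    return [l.source for l in LINKS if l.link_type == link_type and l.target == target]


print(targets("Germany", "has capital"))
print(sources("has capital", "Paris"))
```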

Semantic Wikis

See w:semantic wiki for an up-to-date list of examples, theories, etc.

Some wiki systems other than MediaWiki have proposed creating semantic data by having authors enter it directly, in a formal specification language, next to the normal text content. This allows users to specify almost any type of semantic information in a way that clearly separates annotation from normal text content. However, only those who know the formal specification language can work with the annotations, so this solution might not be suitable for a community as broad and open as Wikipedia. Nevertheless, these projects are important as ways to probe the possible applications, processing methods, and feasibility of semantic wikis.

Projects of this type include:

For an up-to-date list, and other useful information, visit the Semantic Wiki State Of The Art page on the Semantic Wiki Interest Group web site.

Querying data

The ultimate aim of semantic markup is to produce results for specific queries. For example, a user may wish to list the sizes of all articles about rivers in Buckinghamshire. One powerful existing solution to this kind of query is the 'Dynamic Page List' extension.

The Semantic MediaWiki community would describe the Dynamic Page List extension as a 'quick hack'. However, the extension gives ordinary users easy access to powerful page reports.
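
The kind of page report discussed here, such as the sizes of all articles about rivers in Buckinghamshire, can be sketched as a simple filter over page metadata. This is not the Dynamic Page List implementation or its syntax; the page records, the sizes, and the function `page_report` are hypothetical and serve only to show the shape of such a query.

```python
# Hypothetical page metadata; titles, categories and sizes are illustrative.
PAGES = [
    {"title": "River Thames", "categories": {"Rivers", "Buckinghamshire"}, "size_bytes": 84_000},
    {"title": "River Ouzel", "categories": {"Rivers", "Buckinghamshire"}, "size_bytes": 6_500},
    {"title": "Danube", "categories": {"Rivers"}, "size_bytes": 120_000},
    {"title": "Aylesbury", "categories": {"Buckinghamshire"}, "size_bytes": 45_000},
]


def page_report(required_categories):
    """List (title, size) for pages that carry every required category."""
    required = set(required_categories)
    rows = [
        (page["title"], page["size_bytes"])
        for page in PAGES
        if required <= page["categories"]  # subset test: all categories present
    ]
    return sorted(rows)


for title, size in page_report(["Rivers", "Buckinghamshire"]):
    print(f"{title}: {size} bytes")
```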


Analysis of problems and opportunities

Aside from the concrete ideas listed above, some people have contributed to the discussion by analysing shortcomings of the current approach or by highlighting possible future applications of semantics in Wikipedia. Among these contributions are:

  • Ultimate Wiktionary: Can we combine Wiktionary content from all languages into a gigantic searchable database?
  • On the semantics of categories: What does it mean if an article belongs to a category? This resource discusses some typical cases one finds in practice.
This article is associated with the project Semantic MediaWiki.