Semantic MediaWiki/Implementation/Representation in database and RDF

From Meta, a Wikimedia project coordination wiki

Database Representation

The blueprint called for storing relations and attributes in an RDF triplestore. This poses integration problems with MediaWiki servers, so SMW stores the information in conventional SQL tables while providing import and export via RDF.

The database format is still undergoing change.

To see the tables created by SMW, browse the wikidb database on a running installation and inspect the smw_ tables. The SQL code that creates the SMW tables is in the function SMW_CreateSemanticTables() in the file includes/SMW_Storage.php; you can view its source in CVS.

smw_relation

smw_relation stores relations between articles. It stores the subject page's namespace + name, relation (e.g. "is capital of"), and object page's namespace + name.
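As a rough illustration of the row shape described above, the following Python sketch builds a comparable table in an in-memory SQLite database. The table name relation_sketch, the column names, and the helper functions are assumptions made for illustration; the actual schema is whatever SMW_CreateSemanticTables() creates.

```python
import sqlite3

# Table and column names are illustrative assumptions, not the real SMW schema.
def create_relation_table(conn):
    conn.execute(
        """
        CREATE TABLE relation_sketch (
            subject_namespace INTEGER NOT NULL,
            subject_title     TEXT    NOT NULL,
            relation_title    TEXT    NOT NULL,
            object_namespace  INTEGER NOT NULL,
            object_title      TEXT    NOT NULL
        )
        """
    )

def add_relation(conn, subject, relation, obj):
    """Store one relation; subject and obj are (namespace, title) pairs."""
    conn.execute(
        "INSERT INTO relation_sketch VALUES (?, ?, ?, ?, ?)",
        (subject[0], subject[1], relation, obj[0], obj[1]),
    )

def objects_of(conn, subject, relation):
    """Return the (namespace, title) pairs that subject points to via relation."""
    cursor = conn.execute(
        "SELECT object_namespace, object_title FROM relation_sketch "
        "WHERE subject_namespace = ? AND subject_title = ? AND relation_title = ? "
        "ORDER BY object_title",
        (subject[0], subject[1], relation),
    )
    return cursor.fetchall()

conn = sqlite3.connect(":memory:")
create_relation_table(conn)
# Namespace 0 is MediaWiki's main (article) namespace.
add_relation(conn, (0, "Berlin"), "is capital of", (0, "Germany"))
print(objects_of(conn, (0, "Berlin"), "is capital of"))
```

Keeping the namespace and the title in separate columns mirrors the "namespace + name" pairing in the description, so pages with the same title in different namespaces stay distinct.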

smw_attributes

smw_attributes stores attribute values. It stores the subject page's namespace + name, the attribute, the datatype (e.g. "geoarea"), the value (e.g. "963.6"), and the unit (e.g. "km²").

In SMW 0.3, all values are stored as strings. If the code for the datatype is able to convert units, then the value stored is in the standard unit for the datatype, regardless of the units entered by the page author.
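The normalization step can be sketched in Python as follows. The conversion table, the choice of km² as the standard unit for "geoarea", and the function name normalize are assumptions for illustration only; the real conversion logic lives in the SMW datatype code and may differ.

```python
from decimal import Decimal

# Hypothetical conversion factors into an assumed standard unit per datatype.
STANDARD_UNITS = {
    "geoarea": (
        "km²",
        {"km²": Decimal("1"), "ha": Decimal("0.01"), "m²": Decimal("0.000001")},
    ),
}

def normalize(datatype, value, unit):
    """Return (value, unit) as strings, converted to the standard unit if possible."""
    entry = STANDARD_UNITS.get(datatype)
    if entry is None:
        # The datatype cannot convert units: keep what the author entered.
        return value, unit
    standard_unit, factors = entry
    if unit not in factors:
        # Unknown unit for this datatype: keep what the author entered.
        return value, unit
    converted = Decimal(value) * factors[unit]
    # Values are kept as strings, as described for SMW 0.3.
    return str(converted), standard_unit

stored_value, stored_unit = normalize("geoarea", "96360", "ha")
print(stored_value, stored_unit)
```

Decimal is used instead of float so that converting a value entered as text does not introduce binary rounding noise before it is stored as a string again.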

smw_specialprops

smw_specialprops stores certain special properties of pages, including:

  • property 1: has Type, linking Attribute pages to a datatype
  • property 4: category, linking regular pages to categories

RDF

The semantic information that SMW adds to MediaWiki is well represented as RDF triples. The relations to other pages and attribute values pertain to the current page, so the article is the subject. The type of relation or attribute name is the property. The destination of the relation link, or the value of the attribute, is the value.
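The mapping just described can be sketched in Python. The Triple type, the function page_to_triples, and the sample annotations are hypothetical and only illustrate the subject/property/value roles; they are not part of SMW.

```python
from typing import NamedTuple

class Triple(NamedTuple):
    subject: str    # the article carrying the annotations
    predicate: str  # the relation type or attribute name
    object: str     # the target page or the attribute value

def page_to_triples(page_title, relations, attributes):
    """Turn one page's annotations into (subject, property, value) triples."""
    triples = []
    for relation_name, target_page in relations:
        triples.append(Triple(page_title, relation_name, target_page))
    for attribute_name, value in attributes:
        triples.append(Triple(page_title, attribute_name, value))
    return triples

# Sample annotations, chosen only for illustration.
triples = page_to_triples(
    "San Diego",
    relations=[("is located in", "California")],
    attributes=[("area", "963.6")],
)
for triple in triples:
    print(triple)
```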

Note that you need to distinguish statements about the page from statements about the page's subject. The San Diego page does not have a population; the subject of the page does. The page itself, however, does have a last author and similar properties.

RDF Export

The infobox in each page has an Export as RDF hyperlink. This links to a special page that does the export.

The RDF export has to re-encode MediaWiki URLs to make them URIs that are valid in RDF.

To distinguish the article from the subject of the article, the latter has an underscore prefixed to the article title, e.g.

<smw:Thing rdf:about="http://wiki.ontoworld.org/index.php/_San_Diego">
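A minimal Python sketch of this naming convention follows. The base URL is taken from the example above; the functions article_uri and subject_uri are hypothetical, and plain percent-encoding is used here as an assumption, since the exact re-encoding that SMW applies is not described in this section.

```python
from urllib.parse import quote

# Base URL taken from the example above.
BASE = "http://wiki.ontoworld.org/index.php/"

def _encode_title(title):
    # MediaWiki writes spaces in titles as underscores; the percent-encoding
    # of the remaining characters is an assumption, not SMW's actual scheme.
    return quote(title.replace(" ", "_"), safe="")

def article_uri(title):
    """URI for the wiki page itself."""
    return BASE + _encode_title(title)

def subject_uri(title):
    """URI for the thing the page is about: underscore prefixed to the title."""
    return BASE + "_" + _encode_title(title)

print(article_uri("San Diego"))
print(subject_uri("San Diego"))
```

With two distinct URIs, a statement such as a population can be attached to the subject URI while a statement such as the last author is attached to the article URI.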


Ontology Import

(Experimental feature in latest CVS.)

Futures

Specifying additional semantic information

The RDF Export from SMW uses some rdfs and owl properties to represent the relations and attribute values in SMW. Note that you cannot add additional semantic properties to it by putting rdfs and owl relations into Relation: pages; these are turned into SMW <relation:> properties in RDF Export.

In an e-mail to semediawiki-user on 2006-04-24, Markus Krötzsch commented:

We do not really want literal owl-statements in the wiki (especially since we have to provide internationalized versions anyway).
The idea is that you can say "[[equivalent URI:=http://whatever.you.like]]" to get owl:sameas in the RDF export. For classes, you get owl:equivalentclass and for properties owl:equivalentproperty. This feature is not yet intended for large public wikis, since it allows people to mess up the RDF export with very local edits (e.g. on the page of "Relation:Is capital of" one could state "[[equivalent URI:=http://www.w3.org/2000/01/rdf-schema#subClassOf]]")
The function is intended to provide a way of mapping imported ontologies to their original URIs, not for substituting internal relations for external URIs in the export. It might be switched off in some wikis.
We have a mechanism of adding special relations/attributes with internationalized names, and we will support more language features of owl and other specs in the future. E.g. there already is a relation "is subrelation of" which is recognized as a special property (in English wikis). But it is not considered during RDF export yet. Also, it will be ignored on pages that are not exported as owl properties.
Our general idea is to allow wiki hosters to enable or disable certain expressive features selectively. Instead of having one mechanism which allows you to substitute any internal URI by an external URI, we support some language features explicitly (domain and range might be further candidates). It will then be possible to switch off anything that is considered inappropriate on some site.
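The "equivalent URI" mapping described in the message above can be sketched in Python. The page kinds ("individual", "class", "property"), the function export_equivalence, and the enabled switch are hypothetical names used only for illustration; only the three OWL properties and the idea of a per-wiki switch come from the message.

```python
OWL_NS = "http://www.w3.org/2002/07/owl#"

# Which OWL property an "equivalent URI" statement becomes, by kind of page.
# The kind labels are assumptions made for this sketch.
EQUIVALENCE_PROPERTY = {
    "individual": OWL_NS + "sameAs",
    "class": OWL_NS + "equivalentClass",
    "property": OWL_NS + "equivalentProperty",
}

def export_equivalence(internal_uri, kind, external_uri, enabled=True):
    """Return a (subject, property, object) triple, or None if nothing is exported."""
    if not enabled:
        # A wiki host may switch the feature off entirely.
        return None
    owl_property = EQUIVALENCE_PROPERTY.get(kind)
    if owl_property is None:
        return None
    return (internal_uri, owl_property, external_uri)

triple = export_equivalence(
    "http://wiki.ontoworld.org/index.php/_San_Diego",
    "individual",
    "http://whatever.you.like",
)
print(triple)
```

Because the OWL property is chosen from a fixed table rather than taken from the page text, an editor can only assert an equivalence, not substitute an arbitrary external property for an internal one.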


Project Semantic MediaWiki.
This article is associated with the project Semantic MediaWiki.