REST

From Meta, a Wikimedia project coordination wiki

This is a proposal to implement a lightweight HTTP-based API for use by scripts and bots. It is partially based on the needs discussed in bugzilla:208. At the moment, this is basically a personal brainstorm by Duesentrieb 16:21, 13 October 2005 (UTC).

Please note that a similar interface is already in production: Query API

Basic Interface

HTTP request parameters (URL query arguments) are used to query specific data from MediaWiki. Such queries are usually directed at a specific wiki page. Some functionality may be implemented using a standalone special page. Internally, most of the functions suggested below should be implemented as extensions, using hooks.

  • Use the action parameter to specify what aspect of a page should be shown (history, undelete, ...). This is already the case.
  • Use the gen parameter to specify the format of the result. A better name might be format or output. An example would be requesting pages that mainly contain a list in CSV format (see bugzilla:3676). - HTTP already provides for this with the Accept header. No need for a parameter in the URL. -- Jim


Note that sometimes, the action will imply the format (for example, action=raw implies wiki-text format). An example to the contrary would be action=history, which normally shows HTML, but could also be used with gen=csv.
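The interplay between action, gen, and the Accept header can be made concrete with a small resolver. This is only a sketch in Python: the tables FIXED_FORMAT, SUPPORTED, and MIME_TO_FORMAT and the function resolve_format are hypothetical, and the precedence used here (format implied by the action, then gen, then Accept, then HTML) is an assumption, since the proposal does not settle it. Quality values (q=...) in the Accept header are ignored for brevity.

```python
# Hypothetical tables; the proposal does not define them.
FIXED_FORMAT = {"raw": "wikitext"}            # the action alone implies the format
SUPPORTED = {"history": {"html", "csv"}}      # formats an action could produce
MIME_TO_FORMAT = {"text/html": "html", "text/csv": "csv"}


def resolve_format(action, gen=None, accept=None):
    """Pick an output format for a request (assumed precedence, see lead-in)."""
    if action in FIXED_FORMAT:
        return FIXED_FORMAT[action]
    supported = SUPPORTED.get(action, {"html"})
    if gen in supported:
        return gen
    # Fall back to the first media type in the Accept header that we can serve.
    for item in (accept or "").split(","):
        mime = item.split(";")[0].strip().lower()
        fmt = MIME_TO_FORMAT.get(mime)
        if fmt in supported:
            return fmt
    return "html"
```

With these tables, action=raw always yields wiki-text regardless of gen, while action=history yields HTML unless gen=csv or an Accept header of text/csv asks otherwise.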

The ask Action

  • Use action=ask to ask specific questions about a page.
  • Use the what parameter to specify a set of questions as a comma-separated list. Each question is identified by a single short name. This is not intended for complex queries.
  • MediaWiki will respond with a plain text page containing one line for every question asked. Each line begins with the identifier of the question, followed by a colon (":"), followed by a comma-separated list of values, each of which is URL-encoded; a single value is handled as a one-value list. Lines are terminated by CR-LF (\r\n). The order of the response lines is unspecified.
  • The MIME type of the response should be text/x-wiki-ask.
  • On the server side, a way of mapping the requested properties to functions of $wgArticle, $wgTitle, etc. is to be devised. If a property is unknown, the return value for that question will be empty (or maybe it should be something special, like *, which can never normally occur in a URL-encoded value?). Note that some pages (like image pages, user pages, etc.) may have additional properties.
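The server-side mapping described above could be sketched as a lookup table from question names to functions. The sketch is in Python rather than MediaWiki's PHP, and the PROPERTIES table, the page dictionary, and answer_questions are hypothetical names chosen for illustration. It returns an empty value for unknown properties, which is only one of the two options considered above.

```python
from urllib.parse import quote

# Hypothetical property table: question name -> function of a page record.
# In MediaWiki these would map to functions of $wgArticle, $wgTitle, etc.
PROPERTIES = {
    "location": lambda page: page["location"],
    "protected": lambda page: "yes" if page["protected"] else "no",
    "templates": lambda page: page["templates"],
}


def answer_questions(page, what):
    """Build a text/x-wiki-ask response body for a comma-separated question list."""
    lines = []
    for name in what.split(","):
        getter = PROPERTIES.get(name)
        if getter is None:
            values = []  # unknown property: empty answer
        else:
            result = getter(page)
            # A single value is handled as a one-value list.
            values = result if isinstance(result, list) else [result]
        # URL-encode every value so commas and colons inside values are harmless.
        encoded = ",".join(quote(str(v), safe="") for v in values)
        lines.append(f"{name}:{encoded}")
    # Every line is terminated by CR-LF.
    return "".join(line + "\r\n" for line in lines)
```

Because every value is URL-encoded with no safe characters, a template name containing a comma would be emitted as %2C and cannot be confused with the list separator.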

An Example: the query

...?title=Image:Foo.png&action=ask&what=location,protected,templates,quux

This would return

location:local     (as opposed to shared or missing)
protected:yes
templates:GFDL,DeletionRequest
quux:              (can't answer that)


See bugzilla:3700 for a proof-of-concept patch.
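On the client side, a script or bot would build the query and split the response back into values. The following is a minimal sketch in Python; build_ask_url and parse_ask_response are hypothetical helper names, and the base URL passed in is whatever script path the wiki actually uses. The parser expects the bare response, without the parenthesised annotations shown in the example above.

```python
from urllib.parse import urlencode, unquote


def build_ask_url(base_url, title, questions):
    """Compose an action=ask query; ':' and ',' are left readable in the URL."""
    query = urlencode(
        {"title": title, "action": "ask", "what": ",".join(questions)},
        safe=":,",
    )
    return f"{base_url}?{query}"


def parse_ask_response(body):
    """Turn a text/x-wiki-ask body into a dict of question -> list of values."""
    answers = {}
    for line in body.split("\r\n"):
        if not line:
            continue  # skip the empty piece after the final CR-LF
        # The first colon separates the question name from its values;
        # colons inside values are URL-encoded and therefore cannot appear raw.
        name, _, raw = line.partition(":")
        answers[name] = [unquote(v) for v in raw.split(",")] if raw else []
    return answers
```

An unanswered question such as quux comes back as an empty list, so a caller can distinguish "no values" from a missing line. Since the order of response lines is unspecified, the result is keyed by question name rather than by position.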

Maybe there should also be a way to query all metadata for a page without knowing the properties, for instance by using what=* or some such.

Table Queries

Many special pages (and some actions) return mainly a list or table of items. Those tables should be available in an easy-to-parse format, like CSV or some form of XML, using the gen (or format or output) parameter.

CSV

If CSV is used, the following conventions apply:

  • Lines are terminated by CR-LF (\r\n).
  • Fields are separated by a comma (",").
  • Each field is URL-encoded. Technically, this is not part of the CSV encoding, but an additional protocol layer on top of it. URL-encoding has the advantage of allowing the fields to be split at the comma reliably, without having to parse regular CSV quoting. It also makes the result immune to scripting attacks.
  • The first line of the result contains the names of the columns (i.e., the CSV always contains a header line). The header is optional if the list is empty (i.e., an empty document is allowed).
  • The MIME type should be either text/plain (nicer if viewed by humans in a browser) or text/csv (proposed standard). The charset should be given as us-ascii (which is always OK because of URL-encoding).
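The conventions above are simple enough to implement without a CSV library. The following Python sketch shows one way to write and read such a table; encode_table and decode_table are hypothetical names, not part of any proposed patch.

```python
from urllib.parse import quote, unquote


def encode_table(columns, rows):
    """Serialise a header plus rows as URL-encoded, comma-separated, CR-LF lines."""
    lines = [columns] + rows
    return "".join(
        ",".join(quote(str(field), safe="") for field in line) + "\r\n"
        for line in lines
    )


def decode_table(text):
    """Return (columns, rows); an empty document yields ([], [])."""
    lines = text.split("\r\n")
    if lines and lines[-1] == "":
        lines.pop()  # drop the empty piece after the final CR-LF
    if not lines:
        return [], []
    # Splitting at ',' is safe because commas inside fields are encoded as %2C.
    table = [[unquote(field) for field in line.split(",")] for line in lines]
    return table[0], table[1:]
```

Since every byte outside the unreserved set is percent-encoded, the output is pure US-ASCII even when titles contain non-ASCII characters, and markup such as angle brackets never appears literally in the response.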

Compare RFC 4180, the proposed standard for CSV. See bugzilla:3676 for a proposed patch that would add CSV support to many special pages instantly.

Complex Queries

Complex queries and responses should use stand-alone special pages or special actions, along with task-specific request parameters. Such queries are not in the scope of this REST specification, but should not interfere with it. Here are some pages to look at:

  • RDF - full RDF metadata support.

Event Notification System

A system for notifying external listeners of events should be specified, but is not in the scope of the REST spec. For a start, look at bugzilla:3670 for a proposal to make RC notification more flexible. Jabber-based notification would be a possibility, too.

For high-traffic wikis, notifications should be passed through a buffered channel (FIFO, UDP) to a daemon script that is responsible for the outside communication (IRC, Jabber, etc.). This way, the overhead of connecting and logging in for every event can be avoided.
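The UDP variant of this hand-off might look like the following Python sketch. The wire format (one datagram per event, fields URL-encoded and comma-separated, as in the CSV conventions) is an assumption made for illustration, as are the function names; the proposal does not define either. The daemon side would keep its IRC or Jabber connection open and forward each event it receives.

```python
import socket
from urllib.parse import quote, unquote


def encode_event(fields):
    """Assumed wire format: URL-encoded fields joined by commas, ASCII only."""
    return ",".join(quote(f, safe="") for f in fields).encode("ascii")


def decode_event(datagram):
    return [unquote(f) for f in datagram.decode("ascii").split(",")]


def send_event(address, fields):
    """Wiki side: fire-and-forget, one datagram per event, no login or handshake."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(encode_event(fields), address)


def open_listener(host="127.0.0.1", port=0):
    """Daemon side: bind a UDP socket (port 0 lets the OS pick a free port)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def receive_event(sock, timeout=2.0):
    """Daemon side: wait for the next event and decode it."""
    sock.settimeout(timeout)
    datagram, _sender = sock.recvfrom(65535)
    return decode_event(datagram)
```

UDP gives no delivery guarantee, so this trades reliability for keeping the page-save path cheap; a FIFO would be the alternative when the daemon runs on the same host.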

See Also

  • A similar initiative to bulk-request data and/or metadata from MediaWiki is here.