WikiText Transfer Protocol
From Meta
- This is a draft. Assume that all details in here are going to change. The author, Astronouth7303, encourages people to revise it and to update as needed.
WikiText Transfer Protocol (WTTP) is HTTP with some special headers and MIME types. It uses HTTP structures instead of content when reasonable (e.g., custom headers for light meta-data). The goal is to create a simple yet expandable machine interface to wikis.
WTTP is activated by the presence of the MIME type "text/x-wiki" in the "Accept" HTTP header. Almost all content, even meta-data, is transmitted as wikitext. Otherwise, it behaves exactly like ordinary HTTP.
Note that when using WTTP, the "section" URL argument must always be honored.
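As a rough illustration of the activation rule above, the following Python sketch builds (but does not send) a request that asks for wikitext. The host name and the helper name build_wttp_get are hypothetical; only the "text/x-wiki" MIME type and the "section" argument come from this draft.

```python
import urllib.request

WIKI_MIME = "text/x-wiki"  # MIME type named by this draft


def build_wttp_get(url: str) -> urllib.request.Request:
    """Build (but do not send) a GET request that asks the server for wikitext."""
    return urllib.request.Request(url, headers={"Accept": WIKI_MIME}, method="GET")


# Hypothetical wiki URL; "section" is passed as an ordinary URL argument.
req = build_wttp_get("http://wiki.example.org/index.php?title=Main_Page&section=2")
print(req.get_method(), req.get_header("Accept"))
```

Passing the request to urllib.request.urlopen would send it; a server that does not implement WTTP would presumably answer with ordinary HTML.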
Abstract
The purpose of WTTP is to give applications ("clients") access to data held by MediaWiki without digging through HTML. WTTP tries to move most of the processing off the server and onto the client, without opening security holes. Every attempt is made to be consistent without being awkward.
This means that the client does all the processing of the wikitext. The client parses and displays it. It also generates differences between versions, making whatever requests to the server it needs to do so.
Methods
- GET
- Gets the requested article
- PUT
- Saves the article using the data from the HTTP header (x-wiki-comment, x-wiki-minor, etc.) and the request data. [PHP Note: use 'php://input' to read input data.]
- POST
- Same as PUT, except that the data is URL encoded. This is deprecated. Almost identical to a regular edit.
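A minimal sketch of how a client might assemble a PUT save, assuming the edit meta-data travels in the x-wiki-comment and x-wiki-minor headers and the new wikitext is the request body. The "yes"/"no" value for x-wiki-minor, the UTF-8 body encoding, the URL, and the helper name are assumptions, not part of the draft.

```python
import urllib.request


def build_wttp_put(url: str, wikitext: str, comment: str, minor: bool = False) -> urllib.request.Request:
    """Build (but do not send) a PUT request that saves an article."""
    headers = {
        "Accept": "text/x-wiki",
        "Content-Type": "text/x-wiki",
        "x-wiki-comment": comment,
        # Assumption: booleans in headers use the same yes/no form as meta-data tables.
        "x-wiki-minor": "yes" if minor else "no",
    }
    # Assumption: the body is sent as UTF-8 encoded wikitext.
    return urllib.request.Request(url, data=wikitext.encode("utf-8"), headers=headers, method="PUT")


put_req = build_wttp_put(
    "http://wiki.example.org/index.php?title=Sandbox",  # hypothetical URL
    "Hello, ''world''.",
    comment="testing",
    minor=True,
)
print(put_req.get_method(), put_req.get_header("X-wiki-minor"))
```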
WTTP Headers
Many HTTP/1.1 headers are accepted and used. This is an outline of the WTTP-specific (x-wiki-*) headers and some of the HTTP headers used. Note that this list of headers is likely to expand over time as needed by the software or by users.
- x-wiki-minor
- Is edit minor? PUT request only.
- x-wiki-comment
- comment of edit. PUT request only.
- x-wiki-id
- Article ID of article/version. (useful if the request was for an article title)
- x-wiki-title
- Title of article ID. (useful if the request was for an article ID.) Any foreign or Unicode characters should be URL encoded (%#### format). This means that percent-signs must be URL encoded (though a wiki should avoid using them in titles).
- x-wiki-media-url
- The URL of the image/media associated with an Image: or Media: page.
- x-wiki-media-type
- The MIME type of the image/media associated with an Image: or Media: page.
- x-wiki-copyright
- The copyright info of an upload.
- age
- In PUT, indicates the age of client's copy. Otherwise it indicates the age of the sent article/section.
- last-modified
- In PUT, indicates the time at which the client last got the page/section. Otherwise it indicates the date of last modification of the sent article/section. This is preferred over age, so it will be used if present.
- X-Powered-By
- The version of MediaWiki running. Reply only. (probably "MediaWiki/1.5")
- Content-Type
- Almost always "text/x-wiki".
- Content-Language
- The language code of the page. This is mostly for informational purposes.
- Content-Encoding
- The encoding of the text.
Note: If a client receives a page and then receives sections, such that different sections of the client's copy have different ages, the age of the page is that of the oldest section, or of the page if not all of it has been received.
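The rule in the note above could be computed on the client roughly as follows. This is only a sketch: the function name, the use of seconds, and the convention that None marks a section that has not been fetched again since the whole page arrived are all assumptions.

```python
from typing import Optional, Sequence


def effective_page_age(page_age: int, section_ages: Sequence[Optional[int]]) -> int:
    """Return the age (in seconds) to report for the whole page.

    section_ages[i] is the age of section i if it was received separately,
    or None if the client still holds the copy that came with the full page.
    """
    if not section_ages or any(age is None for age in section_ages):
        # Not every section has been received again: fall back to the page's age.
        return page_age
    # Every section was received separately: the oldest one decides.
    return max(section_ages)


print(effective_page_age(600, [30, 120, 45]))
print(effective_page_age(600, [30, None, 45]))
```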
Status codes
Instead of sending a page with a notice, WTTP will send the appropriate HTTP status code. Here are the basics:
- 200 OK
- Page sent correctly. If the Location header contains a different URL than the one requested, that means a redirect has been handled server side, or that the title was not sent in the preferred MediaWiki form. Check x-wiki-title.
- 201 Created
- After an edit is submitted, this is sent if the article didn't exist before. Everything is OK.
- 301 Moved Permanently
- This should only be supported by servers, as it is not used by WTTP proper, and is reserved in case a wiki moves (ie, its root changes).
- 307 Temporary Redirect
- The article is a redirect. The new article is in the URL in the Location header. The body is the text of the requested article. A 302 Moved Temporarily or 302 Found status may be used instead.
- 400 Bad Request
- Self-explanatory
- 401 Unauthorized
- If a client receives this, it can mean several things:
- If the request was an edit, it means that the user is not allowed to edit the article.
- If the request was a view, then the user is not allowed to view the article.
- 404 Not Found
- The requested article hasn't been created, or has been deleted.
- 409 Conflict
- Indicates an edit conflict has occurred. See Edit Conflicts below.
- 410 Gone
- The page has been deleted (as opposed to not created). Only sent if the user is a sysop, else 404 Not Found is used.
- 500 Internal Server Error
- Self-explanatory
- 501 Not Implemented
- The requested action is not supported. This should be used if the request is parsable, but the values indicate a feature that is not supported (eg, an unknown "action" value).
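One way a client might turn the status codes listed above into internal outcomes is a simple lookup, sketched here in Python. The outcome labels and the function name are made up for illustration; only the code numbers and their meanings come from the list.

```python
# Hypothetical outcome labels for the status codes described in this draft.
_OUTCOMES = {
    200: "ok",
    201: "created",
    301: "wiki-moved",
    302: "redirect",   # may be sent instead of 307
    307: "redirect",
    400: "bad-request",
    404: "missing",
    409: "edit-conflict",
    410: "deleted",
    500: "server-error",
    501: "unsupported",
}


def interpret_status(code: int, is_edit: bool = False) -> str:
    """Map an HTTP status code from a WTTP reply to a client-side outcome label."""
    if code == 401:
        # 401 means different things for edits and views.
        return "edit-forbidden" if is_edit else "view-forbidden"
    return _OUTCOMES.get(code, "unknown")


print(interpret_status(401, is_edit=True), interpret_status(307), interpret_status(418))
```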
Edit Conflicts
If a WTTP-enabled server sends a 409 Conflict HTTP/1.1 response code, then an edit conflict occurred. The x-wiki-id header contains the ID of the new version, and the body contains the new version of the text of the section that the client wishes to edit, so that the client may merge it.
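A sketch of how a client might pull the pieces it needs for a merge out of a 409 reply. The EditConflict type and read_conflict helper are hypothetical, the ID is kept as a string because the draft does not fix its type, and decoding the body as UTF-8 is an assumption.

```python
from typing import Mapping, NamedTuple, Optional


class EditConflict(NamedTuple):
    new_id: str    # value of the x-wiki-id header
    new_text: str  # current server-side text of the section being edited


def read_conflict(status: int, headers: Mapping[str, str], body: bytes) -> Optional[EditConflict]:
    """Return the conflict details for a 409 reply, or None for any other status."""
    if status != 409:
        return None
    # HTTP header names are case-insensitive, so compare in lower case.
    lowered = {name.lower(): value for name, value in headers.items()}
    # Assumption: the body is UTF-8 encoded wikitext.
    return EditConflict(lowered.get("x-wiki-id", ""), body.decode("utf-8"))


conflict = read_conflict(409, {"X-Wiki-Id": "43"}, b"== Intro ==\nNewer text.")
print(conflict)
```

The merge itself is left to the client, as the Abstract suggests.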
Special Lists
There are several extras that the client needs to know to properly render text and interface with the wiki.
- Todo: Figure out how to request a list.
- __INTERWIKI
- a list of interwiki links in meta-data form. Needed in case of custom InterWiki links.
- __EXTENSIONS
- a list of registered XML-tag extensions (ie, <math>T_E\X</math>)
- __CONFIG
- some safe configuration options. (eg, license, CreativeCommons enabled, Dublin Core enabled, custom namespaces, site name, version, site's interwiki, uploads enabled, uploads require copyright details, if links are case sensitive, timezone, etc.)
Meta-data
Meta-data is formatted in the following manner:
When a client's request generates a response that is meta-data rather than page text, it is formatted as a simple pipe-syntax table. The first row is always field names, and the following rows are the data. The field titles are set off as headers ("!" instead of "|"). This format is followed even if there is only one value for all fields.
A table ID may be placed in what would be the caption.
Only the basics of the pipe syntax are used. Each line will contain only one value and no formatting is used, though clients should be prepared to filter out table-wide attributes.
Simple example:
{| filter this out. it contains no information useful to the client
|+ Some table
|-
!field 1
!field 2
!field 3
!field 42
!useful
|-
|foo
|bar
|baz
|Qux
|no
|-
|0b
|1b
|10b
|11b
|no
|-
|spam
|eggs
|needle
|haystack
|no
|}
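A minimal Python sketch of a client-side parser for tables like the one above, assuming exactly the restricted pipe syntax described here (one value per line, "!" for field names, "|-" between rows). The function name and return shape are illustrative choices, not part of the draft.

```python
from typing import Dict, List, Optional, Tuple


def parse_meta_table(text: str) -> Tuple[Optional[str], List[Dict[str, str]]]:
    """Return (table ID from the caption, list of rows keyed by field name)."""
    caption: Optional[str] = None
    fields: List[str] = []
    rows: List[List[str]] = []
    current: List[str] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("{|") or line.startswith("|}"):
            continue  # table start (with attributes to filter out) or table end
        if line.startswith("|+"):
            caption = line[2:].strip()
        elif line.startswith("|-"):
            if current:  # a row separator closes the row collected so far
                rows.append(current)
            current = []
        elif line.startswith("!"):
            fields.append(line[1:].strip())
        elif line.startswith("|"):
            current.append(line[1:].strip())
    if current:
        rows.append(current)
    return caption, [dict(zip(fields, row)) for row in rows]


SAMPLE = """{| filter this out
|+ Some table
|-
!field 1
!useful
|-
|foo
|no
|-
|spam
|no
|}"""

table_id, records = parse_meta_table(SAMPLE)
print(table_id, records)
```

Returning each row as a dictionary keyed by field name means the client never depends on column order.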
Example (a history listing):
{|
|-
!date
!id
!user
!comment
!minor
|-
|Mon, 01 Jan 2000 07:05:03 GMT
|1
|a_user
|just some edit
|no
|-
|Tue, 02 Jan 2002 05:06:04 GMT
|13
|joe_bob
|didn't like a_user's edit
|no
|-
|Mon, 24 Jan 2004 02:42:42 GMT
|42
|foobar
|joe_bob didn't link anything, now fixed.
|yes
|}
While this doesn't represent the best of editing practices, it does show how the data is laid out. (If you are unsure, think of a database table.)
This example is the meta-data of the page represented in the previous example:
{|
|-
!date
!id
!title
!title-url
!max-section
!author-list
|-
|Mon, 24 Jan 2004 02:42:42 GMT
|42
|Main Page
|Main_Page
|0
|a_user, joe_bob, foobar
|}
Note that the field names in these examples are just examples. They are not necessarily the final names.
Specifics
- The ID (caption) of a table is optional. They are there so that in the future, if there are queries that would return multiple tables, they can be differentiated.
- Unless otherwise noted, none of the fields should be omitted. If they are, the client should make assumptions about the data based on the query sent.
- Field names and table IDs should be in a specific case, but clients should not be case sensitive. (Meaning that fields should be in the case that is given, but clients should be ready to deal with erroneous field names in the wrong case.)
- Boolean values are represented by "yes" for true and "no" for false.
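The last two points could be handled on the client roughly as follows. This is a sketch under the assumption that a row has already been parsed into a mapping of field name to text; the helper names are hypothetical.

```python
from typing import Dict


def normalize_fields(row: Dict[str, str]) -> Dict[str, str]:
    """Lower-case field names so lookups do not depend on the server's casing."""
    return {name.strip().lower(): value for name, value in row.items()}


def parse_bool(value: str) -> bool:
    """Convert the yes/no form used in meta-data tables to a Python bool."""
    lowered = value.strip().lower()
    if lowered == "yes":
        return True
    if lowered == "no":
        return False
    raise ValueError(f"not a WTTP boolean: {value!r}")


row = normalize_fields({"Title": "Main Page", "MINOR": "yes"})
print(row["title"], parse_bool(row["minor"]))
```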
Below are the definitions of the meta-data tables. Ellipses represent where the rows are.
Recent Changes
Returned by Special:Recentchanges.
{|
|+ Recent Changes
|-
!ID
!Title
!User
!Date
!Comment
!Minor
|-
...
|}
- ID
- The ID of the edit
- Title
- The title of the page edited
- User
- The name of the user who edited the page
- Date
- The date of the edit, in RFC 822 form; that is, date('r') form.
- Comment
- The comment of the edit
- Minor
- Whether the edit was minor
- Each entry represents exactly one edit
- It is up to the client to create 'enhanced' recent changes
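Since each entry is exactly one edit, a client wanting an 'enhanced' view has to group the rows itself. The sketch below groups edits by page title while keeping the order in which the server sent them; the function name and the sample rows are invented for illustration, and the rows are assumed to be already parsed into dictionaries.

```python
from typing import Dict, List


def group_changes(rows: List[Dict[str, str]]) -> Dict[str, List[Dict[str, str]]]:
    """Group single-edit rows by page title, keeping the server's order."""
    grouped: Dict[str, List[Dict[str, str]]] = {}
    for row in rows:
        grouped.setdefault(row["Title"], []).append(row)
    return grouped


changes = [
    {"ID": "44", "Title": "Main Page", "User": "foobar", "Minor": "yes"},
    {"ID": "43", "Title": "Sandbox", "User": "joe_bob", "Minor": "no"},
    {"ID": "42", "Title": "Main Page", "User": "a_user", "Minor": "no"},
]
grouped = group_changes(changes)
print({title: [edit["ID"] for edit in edits] for title, edits in grouped.items()})
```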
Find Results
Returned by Special:Search. Each row contains one page that matches the search terms.
{|
|+ Find Results
|-
!ID
!OldID
!Title
!NumBody
!Relavence
|-
...
|}
- ID
- The ID of the edit.
- OldID
- The ID of the previous edit.
- Title
- The title of the page.
- NumBody
- The number of times it appears in the body.
- Relavence
- A server-generated relevance rating. May be omitted based on the capabilities of the server and software. Lower numbers are better; 0 is the best, and is reserved for pages whose title exactly matches the search terms. (This is so that there is no upper limit.)
- Each entry in the table represents exactly one page which matches
- It is up to the client to highlight the page (Should this change?)
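A client could order results by the rating described above, bearing in mind that the field may be missing. The sketch below uses the field name exactly as it is spelled in the table definition; treating the rating as a number parseable by float and putting unrated pages last are assumptions.

```python
from typing import Dict, List, Tuple


def sort_results(rows: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Sort search results best-first: lower rating is better, unrated rows go last."""
    def key(row: Dict[str, str]) -> Tuple[int, float]:
        raw = row.get("Relavence", "")  # field name as spelled in this draft
        try:
            return (0, float(raw))
        except ValueError:
            return (1, 0.0)  # rating omitted or unreadable
    return sorted(rows, key=key)


results = [
    {"Title": "Wiki markup", "Relavence": "3"},
    {"Title": "Unrated page"},
    {"Title": "Wiki", "Relavence": "0"},
]
print([row["Title"] for row in sort_results(results)])
```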
Page Info
Returned by the info action. (That is, [[Title?action=info]].) Each row contains the info of one page. Currently, there should only be one row. If there is more than one, the client should use the first one.
{|
|+ Page Info
|-
!ID
!Title
!NumSections
!Editor
!EditTime
!Editors
!Creator
!CreateTime
|-
...
|}
- ID
- The ID of the page.
- Title
- The title of the page.
- NumSections
- The number of sections (optional).
- Editor
- The name of the last user to edit the page. Editors may be used instead of this.
- Editors
- A comma-separated list of the names of the last X users to edit the page, most recent first, where X is the configuration setting to display editors. Editor may be used instead of this.
- EditTime
- The date/time the page was last edited.
- Creator
- The name of the user that created the page.
- CreateTime
- The date/time the page was created.
User Options
Returned by Special:Preferences. If the user is not logged in, returns the default options.
- Todo: Make list of configuration options.
Watchlist
Returned by Special:Watchlist. Identical in form to Recent Changes.
Contributions
Returned by Special:Contributions. Identical in form to Recent Changes.
Navigation Bar
A list of links along the left (by default) side of the Monobook skin. The order is the same as it appears. So the link closest to the top ("Main Page") would appear first in the table.
{|
|+ Navigation Bar
|-
!Title
|-
...
|}
- Title
- The title of the page linked.
Links
Returned by Special:Whatlinkshere. Identical in form to Navigation Bar.
Interwiki
Contains all Interwiki prefixes used by the wiki.
{|
|+ Interwiki
|-
!Prefix
!URL
!Local
|-
...
|}
- Prefix
- The name used inside of links (eg, "WikiPedia", "MetaWikiPedia", "Wiktionary", "FIRSTwiki", etc.)
- URL
- The URL used by the parser. (eg, "http://en.wikipedia.org/wiki/$1", "http://meta.wikipedia.org/wiki/$1", "http://en.wiktionary.org/wiki/$1", "http://firstwiki.org/index.php/$1")
- Local
- A boolean value indicating whether the link is treated as a local.
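To show how a client might use this table, the sketch below expands a prefixed link into a URL by substituting the page title for "$1". Matching prefixes case-insensitively, turning spaces into underscores, and percent-encoding the title are assumptions about parser behaviour; the function name is hypothetical.

```python
from typing import Dict, List, Optional
from urllib.parse import quote


def expand_interwiki(link: str, table: List[Dict[str, str]]) -> Optional[str]:
    """Return the target URL for a link such as 'Wiktionary:wiki', or None if unknown."""
    prefix, colon, title = link.partition(":")
    if not colon:
        return None  # no prefix at all
    for row in table:
        # Assumption: prefixes are matched without regard to case.
        if row["Prefix"].lower() == prefix.lower():
            # Assumption: spaces become underscores, then the title is percent-encoded.
            return row["URL"].replace("$1", quote(title.replace(" ", "_")))
    return None


interwiki = [
    {"Prefix": "Wiktionary", "URL": "http://en.wiktionary.org/wiki/$1", "Local": "no"},
]
print(expand_interwiki("wiktionary:Main Page", interwiki))
```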
Extensions
Contains all XML-style syntax extensions.
{|
|+ Extensions
|-
!Name
|-
...
|}
- Name
- The name used within the tag. (eg, "math")
Tables that need to be defined
Site Config
This contains the configuration of the wiki, as mentioned above.
- Todo: Make list of configuration options.
User info
Statistics about a user.
- Todo: Make list of fields.
New Special pages
There are several new special pages needed for this to work.
- Render
- Returns the given wikitext fragment rendered in HTML as type text/html. It is incapable of returning text/x-wiki. It is used to render extensions and such.
- Config
- returns a list of the configuration options mentioned above.
- Interwiki
- returns a list of Interwiki links supported by the server.
- Extensions
- returns a list of the XML-style extensions mentioned above.
Future Expansion
Currently, there are two ways to expand the protocol:
- The WTTP headers
- The meta-data fields
WTTP Headers
As needs grow, more headers should be added in the form of x-wiki-*, eg: x-wiki-categories, x-wiki-linked-from, etc.
Meta-data fields
A client should look for fields it understands and extract data in that column. An order should not be assumed. The presence of certain headers should not be assumed, unless specified otherwise. This way, we can add and rearrange columns as needed.

