Page metadata


The following page and revision information is available in XML format in the XML/SQL dumps. For the XML schema, see the appropriate version of the export XSD.

Page metadata

  • Title of the page
  • Namespace of the page
  • Page id (for old versions these are shown in the URLs of page history links)
  • If the page is a redirect, the title of the redirect target

The entire page table dump is also available; see Database tables for more information about it.
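As an illustration, the page-level fields listed above can be read with Python's standard XML parser. This is a minimal sketch, not an official tool: the sample document is hand-written and assumes the element names of the export-0.10 schema (`title`, `ns`, `id`, and a `redirect` element with a `title` attribute), and `page_metadata` and `local` are hypothetical helper names. Check the export XSD for the schema version your dump uses.

```python
import xml.etree.ElementTree as ET

# Hand-written sample modeled on the export-0.10 layout (an assumption;
# real dumps may use a different schema version).
SAMPLE = """<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
  <page>
    <title>Example redirect</title>
    <ns>0</ns>
    <id>42</id>
    <redirect title="Example target" />
  </page>
  <page>
    <title>Talk:Example</title>
    <ns>1</ns>
    <id>43</id>
  </page>
</mediawiki>"""


def local(tag):
    """Strip the '{namespace}' prefix so the schema version does not matter."""
    return tag.rsplit("}", 1)[-1]


def page_metadata(xml_text):
    """Return one dict per <page>: title, namespace, page id, redirect target."""
    pages = []
    root = ET.fromstring(xml_text)
    for page in root:
        if local(page.tag) != "page":
            continue
        meta = {"title": None, "ns": None, "id": None, "redirect": None}
        # Only direct children are inspected, so a revision's <id> is never
        # mistaken for the page id.
        for child in page:
            name = local(child.tag)
            if name == "title":
                meta["title"] = child.text
            elif name == "ns":
                meta["ns"] = int(child.text)
            elif name == "id":
                meta["id"] = int(child.text)
            elif name == "redirect":
                meta["redirect"] = child.get("title")
        pages.append(meta)
    return pages


pages = page_metadata(SAMPLE)
for p in pages:
    print(p)
```

A page that is not a redirect simply has no `redirect` element, so its entry keeps `None` for that field.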

Revision metadata

  • Id of the revision
  • Whether the edit was marked as minor by the editor
  • Date and time the edit was made
  • Username and user id, or IP address of the editor
  • Comment left by the editor when the edit was saved
  • Length in bytes of the revision content
  • SHA-1 hash of the revision content
  • Revision id of the previous (parent) revision
  • Content model of the revision (for example, wikitext or JSON)
  • Content format of the revision

Additionally, the id of the related entry in the text table is provided.
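The revision fields above could be collected along the following lines. This is a sketch under stated assumptions: the sample is hand-written in the style of an export-0.10 stub dump, where the `text` element carries `id` and `bytes` attributes and the hash sits in a separate `sha1` element; other schema versions place some of these differently. The hash values are placeholders, and `revision_metadata` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Namespace of the assumed schema version; adjust it to match your dump.
NS = {"mw": "http://www.mediawiki.org/xml/export-0.10/"}

# Hand-written sample; the sha1 values are placeholders, not real hashes.
SAMPLE = """<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
  <page>
    <title>Example</title>
    <ns>0</ns>
    <id>42</id>
    <revision>
      <id>1001</id>
      <parentid>1000</parentid>
      <timestamp>2020-01-02T03:04:05Z</timestamp>
      <contributor>
        <username>ExampleUser</username>
        <id>7</id>
      </contributor>
      <minor />
      <comment>fix typo</comment>
      <model>wikitext</model>
      <format>text/x-wiki</format>
      <text id="5001" bytes="1234" />
      <sha1>placeholder-hash-1</sha1>
    </revision>
    <revision>
      <id>1002</id>
      <parentid>1001</parentid>
      <timestamp>2020-01-03T00:00:00Z</timestamp>
      <contributor>
        <ip>192.0.2.1</ip>
      </contributor>
      <model>wikitext</model>
      <format>text/x-wiki</format>
      <text id="5002" bytes="1240" />
      <sha1>placeholder-hash-2</sha1>
    </revision>
  </page>
</mediawiki>"""


def _int_or_none(value):
    """Convert a numeric string to int, passing None through unchanged."""
    return int(value) if value is not None else None


def revision_metadata(xml_text):
    """Return one dict of metadata per <revision> element."""
    root = ET.fromstring(xml_text)
    revisions = []
    for rev in root.iterfind("mw:page/mw:revision", NS):
        text = rev.find("mw:text", NS)
        username = rev.findtext("mw:contributor/mw:username", namespaces=NS)
        ip = rev.findtext("mw:contributor/mw:ip", namespaces=NS)
        revisions.append({
            "id": int(rev.findtext("mw:id", namespaces=NS)),
            "parent_id": _int_or_none(rev.findtext("mw:parentid", namespaces=NS)),
            "timestamp": rev.findtext("mw:timestamp", namespaces=NS),
            # Registered editors have a username and id; anonymous ones an IP.
            "editor": username if username is not None else ip,
            "user_id": _int_or_none(rev.findtext("mw:contributor/mw:id", namespaces=NS)),
            # <minor/> is an empty flag element: present means a minor edit.
            "minor": rev.find("mw:minor", NS) is not None,
            "comment": rev.findtext("mw:comment", namespaces=NS),
            "model": rev.findtext("mw:model", namespaces=NS),
            "format": rev.findtext("mw:format", namespaces=NS),
            "sha1": rev.findtext("mw:sha1", namespaces=NS),
            "bytes": _int_or_none(text.get("bytes")) if text is not None else None,
            "text_id": _int_or_none(text.get("id")) if text is not None else None,
        })
    return revisions


revs = revision_metadata(SAMPLE)
for r in revs:
    print(r["id"], r["editor"], r["minor"], r["bytes"])
```

Optional fields such as the comment or the parent revision id come back as `None` when the element is absent, which keeps the first revision of a page and edits without a summary from raising errors.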

Content

In content dumps, almost all of the same metadata is provided, and the full content of included revisions is also written.
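Because content dumps can be very large, a streaming parser is usually a better fit than loading the whole file. The sketch below assumes the revision content sits in the body of the `text` element, as in the export-0.10 layout. It reads from an in-memory sample; for a real dump the stream would be an opened (and typically decompressed) file. `iter_revision_text` is a hypothetical helper name.

```python
import io
import xml.etree.ElementTree as ET

# Hand-written sample modeled on a content dump (an assumption; check the
# export XSD for the schema version of your dump).
SAMPLE = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
  <page>
    <title>Example</title>
    <ns>0</ns>
    <id>42</id>
    <revision>
      <id>1001</id>
      <timestamp>2020-01-02T03:04:05Z</timestamp>
      <text bytes="13" xml:space="preserve">Hello, world!</text>
    </revision>
  </page>
</mediawiki>"""


def iter_revision_text(stream):
    """Yield (page title, revision id, content) one revision at a time."""
    title = None
    for _event, elem in ET.iterparse(stream, events=("end",)):
        name = elem.tag.rsplit("}", 1)[-1]  # drop the '{namespace}' prefix
        if name == "title":
            title = elem.text
        elif name == "revision":
            rev_id = None
            content = ""
            for child in elem:
                child_name = child.tag.rsplit("}", 1)[-1]
                if child_name == "id":
                    rev_id = int(child.text)
                elif child_name == "text":
                    content = child.text or ""
            yield title, rev_id, content
            # Drop the revision's children and text to limit memory use.
            elem.clear()


rows = list(iter_revision_text(io.BytesIO(SAMPLE)))
for title, rev_id, content in rows:
    print(title, rev_id, len(content.encode("utf-8")))
```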

Not available in the XML files

Other metadata about a page is available in the aforementioned page table dump only, and includes:

  • Whether the page is protected
  • Whether the page is newly created or has more than one revision
  • Id of the most recent revision of the page
  • Length in bytes of the content
  • Content model of the page
  • Content language of the page

See also

See en:Wikipedia:Edit summary#Places where the edit summary appears for lists of edit metadata.

For details see the documentation of the MediaWiki database.

See also mw:Help:Export and RDF metadata.