From Meta, a Wikimedia project coordination wiki
Jump to navigation Jump to search


A screenshot of the Quarry SQL writing interface after execution completed. The results set appears directly beneath the query writing pane.
Query execution. A screenshot of the Quarry SQL writing interface after execution completed. The results set appears directly beneath the query writing pane.

Quarry is a public querying interface for Wiki Replicas, a set of live replica SQL databases of public Wikimedia Wikis. Quarry is designed to make running queries against Wiki Replicas easy. Quarry also provides a means for researchers to share and review each other's queries. Users of Quarry are required to agree to Cloud Services Terms of use.

Writing & executing SQL queries[edit]

Choosing a database[edit]

By default Quarry queries the public database of English Wikipedia ("enwiki_p"). You can use the database of any other wikisite by prefixing your query with USE <dbname>_p;. For example, the following SQL returns the count of revisions saved by "EpochFail" in metawiki.

USE metawiki_p;
FROM revision_userindex 
    rev_actor = (SELECT actor_id
                 FROM actor
                 WHERE actor_name = "EpochFail") AND 
    rev_timestamp >= "20140101";

The name of the database is usually the language code concatenated with the wiki family (wiki for Wikipedia, wikibooks etc.). For multi-language wikis it's the name of the wiki and wiki, e.g. commonswiki, wikidatawiki. See for the full list.

Available tables and columns[edit]

For an overview of the available tables and columns see MediaWiki database layout. Some extensions add additional tables which can be queried, too, these should be documented on the extension's documentation page. The Toolforge SQL Optimizer has a database browser that helps to find the needed table and understand its format. Note that not all tables are available. The following data are missing:

  • The table text containing the actual content of the revisions is missing.
  • Private data is suppressed.

Note that this means that the revision and logging tables do not have indexes on user columns. You can use revision_userindex and logging_userindex instead (see wikitech:Help:Toolforge/Database#Tables_for_revision_or_logging_queries_involving_user_names_and_IDs)

To get a list of all tables you can run the query (replace enwiki_p with the respective wiki):


To get the schema of a table, you can use DESCRIBE (e.g.

Downloading a resultset[edit]

Quarry data.png
Download data button. 

Quarry lets you download the result of a query. A data download button appears right before the resultset to get query results in CSV, TSV, JSON or wikitable formats. You can also request resultsets programmatically from wiki tools.

{"headers": ["COUNT(*)"], "rows": [[88]]}

Note that query_run_ID (e.g. /run/10140) it's not the same as query number (e.g. /query/153) which is visible in URL. The query run ID is given in the HTML source code of the query page as "qrun_id".

Example queries[edit]

Interwiki link[edit]

The interwiki map link quarry: exists to provide an alternate linking means to queries. eg.

See also[edit]