Data dumps/What's available for download
Available for download per project
- See also: Database_field_prefixes
- Most database tables as SQL files:
*.sql.gz files match the names of the corresponding tables; see database layout.
- Page-to-page link lists (pagelinks, categorylinks, imagelinks, templatelinks tables)
- Lists of pages with links outside of the project (externallinks, iwlinks, langlinks tables)
- Media metadata (image, oldimage tables)
- Info about each page (page, page_props, page_restrictions tables)
- Titles of all pages in the main namespace, i.e. all articles (*-all-titles-in-ns0.gz)
- List of all pages that are redirects and their targets (redirect table)
- Log data, including blocks, protection, deletion, uploads (logging table)
- Misc bits (interwiki, site_stats, user_groups tables)
- Text of current or all revisions of all pages, as an XML file
- Metadata about each page and current or all revisions, as an XML file
- Media bundles for each project, separated into files uploaded to the project and files from Commons
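As an illustration of working with the XML files listed above, here is a minimal Python sketch that streams page titles and text without loading the whole file into memory. It assumes the usual MediaWiki export layout (`<page>` elements containing a `<title>` and one or more `<revision>` elements with a `<text>` child); the function names `local_name` and `iter_pages`, the sample document, and its namespace URI are illustrative only, so check the actual file before relying on this.

```python
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def iter_pages(stream):
    """Yield (title, text) per <page>, taking the first <text> found in it."""
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if local_name(elem.tag) != "page":
            continue
        title, text = None, None
        for child in elem.iter():
            name = local_name(child.tag)
            if name == "title" and title is None:
                title = child.text
            elif name == "text" and text is None:
                text = child.text or ""
        yield title, text
        elem.clear()  # drop the finished page's children to limit memory use


# Tiny hand-written sample; the namespace URI here is an assumption.
SAMPLE = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.8/">
  <page>
    <title>Example</title>
    <revision><text>Hello</text></revision>
  </page>
</mediawiki>"""

pages = list(iter_pages(io.BytesIO(SAMPLE)))
```

For a real dump, any binary file object can be passed in place of the `io.BytesIO` sample; if the file is compressed, wrapping it with the matching standard-library opener (for example `gzip.open(path, "rb")` or `bz2.open(path, "rb")`) should work the same way.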
Projects with Flagged Revisions enabled have the corresponding tables available for download.
SQL files for testing, generated from the page metadata and page content XML files, have been made available for the February 2013 dump run of the English-language Wikipedia, for use with MediaWiki 1.20. Before using them, please note that they do not have the usual drop/create table stanzas at the beginning. We hope to make these available for every project on a regular basis.
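Because these test files lack the drop/create stanzas, it can be worth checking a file before importing it. The following is a rough sketch, not an official tool: the helper name `has_table_stanzas` and the 200-line scan limit are arbitrary choices made for this example.

```python
import gzip
import re

# Matches a line that starts a DROP TABLE or CREATE TABLE statement.
STANZA = re.compile(rb"^\s*(DROP|CREATE)\s+TABLE\b", re.IGNORECASE)


def has_table_stanzas(path, max_lines=200):
    """Return True if a DROP/CREATE TABLE line appears near the top of the file."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rb") as handle:
        for line_number, line in enumerate(handle):
            if line_number >= max_lines:
                break
            if STANZA.match(line):
                return True
    return False
```

If this returns False, the target tables presumably need to exist already (for example, created from the MediaWiki schema for the matching version) before the file is loaded.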
Tab-delimited files for use with MySQL's LOAD DATA INFILE, generated from the SQL files for the February 2013 dump run of the English-language Wikipedia, are also available for testing with MediaWiki 1.20. We hope to make these available for all projects on a regular basis as well.
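To inspect such a tab-delimited file outside MySQL, a small parser can help. The sketch below assumes the files follow the default LOAD DATA format (fields separated by tabs, backslash as the escape character, and `\N` standing for NULL); this page does not say which options were used to generate them, so treat that as an assumption. `ESCAPES` and `parse_row` are names made up for this example.

```python
# Escape sequences recognised by MySQL's default LOAD DATA settings.
ESCAPES = {"0": "\0", "b": "\b", "n": "\n", "r": "\r", "t": "\t", "Z": "\x1a"}


def parse_row(line):
    """Split one line into fields; a field that is exactly \\N becomes None."""
    if line.endswith("\n"):
        line = line[:-1]
    fields, buf, raw_start, i = [], [], 0, 0
    while i <= len(line):
        if i == len(line) or line[i] == "\t":
            # End of a field: compare the raw text to \N to detect NULL.
            raw = line[raw_start:i]
            fields.append(None if raw == "\\N" else "".join(buf))
            buf, raw_start = [], i + 1
        elif line[i] == "\\" and i + 1 < len(line):
            # Backslash escape: translate known sequences, keep others literally.
            nxt = line[i + 1]
            buf.append(ESCAPES.get(nxt, nxt))
            i += 1
        else:
            buf.append(line[i])
        i += 1
    return fields


row = parse_row("1\tMain_Page\t\\N\n")
```

All values come back as strings (or None for NULL); converting them to integers or other types depends on the column definitions of the table in question.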
Media bundles for each project are available from a mirror site via HTTP, FTP or rsync; see Media tarballs on our list of mirrors. If you want to browse or retrieve the original media as individual files, those are available too; see Media on our list of mirrors.
The Wikimedia Foundation has permission to use certain images, and many of the fair use images are borderline in terms of whether they can be used off Wikipedia. If you choose to download the image base, you do so at your own risk and assume all liability for the use of any images on the main Wikipedia site. The Wikipedia community vigorously polices the site and removes infringing images daily; however, it is always possible that some images may escape this extraordinary level of vigilance and end up on the site for a short time. As of February 2007, the entire collection of images produced a compressed tar.gz file of over 213 GB (gigabytes). As of November 2011, the image and other media files take up about 17 TB, most of it already-compressed media.
Data not available for download
Some data is not available because it's private. This includes user data such as passwords, e-mail addresses, preferences, watchlists, etc. Likewise, deleted or suppressed content is not available for download; it may have contained spam, personally identifying information, copyright violations or other sensitive material.
It's not clear how a full right to fork could be guaranteed.
Some things people want are on a wish list of other items, which you can add to if you like.