Talk:Importing a Wikipedia database dump into MediaWiki


Enabling category links

I have successfully installed the English Wikipedia from the XML file using the MediaWiki software, but the category links don't exist for the "_current" file. So I noticed a .sql file in the download directory, specifically for the category links, but these don't seem to match at all: when I insert them, the categories point to seemingly random pages. I even tried installing the FULL English Wikipedia, but the category links still didn't work. Is there a script or some process anyone has found to make the category pages work as they do on Wikipedia proper? 02:52, 19 October 2005 (UTC)

I am facing the same problem. The category pages come out all wrong, and this persists across various XML dumps of Wikipedia. I even went through the code and found that the categorylinks table has a page id (cl_from) that corresponds to the page to be listed in that category. This id itself is wrong, and hence there is little that can be done in the code; the problem most probably lies in the XML/SQL dump itself. It would be sufficient if I had a mapping from the id (cl_from) in the categorylinks table to a title (page_title) in the page table.
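For anyone who wants to check or build that mapping by hand, here is a minimal sketch against the standard MediaWiki 1.5 schema (assuming no table prefix; the LIMIT is only there to keep the output readable):

mysql -u myusername -p mydatabasename -e \
  "SELECT cl.cl_from, p.page_title, cl.cl_to
   FROM categorylinks cl
   JOIN page p ON p.page_id = cl.cl_from
   LIMIT 20;"

If the cl_from ids in the dump really are stale, the titles this join returns will visibly not belong in the categories named by cl_to, which makes the mismatch easy to verify.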

Instructions to handle XML ... but in French

It may be valuable to translate this page [1], which explains how to handle XML dumps and import them into MySQL. Bbullot 12:21, 30 September 2005 (UTC)[reply]

You're killin' us

It would be really really useful if we could get dumps of cur tables more frequently. Right now it has been a month and a half since the last SQL cur dump for en.wikipedia. Lots of projects use these to perform offline analysis (for example many of the WP cleanup projects), and dumps this far out of date are almost useless for that purpose. What's the deal? Can we do anything to help dumps happen faster? en:User:Brighterorange 128.2.203.136 18:36, 2 September 2005 (UTC)[reply]

I second this request. Rex (NL) 22:41, 4 September 2005 (UTC)[reply]

XML dumps

I see new XML dumps, but no instructions on their use. Also, yesterday's seem very small. Rich Farmbrough 16:15, 6 September 2005 (UTC)[reply]

E.g. 7.2 MB for cur, 177 MB for full. Rich Farmbrough 17:37, 6 September 2005 (UTC)[reply]
The backup log indicates something went wrong... Rich Farmbrough 23:32, 6 September 2005 (UTC)[reply]


XML Dumps: Documentation Request

Forgive me if this was covered elsewhere, but could someone post a reliable guide to importing the XML dumps? The import instructions given reference only the older SQL dumps, and since those are no longer being updated, it would be extremely helpful to have an XML import guide in the fashion of the (now deprecated) SQL import guide.
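Until such a guide exists, here is a minimal sketch of the XML route, assuming shell access to a MediaWiki 1.5 installation (the paths and dump file name are placeholders):

cd /path/to/mediawiki/maintenance
# importDump.php reads the XML dump from standard input
php importDump.php < pages_current.xml
# rebuild secondary tables once the import finishes
php rebuildrecentchanges.php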

Software for creating the database dump web page?

I'm not sure this is the right place to ask, but I can't find any other page where this question would be more relevant. Where can I download the software that is used to create the database dumps made available on http://download.wikimedia.org/ ? We're using MediaWiki internally at our company, and we've gotten a request to make database dumps of our wiki available from a web site, for stand-alone/laptop installations. Before I "reinvent the wheel", I'd like to check whether any code used to make http://download.wikimedia.org/ work is already publicly available. 136.182.2.221 19:21, 23 September 2005 (UTC)[reply]
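For the XML dumps at least, the generator ships with MediaWiki itself as a maintenance script, so there may be nothing to reinvent. A minimal sketch, assuming shell access to your wiki's maintenance directory:

cd /path/to/mediawiki/maintenance
# dump only the latest revision of every page
php dumpBackup.php --current > pages_current.xml
# or dump the complete revision history
php dumpBackup.php --full > pages_full.xml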

SQL format

Dear wiki developers, please reconsider offering the SQL-format database dump. It is a lot easier to work with than the XML format, and the compressed output file size is not much different between the two.

I do not hate the XML format, but you have made a good English Wikipedia with 700,000+ articles and not one article good enough to explain it. I keep searching for a how-to from one site to the next (mwdumper, xml2sql perl, xml2sql java), and still have not managed to import the XML format successfully. My last attempt was to convert the XML to SQL using mwdumper (Indonesian Wikipedia), but when running the SQL import, mysql crashed with something like an "invalid command" error and dropped back to the DOS prompt (MySQL-Front also crashed when importing the SQL). One more thing: the XML file appears to be non-standard XML (can anybody verify this?), because it cannot be opened with other programs (Delphi, xmllint, etc.).

It is really frustrating! SQL format, please.....! Borgx 05:28, 27 September 2005 (UTC)[reply]
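For the record, the pipeline mwdumper's documentation describes looks like the line below. It assumes the MediaWiki 1.5 schema, and the page, revision, and text tables must be empty beforehand, or the inserts will collide:

java -jar mwdumper.jar --format=sql:1.5 pages_current.xml.bz2 | mysql -u myusername -p mydatabasename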


XML Import Instructions PLEASE

I have searched the web quite thoroughly and have not been able to figure out how to import the XML dumps into an instance of the MediaWiki software. Instructions on how to do this would be most appreciated.

DrDeke 06:51, 27 September 2005 (UTC)[reply]

importDump.php Doc

I wouldn't mind some quick documentation on this script, since it is referenced here. I understand the Java options for mwdumper.jar easily enough, but unfortunately my server doesn't currently have Java (believe me, I bitched them out for it when I found this out).

I would like to be able to filter with the importDump.php script rather than having to extract the entire _current.xml.bz2 file first.
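In the meantime, importDump.php reads standard input, so the compressed dump can at least be streamed without unpacking it to disk; and if my reading of mwdumper's options is right, its --filter=namespace flag can prefilter the XML before the PHP script ever sees it (a sketch; file names are placeholders):

# stream the compressed dump straight into the importer
bzip2 -dc pages_current.xml.bz2 | php importDump.php
# or prefilter to the main namespace (0) with mwdumper, re-emitting XML
java -jar mwdumper.jar --format=xml --filter=namespace:0 pages_current.xml.bz2 | php importDump.php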

XML import takes a long time

I'm currently importing the Spanish Wikipedia using

php importDump.php < ~/tmp/20051105_pages_current.xml

This has now been running for four hours or so and is importing at less than 8 pages/s. I was under the impression that the old SQL files imported much faster. Is there a way to speed this import up, or could you consider posting SQL files again? -- Patrice 02:14, 15 November 2005 (UTC)[reply]
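Not a fix for importDump.php itself, but if you take the mwdumper-to-SQL route mentioned above instead, a common trick is to switch off MySQL's per-row checks for the session doing the bulk load. A sketch (standard MySQL statements; the dump path is a placeholder):

mysql -u myusername -p mydatabasename <<'EOF'
SET autocommit = 0;
SET unique_checks = 0;
SET foreign_key_checks = 0;
SOURCE /path/to/converted_dump.sql;
COMMIT;
EOF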

No database dumps for ko/ja/zh/...

For the ko/ja/zh Wikipedias, there are no recent dumps at http://download.wikimedia.org/ . Please check the dump script. -- ChongDae 01:43, 12 December 2005 (UTC)[reply]

What if I don't want to or can't decompress it on the fly?

On-the-fly import is explained with:

bzip2 -dc cur_table.sql.bz2 | mysql -u myusername -p mydatabasename

But what can I do if I already have decompressed sql files?

cur_table.sql | mysql -u myusername -p mydatabasename  doesn't work
mysql -u myusername -p mydatabasename cur_table.sql    doesn't work
mysql cur_table.sql -u myusername -p mydatabasename    doesn't work

So it's impossible to work with uncompressed files???

I suggest:
cat cur_table.sql | mysql -u myusername -p mydatabasename

--Jrvz 02:04, 11 January 2006 (UTC)[reply]

The contents of an SQL dump can be imported with:

mysql -u myusername -p mydatabasename < cur_table.sql

If you want to export a database, note that simply flipping the arrow (> instead of <) will not produce a dump; it only redirects the mysql client's query output. The standard tool is mysqldump.
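A minimal export sketch with the same credentials as above (mysqldump ships with MySQL):

mysqldump -u myusername -p mydatabasename > cur_table.sql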

--82.83.43.210 19:47, 14 February 2006 (UTC)[reply]

I am not sure if I did it right, but I went step by step:

-Downloaded the XML dump as a 7z package (note that this isn't Wikipedia, but Wikia)
-Decompressed it
-Built mwdumper with Maven
-Used mwdumper to convert the XML to SQL
-Opened a terminal and ran:

mysql -u username -p
Password:
source /path/to/sql/sqldump.sql;

It took a long time parsing, but I couldn't see anything afterwards, so I decided to run update.php:

quit;
cd /var/lib/mediawiki/maintenance
php update.php

It's now updating. --BSODX (talk) 15:59, 20 May 2018 (UTC)[reply]