How to do an HTML dump
From Meta, a Wikimedia project coordination wiki
MediaWiki
- Download MediaWiki and put it in a directory viewable through Apache+PHP.
- Configure it. The prefix for French Wikipedia tables is "frwp.".
- cd skins ; ln -s disabled/HTMLDump.php .
Extensions
Some extensions are needed on top of vanilla MediaWiki. For instance, {{#if}} only works if ParserFunctions is installed.
- Go to the extensions directory and run:
- svn checkout http://svn.wikimedia.org/svnroot/mediawiki/trunk/extensions/ParserFunctions
- svn checkout http://svn.wikimedia.org/svnroot/mediawiki/trunk/extensions/Cite
- svn checkout http://svn.wikimedia.org/svnroot/mediawiki/trunk/extensions/SiteMatrix
- svn checkout http://svn.wikimedia.org/svnroot/mediawiki/trunk/extensions/wikihiero
- svn checkout http://svn.wikimedia.org/svnroot/mediawiki/trunk/extensions/timeline
- wget http://svn.wikimedia.org/svnroot/mediawiki/trunk/extensions/ExtensionFunctions.php
- Edit LocalSettings.php, insert
require_once( "$IP/extensions/ParserFunctions/ParserFunctions.php" );
require_once( "$IP/extensions/SiteMatrix/SiteMatrix.php" );
require_once( "$IP/extensions/timeline/Timeline.php" );
require_once( "$IP/extensions/wikihiero/wikihiero.php" );
require_once( "$IP/extensions/Cite/Cite.php" );
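The checkout steps above can be sketched as one loop. This is only a sketch: EXT_BASE, DRY_RUN, run and fetch_extensions are names made up for the example, and it prints the commands by default instead of running them, since it assumes nothing about whether the repository is reachable from your machine.

```shell
#!/bin/sh
# Sketch: fetch the extensions listed above in one pass.
# EXT_BASE, DRY_RUN, run and fetch_extensions are names chosen for this example.
EXT_BASE="http://svn.wikimedia.org/svnroot/mediawiki/trunk/extensions"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Check out each extension directory, then fetch the single helper file.
fetch_extensions() {
    for ext in ParserFunctions Cite SiteMatrix wikihiero timeline; do
        run svn checkout "$EXT_BASE/$ext"
    done
    run wget "$EXT_BASE/ExtensionFunctions.php"
}

fetch_extensions
```

Run it from the extensions directory. With DRY_RUN=0 the same commands are executed rather than printed.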
Math
- mkdir images/math images/tmp
- Make sure Objective Caml >= 3.06 is installed, then cd math and run make
- Make sure latex is installed
- Test: ./texvc ../images/tmp ../images/math 'x+y=1'
- rm ../images/math/3bc90184258d33db0b561566bd643266.png
- Edit LocalSettings.php, set $wgUseTeX = true;
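Before running make, it may help to check that the tools are actually on the PATH. A minimal sketch follows. missing_tools is a helper name invented here, and ocamlc and latex are assumed to be the binary names on your system, so adjust them if your distribution uses others.

```shell
#!/bin/sh
# Sketch: list which of the given commands cannot be found on PATH.
# missing_tools is a hypothetical helper, not part of MediaWiki.
missing_tools() {
    missing=""
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || missing="$missing $tool"
    done
    # Strip the leading space left by the concatenation above.
    echo "${missing# }"
}

# ocamlc and latex are assumed binary names for OCaml and LaTeX.
m=$(missing_tools ocamlc latex)
if [ -n "$m" ]; then
    echo "missing: $m"
else
    echo "all tools found"
fi
```

An empty result means every named tool was found. The same helper can be pointed at other commands used on this page.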
Timelines
- Install ploticus 2.33
- After require_once( "$IP/extensions/timeline/Timeline.php" ); in LocalSettings.php, set $wgTimelineSettings->ploticusCommand = "/usr/local/bin/pl";
- FIXME: non-ASCII characters are encoded incorrectly
Special setup for fr:
Edit LocalSettings.php, adding:
$wgNamespacesWithSubpages[4] = 1;
$wgNamespacesWithSubpages[100] = 1;
$wgNamespacesWithSubpages[101] = 1;
$wgNamespacesWithSubpages[103] = 1;
$wgNamespacesWithSubpages[105] = 1;
$wgExtraNamespaces =
array(100 => "Portail",
101 => "Discussion_portail",
102 => "Projet",
103 => "Discussion_projet",
104 => "Référence",
105 => "Discussion_Référence"
);
Databases
A Java Runtime Environment must be installed: version 1.4 or, better, 1.5. Get mwdumper.jar and read its instructions for more information.
java -jar mwdumper.jar --format=sql:1.5 frwiki-20060929-pages-articles.xml.bz2 | mysql -u <username> -p <databasename>
gunzip < frwiki-20061014-interwiki.sql.gz | mysql -u <username> -p <databasename>
gunzip < frwiki-20060929-image.sql.gz | mysql -u root -p frwp
gunzip < frwiki-20060929-langlinks.sql.gz | mysql -u root -p frwp
gunzip < frwiki-20060929-imagelinks.sql.gz | mysql -u root -p frwp
gunzip < frwiki-20060915-site_stats.sql.gz | mysql -u root -p frwp
gunzip < frwiki-20060915-pagelinks.sql.gz | mysql -u root -p frwp
wget http://dumps.wikimedia.org/frwiki/20060929/frwiki-20060929-categorylinks.sql.gz
gunzip < frwiki-20060929-categorylinks.sql.gz | mysql -u root -p frwp
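The repeated gunzip imports above can be sketched as a loop. DB_NAME, DB_USER, DRY_RUN and import_dump are names invented for this example, and the loop covers only the 20060929 table dumps. The site_stats and pagelinks files carry a different date and would need their own calls. By default it only prints each pipeline.

```shell
#!/bin/sh
# Sketch: import several gzipped table dumps into the same database.
# DB_NAME, DB_USER, DRY_RUN and import_dump are names chosen for this example.
DB_NAME="frwp"
DB_USER="root"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the pipeline, 0 = execute it

import_dump() {
    dump="$1"
    if [ "$DRY_RUN" = "1" ]; then
        echo "gunzip < $dump | mysql -u $DB_USER -p $DB_NAME"
    else
        # -p makes mysql prompt for the password on every import.
        gunzip < "$dump" | mysql -u "$DB_USER" -p "$DB_NAME"
    fi
}

# Stop at the first import that fails.
for table in image langlinks imagelinks categorylinks; do
    import_dump "frwiki-20060929-$table.sql.gz" || break
done
```

Set DRY_RUN=0 once the printed pipelines look right. The files are expected in the current directory.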