How to do an HTML dump

From Meta, a Wikimedia project coordination wiki

MediaWiki

  • Download MediaWiki and unpack it into a directory served by Apache with PHP.
  • Configure it. The table prefix for French Wikipedia is "frwp.".
  • Enable the HTML dump skin: cd skins ; ln -s disabled/HTMLDump.php .
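
The symlink step above can be wrapped so that it fails loudly when the skin file is absent. This is only a sketch: the function name enable_htmldump_skin and its single argument (the MediaWiki root directory) are hypothetical, and it assumes skins/disabled/HTMLDump.php ships with your MediaWiki version.

```shell
# Hypothetical helper: link skins/disabled/HTMLDump.php into skins/.
# $1 is assumed to be the MediaWiki installation root.
enable_htmldump_skin() {
    root=$1
    if [ ! -f "$root/skins/disabled/HTMLDump.php" ]; then
        echo "HTMLDump.php not found under $root/skins/disabled" >&2
        return 1
    fi
    # -f replaces a stale link, so the step can be repeated safely.
    ( cd "$root/skins" && ln -sf disabled/HTMLDump.php . )
}
```

A call such as enable_htmldump_skin /path/to/mediawiki would then be equivalent to the manual cd and ln commands.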

Extensions

Some extensions are needed on top of vanilla MediaWiki. For instance, {{#if}} only works if ParserFunctions is installed. Add the following to LocalSettings.php:

require_once( "$IP/extensions/ParserFunctions/ParserFunctions.php" );
require_once( "$IP/extensions/SiteMatrix/SiteMatrix.php" );
require_once( "$IP/extensions/timeline/Timeline.php" );
require_once( "$IP/extensions/wikihiero/wikihiero.php" );
require_once( "$IP/extensions/Cite/Cite.php" );
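
Before loading the wiki, it may help to confirm that each of these entry files is actually present. The sketch below assumes the extensions live under an extensions/ directory in the MediaWiki root; the function name missing_extensions is hypothetical.

```shell
# Hypothetical helper: print the extension entry files listed above
# that are missing under "$1/extensions". Prints nothing if all exist.
missing_extensions() {
    ip=$1
    for rel in \
        ParserFunctions/ParserFunctions.php \
        SiteMatrix/SiteMatrix.php \
        timeline/Timeline.php \
        wikihiero/wikihiero.php \
        Cite/Cite.php
    do
        [ -f "$ip/extensions/$rel" ] || printf '%s\n' "$rel"
    done
}
```

Running missing_extensions /path/to/mediawiki lists one relative path per missing file, so empty output means every require_once target was found.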

Math

  • mkdir images/math images/tmp
  • Make sure Objective Caml >= 3.06 is installed, then build texvc: cd math ; make
  • Make sure latex is installed
  • Test: ./texvc ../images/tmp ../images/math 'x+y=1'
  • Remove the test output: rm ../images/math/3bc90184258d33db0b561566bd643266.png
  • Edit LocalSettings.php, set $wgUseTeX = true;
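
The directory and prerequisite steps above could be scripted roughly as follows. This is a sketch under assumptions: the function names are hypothetical, and the command names to probe (for example ocamlc for the OCaml compiler) depend on how your system packages these tools.

```shell
# Hypothetical helper: create the math output and temp directories
# under the MediaWiki root given as $1.
prepare_math_dirs() {
    root=$1
    mkdir -p "$root/images/math" "$root/images/tmp"
}

# Hypothetical helper: print every named command that is not on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}
```

For example, prepare_math_dirs /path/to/mediawiki followed by missing_tools ocamlc latex make would report which prerequisites still need to be installed before running make in the math directory.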

Timelines

  • Install ploticus 2.33
  • After require_once( "$IP/extensions/timeline/Timeline.php" ); in LocalSettings.php, set $wgTimelineSettings->ploticusCommand = "/usr/local/bin/pl";
  • FIXME: non-ASCII characters are encoded incorrectly.

Special setup for fr:

Edit LocalSettings.php, adding:

$wgNamespacesWithSubpages[4] = 1;
$wgNamespacesWithSubpages[100] = 1;
$wgNamespacesWithSubpages[101] = 1;
$wgNamespacesWithSubpages[103] = 1;
$wgNamespacesWithSubpages[105] = 1;

$wgExtraNamespaces =
    array(100 => "Portail",
          101 => "Discussion_portail",
          102 => "Projet",
          103 => "Discussion_projet",
          104 => "Référence",
          105 => "Discussion_Référence"
         );

Databases

You need Java Runtime Environment 1.4 or, preferably, 1.5 installed. Get mwdumper.jar (or here, in case of an "Invalid Contributor" problem). Read the instructions for more info.

java -jar mwdumper.jar --format=sql:1.5 frwiki-20100622-pages-articles.xml.bz2 | mysql -u <username> -p <databasename>
gunzip < frwiki-20061014-interwiki.sql.gz | mysql -u <username> -p <databasename>
gunzip < frwiki-20100622-image.sql.gz | mysql -u <username> -p <databasename>
gunzip < frwiki-20100622-langlinks.sql.gz | mysql -u <username> -p <databasename>
gunzip < frwiki-20100622-imagelinks.sql.gz | mysql -u <username> -p <databasename>
gunzip < frwiki-20100622-site_stats.sql.gz | mysql -u <username> -p <databasename>
gunzip < frwiki-20100622-pagelinks.sql.gz | mysql -u <username> -p <databasename>
gunzip < frwiki-20100622-categorylinks.sql.gz | mysql -u <username> -p <databasename>
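
Since the table imports differ only in the table name, they can be generated in a loop. The sketch below prints the commands instead of running them, so they can be reviewed first. The function name print_import_commands and its four arguments are hypothetical. The interwiki dump is left out because it carries a different date in the list above.

```shell
# Hypothetical helper: print (not run) one import command per table dump.
# Arguments: wiki name, dump date, MySQL user, database name.
print_import_commands() {
    wiki=$1
    date=$2
    user=$3
    db=$4
    for table in image langlinks imagelinks site_stats pagelinks categorylinks; do
        printf 'gunzip < %s-%s-%s.sql.gz | mysql -u %s -p %s\n' \
            "$wiki" "$date" "$table" "$user" "$db"
    done
}
```

For instance, print_import_commands frwiki 20100622 wikiuser wikidb emits six lines matching the commands above, with wikiuser and wikidb standing in for your own credentials. Note that -p makes mysql prompt for a password on every command.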