MediaWiki Bulk Page Creator

From Meta, a Wikimedia project coordination wiki

Jump to: navigation, search
Blue Glass Arrow.svg MediaWiki logo.png
This page should be moved to MediaWiki.org.
Please do not move the page by hand. It will be imported by a MediaWiki.org administrator with the full edit history. In the meantime, you may continue to edit the page as normal.


A Mediawiki bot written in PHP to create and prefill pages on a Mediawiki website or project. The bot takes input from a formatted input file and creates pages. The Snoopy class library is required to make this bot work.

If you have questions please contact me joncu[NOSPAM]trerATgmail DOTTcom

Contents

[edit] The program

<?php
#  PHP MediaWiki Bulk Page Creator
#  Version: 1.0
#  Author: Jonathan Cutrer
#  Website: http://jcutrer.com/
#
#  
#  This program must have the Snoopy Class Library to run.
#  http://sourceforge.net/projects/snoopy/
#
#
#  Syntax: php bulkinsert.php inputfile.txt
#
#
#
include "./Snoopy-1.2/Snoopy.class.php";
$snoopy = new Snoopy;
$wikiroot = "http://yourwikiurl.com";
$login_url = $wikiroot . "/index.php?title=Special:Userlogin&action=submitlogin";
#$submit_url = $wikiroot . "/index.php?title=Special:Userlogin&action=submitlogin";
# Set the username and password below:
$login_vars['wpName'] = "YourRobotsUsername";
$login_vars['wpPassword'] = "Password";
$login_vars['wpRemember'] = "1";
# Login to Wiki
$snoopy->submit($login_url,$login_vars);
# Open Source File and Read into $contents
$fp = fopen($argv[1], "r");
$contents = fread($fp, filesize($argv[1]));
fclose($fp);
# Split $contents in $pages array
$pages = split("--ENDPAGE--", $contents);
# Loop for each item in pages array
# During loop we will get edit page for token then submit form to create page
foreach ($pages as $key => $value) {
        list($title, $body) = split("--ENDTITLE--", $value);
        echo $title;
        # Get rid of newlines in title
        $title = str_replace("\n", "", $title);
        # Make Safetitle for URL
        $safetitle = rawurlencode(str_replace(" ", "_", $title));
        # Lets make sure $title contains something other than null
        if ($title) {
                # Submit to edit page for $title and get contents into $editpage
                if($snoopy->submit($wikiroot . "/index.php?title=" . $safetitle . "&action=edit",$login_vars)) {
                        $editpage = $snoopy->results;
                }
                #echo "$editpage";
                # Pick out Edit Token into $token
                $ans = preg_match('/.*value="(.*?)".*name="wpEditToken"/',$editpage, $matches);
                $token = $matches[1];
                echo $token;
                # Set Post Variables before submitting
                $submit_vars['wpTextbox1'] = $body;
                $submit_vars['wpSummary'] = "";
                $submit_vars['wpSection'] = "";
                $submit_vars['wpEdittime'] = "";
                $submit_vars['wpMinoredit'] = "1";
                $submit_vars['wpSave'] = "Save page";
                $submit_vars['wpEditToken'] = $token;
                # Submit  or Post to create the page
                echo "Final Submit goes to:" . $wikiroot . "/index.php?title=" . $safetitle . "&action=submit";
                if($snoopy->submit($wikiroot . "/index.php?title=" . $safetitle . "&action=submit", $submit_vars)) {
                        $finalresults = $snoopy->results;
                }
                echo $finalresults;
        # End If Loop
        }
# End ForEach Loop
}
exit;
?>

[edit] An example input file

Sample Page 1
--ENDTITLE--
__NOTOC____NOEDITSECTION__
This is the body of sample page one.
This page was inserted by [http://jcutrer.com/ Mediawiki Bulk Page Creator].
If you find this software useful please give me credit by providing a link to http://jcutrer.com
[[Sample Page 2]]
--ENDPAGE--
Sample Page 2
--ENDTITLE--
This is sample page 2 with sections
== Section 1 ==
[[Sample Page 1]]
== Section 2 ==
This page was inserted by [http://jcutrer.com/ Mediawiki Bulk Page Creator].
== Section 3 ==
If you find this software useful please give me credit by providing a link to http://jcutrer.com
--ENDPAGE--


[edit] Disclaimer & License

Mediawiki Bulk Page Creator is release under the GPL License.

Install this software at your own risk there is no warranty or support.

[edit] Related scripts

I have developed a superior version of this tool called Mediawiki CSV Import, its not open source but very inexpensive ($29). mwcsvimport is web-based, uses the MediaWiki API, and is more flexible since it takes CSV data as the import source. You can also format the data on import with page templates. Mediawiki CSV Import page - Jonathan Cutrer

Here is a companion script to go with it. It sucks down data from a UseModWiki site in the format usable by this script. Try this: http://www.hudsonic.com/migwiki/bulkget-umw.php.txt

Also a blatant hacking for image uploading in bulk:

<?
#  PHP MediaWiki Bulk media uploader
#  Version: 0
#  Author: Anonymous Coward, hacking Jonathon Cutrer
#
#  This program must have the Snoopy Class Library to run.
#  http://sourceforge.net/projects/snoopy/
#
#  Syntax: php bulkmedia.php names_and_filepaths.txt
#
# names_and_filepaths.txt has lines, each with a desired filename en wiki,
# then a space character, then a path to the desired file to upload.  no
# spaces in the name or path, left as an exercise.

include "./Snoopy-1.2.3/Snoopy.class.php";
$snoopy = new Snoopy;
$wikiroot = "http://somewikiorother.org/root";
$login_url = $wikiroot . "/index.php?title=Special:Userlogin&action=submitlogin\
";
#$submit_url = $wikiroot . "/index.php?title=Special:Userlogin&action=submitlog\
in";

# Set the username and password below:
$login_vars['wpName'] = "botname";
$login_vars['wpPassword'] = "botpass";
$login_vars['wpRemember'] = "1";

# Login to Wiki
$snoopy->submit($login_url,$login_vars);

# Open Source File and Read into $contents
$fp = fopen($argv[1], "r");
$contents = fread($fp, filesize($argv[1]));
fclose($fp);

# Split $contents in $pages array
$pages = split("\n", $contents);

# Loop for each item in pages array
# During loop we will get edit page for token then submit form to create page

echo $wikiroot . "/index.php?title=Special:Upload";
echo "\n";

foreach ($pages as $key => $value) {

        list($fname, $fpath)=split(" ", $value);
        if ( $fname && $fpath )
        {
                $formvars['wpDestFile'] = $fname;
                $formvars['wpUpload'] ="Upload file";
                $formfiles['wpUploadFile'] = $fpath;
                $snoopy->set_submit_multipart();
                if($snoopy->submit($wikiroot . "/index.php?title=Special:Upload\
", $formvars, $formfiles))
                {
                        echo "success $fname\n";
                }
                echo $snoopy->results;
        }


} # End ForEach Loop
exit;
?>