Help:Downloading pages

From Meta, a Wikimedia project coordination wiki

Downloading a MediaWiki page

w:Webpage#Saving_a_webpage describes the ways to save a local copy of a webpage. Alternatively, or in addition, one can copy the wikitext, i.e. the text in the edit box, which is the page's source code as stored in the database.
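Instead of copying from the edit box by hand, the wikitext can usually be requested directly with the action=raw parameter of index.php. The following is a minimal sketch, not an official client: the function names and the User-Agent string are made up for illustration, and the index.php location (here assumed to be under /w/) varies between sites.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def raw_wikitext_url(index_php: str, title: str) -> str:
    """Build the URL that returns a page's wikitext instead of rendered HTML."""
    # MediaWiki treats underscores and spaces in titles as equivalent.
    params = {"title": title.replace(" ", "_"), "action": "raw"}
    return index_php + "?" + urlencode(params)


def fetch_wikitext(index_php: str, title: str) -> str:
    """Download the wikitext of one page (requires network access)."""
    request = Request(
        raw_wikitext_url(index_php, title),
        # Hypothetical User-Agent; use one that identifies your own tool.
        headers={"User-Agent": "wikitext-download-sketch/0.1"},
    )
    with urlopen(request) as response:
        return response.read().decode("utf-8")
```

For example, fetch_wikitext("https://meta.wikimedia.org/w/index.php", "Help:Downloading pages") would be expected to return the source of this page, provided the site is reachable.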

Information in the webpage but not in the wikitext:

  • images
  • content of templates referred to
  • values of variables
  • whether each linked internal page existed at the time of saving
  • date and time of the last edit before saving
  • in the Image namespace (Image description pages): the image itself, the image history and the list of pages linking to the image
  • in the Category namespace: the lists of subcategories and pages in the category
  • for wikitext of a form such as {{#expr:..}}: the result

Information in the wikitext but not in the webpage:

  • comments (even though HTML also allows comments)
  • names of variables, parser functions and templates referred to
  • for wikitext of a form such as {{#expr:..}}: the numerical expression

Additionally one could save the wikitext and documentation of the templates used, and intermediate results of processing the page, such as the XML parse tree and the expanded wikitext.
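The intermediate results mentioned above can typically be obtained from the MediaWiki web API through its expandtemplates action. The sketch below only builds the request URL; the function name is hypothetical, the api.php location is an assumption that differs between sites, and the sample expression in the usage note is invented for illustration.

```python
from urllib.parse import urlencode


def expand_request_url(api_php: str, title: str, wikitext: str) -> str:
    """Build an API URL asking for expanded wikitext and the XML parse tree."""
    params = {
        "action": "expandtemplates",
        "title": title,                # page context used for variables
        "text": wikitext,              # wikitext to expand
        "prop": "wikitext|parsetree",  # expanded wikitext and parse tree
        "format": "json",
    }
    return api_php + "?" + urlencode(params)
```

Calling expand_request_url("https://meta.wikimedia.org/w/api.php", "Help:Downloading pages", "{{#expr:1+2}}") gives a URL whose response can be saved next to the wikitext.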

See also XML export, MediaWiki architecture.

Downloading some linked pages

When saving a local copy of some MediaWiki pages, note the following.

A link to e.g. the Train article in Wikipedia appears in the HTML code as /wiki/Train, which refers to http://en.wikipedia.org/wiki/Train. Depending on your browser settings, the former may be changed into the latter when the page is saved. To avoid this, use View Source and save the result.
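How the short form turns into the full address can be illustrated with Python's standard urljoin, which applies the same kind of resolution a browser performs on a root-relative link. The function name and the Main_Page base URL are only examples.

```python
from urllib.parse import urljoin


def absolute_link(page_url: str, href: str) -> str:
    """Resolve a link found in a page against the URL the page came from."""
    return urljoin(page_url, href)


# A root-relative link keeps the host of the page and replaces the whole path.
print(absolute_link("http://en.wikipedia.org/wiki/Main_Page", "/wiki/Train"))
```

This is the rewriting that a browser may apply when saving the page, and that saving the source avoids.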

Put the copy in the folder C:\wiki (another drive letter is also possible, but wiki must sit directly in the root of the drive, not in a subfolder) and do not use a file name extension. This way the links work. The drawback is that, without a file name extension, you cannot open a file by clicking on it in a folder listing.

A problem with saving the source code is that images are not saved automatically with the page. Saving them separately in a location that matches the HTML code is cumbersome: e.g. the first image of the Train article would have to be saved as C:/upload/thumb/c/c2/250px-Tile_Hill_train_550.jpg.
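The correspondence between a root-relative address in the HTML code and the place where its local copy has to live can be sketched as follows. The helper name is hypothetical, and the drive letter is a parameter because another drive may be used.

```python
from pathlib import PureWindowsPath


def local_path_for(href: str, drive: str = "C:") -> PureWindowsPath:
    """Map a root-relative address from the HTML code to a path on a drive."""
    # Drop empty segments, such as the one before the leading slash.
    parts = [segment for segment in href.split("/") if segment]
    return PureWindowsPath(drive + "\\").joinpath(*parts)


image = local_path_for("/upload/thumb/c/c2/250px-Tile_Hill_train_550.jpg")
print(image.as_posix())  # C:/upload/thumb/c/c2/250px-Tile_Hill_train_550.jpg
```

The same mapping sends /wiki/Train to C:\wiki\Train, which is why the page copies go into a wiki folder in the root of the drive.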

If the images are more important than the mutual links, then use the browser option to save the webpage with images.

Variations are possible by editing the HTML code oneself, e.g. changing http://en.wikipedia.org to C: and/or adding the file name extension .htm.
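Such changes can also be scripted. The sketch below is a rough illustration using a regular expression rather than a real HTML parser; the function name and its defaults are assumptions, and it replaces the site address everywhere in the file, including in visible text.

```python
import re


def localize_links(html: str,
                   site: str = "http://en.wikipedia.org",
                   drive: str = "C:",
                   extension: str = ".htm") -> str:
    """Point absolute links at a local drive and add a file name extension."""
    # Step 1: change the site address into the drive letter.
    html = html.replace(site, drive)
    # Step 2: append the extension to each /wiki/ link target,
    # before any #section or ?query part.
    pattern = re.compile(r'(href="(?:' + re.escape(drive) + r')?/wiki/[^"#?]+)')
    return pattern.sub(lambda match: match.group(1) + extension, html)
```

The saved files then have to be named accordingly, e.g. Train.htm rather than Train.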

On some sites outside Wikimedia, a folder other than \wiki has to be used; see Help:URL.

When downloading pages from different sites into the same folder \wiki on the same drive, note that each page name can be used only once.
