Fixing dead links
This page documents the tools currently used on Wikimedia projects to identify and fix dead external links. (See: Migrate dead external links to archives for current work on English WP.)
In the following notes, "Progress" means: How many need to be processed? Is there a backlog? Is the number of fixes keeping up with the number of broken links identified?
Example section
Please copy this to the bottom of the page, and replace the italics with the details about your wiki.
Important: See the other language sections below, for other useful details and examples that might be relevant at your wiki.
- Tool: The name of the bot (or tool) which currently fixes dead links
- Active: Currently active? Inactive since when?
- Creator/maintainer: The name of the user who runs the bot
- Bot contributions: link to w:xx:Special:Contributions/botname
- Identifies dead links: How does the bot identify dead links?
- Source of archive: Which website does it obtain archived-links from?
- Fix: Describe what edits the bot makes, with example links
- Notification: How does the bot notify editors of what it has fixed, if it does?
- Logs: Link to the logs, if available
- Citation templates affected: List all relevant templates (or link to an existing list)
- Other relevant links: Anything else?
- Examples: Link a few example diffs, showing exactly what the bot does in different situations
English WP
- Tool: Cyberbot II
- Active: Currently active
- Creator/maintainer: User:Cyberpower678
- Bot contributions: Contributions/Cyberbot II
- Identifies dead links: Scans pages that have been marked with Template:Dead link.
- Source of archive: Sends request to Internet Archive (link) for link to archived page
- Fix: If there's a cite template, adds archiveurl, deadurl and archivedate parameters. If there's no template, adds Template:Wayback. Removes Template:Dead link. (see below for examples)
- Notification: Posts message on article talk page: example. Message explains which archive link was changed, and asks humans to review the edit. Offers {{cbignore}} template if the editor doesn't want Cyberbot to fix the link, or {{nobots|deny=Internet ArchiveBot}} if they want to keep Cyberbot off the page. Also adds {{sourcechecked|checked=false}}, and asks editors to switch it to true once the source has been checked. Some of them do get checked, for example: Talk:Pat Morita within 3 days, Talk:Cycling in New York within 2 days, Talk:Thai Air Cargo within a day (although they try to add cbignore on the talk page).
- Logs: User:Cyberbot II/Dead-Links Log, IAbot hashtag log, Deadlink logging app for Cyberbot
- Citation templates affected: (Which templates, which parameters?)
- (Does Cyberbot flip the switch for dead-url to true? Which templates? Is that automated?)
- Progress: Working on Category:Articles with dead external links. There were 130k in December 2015 and 111k at the start of February 2016.
Examples
- Template:Cite news -- adds deadurl, archiveurl, archivedate -- diff
- Template:Cite web -- ditto -- diff
- No template -- adds Template:Wayback with url and date -- diff
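
The citation-template fix described in this section can be sketched in Python. This is a minimal illustration, not Cyberbot II's actual code: the function name `add_archive_params`, the `deadurl=yes` value, and the simple string handling (which assumes a single, non-nested template) are assumptions. Only the parameter names archiveurl, archivedate and deadurl come from the description above.

```python
def add_archive_params(template: str, archive_url: str, archive_date: str) -> str:
    """Append archive parameters to a citation template given as wikitext.

    Hypothetical helper: assumes `template` is one complete, non-nested
    template such as "{{cite web|url=...|title=...}}".
    """
    text = template.strip()
    if not (text.startswith("{{") and text.endswith("}}")):
        raise ValueError("expected a single {{...}} template")
    body = text[2:-2].rstrip()
    # Collect the names of the parameters that are already present.
    existing = {
        part.split("=", 1)[0].strip().lower()
        for part in body.split("|")[1:]
        if "=" in part
    }
    # Leave templates that already carry an archive link untouched.
    if "archiveurl" in existing:
        return text
    # deadurl=yes is an assumed value, used here only for illustration.
    extra = f"|archiveurl={archive_url}|archivedate={archive_date}|deadurl=yes"
    return "{{" + body + extra + "}}"


cite = "{{cite web|url=http://example.com/page|title=Example}}"
fixed = add_archive_params(
    cite,
    "https://web.archive.org/web/20160101000000/http://example.com/page",
    "2016-01-01",
)
print(fixed)
```

A real bot would more likely use a wikitext parser than string splitting, since citation templates often contain nested templates and piped links.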
French WP
- Tool: Liens archives (ArchiveLinks) gadget, Wikiwix
- Active: Currently active
- Creator/maintainer: Pmartin, from Linterweb
- Bot contributions: Not a bot; this is a gadget set by default for all users on fr.wp. The gadget provides [archive] links next to every external link in the References section of article pages. These link to archive pages on archive.wikiwix.com. Another gadget, ExtendedArchiveLinks, provides archive links for every external link on the page, not just in References. ExtendedArchiveLinks is not on by default.
- Identifies dead links: It doesn't; Wikiwix automatically archives all new external links on fr.wp, en.wp and hu.wp.
- Source of archive: archive.wikiwix.com
- Notifications: n/a
- Logs: n/a
- Citation templates affected: n/a
- Progress: n/a
- Relevant links:
- Example of references section with Wikiwix links: Modélisme ferroviaire#Références
- Blog de wikiwix
German WP
- Tool: GiftBot, using plugins dwl*.{sh,tcl}
- Active: Currently active
- Creator/maintainer: Giftpflanze
- Bot contributions: Beiträge/Giftbot
- Identifies dead links: Fetches all external links via API. Tests them five times with an interval of two weeks. Also see https://phabricator.wikimedia.org/T122659#1967864
- Source of archive: Internet Archive API, webcitation.org API
- Fix:
- Editors verify and process the talk page notifications.
- User-created and verified lists in de:Wikipedia:WikiProjekt_Weblinkwartung/Botliste for replacement in article namespace. Each list is a simple wiki table with rows of the form "| Lemma || oldurl || newurl", executed by de:user:Luke081515Bot (log).
- Notification: Posts a message on article talk page with the dead link, and suggested archive URL for editors to fix. Uses template: Defekter Weblink (Example: Diskussion:CD Everton de Viña del Mar) and Nicht archivieren. (Also Anker.)
- Log: https://tools.wmflabs.org/giftbot/dwl21.out (112MB! and rising)
- Citation templates affected: —
- Progress: See de:User:GiftBot/Testseite for the number of transclusions of talk page notification templates. The format is old template[/new template]. Data collection began a few months after the process itself had started, and the new bot run is not yet finished, so data on how the transclusion counts fall is mostly missing, except for phases when the bot was halted.
- Relevant links:
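
The repeated-test approach listed under "Identifies dead links" can be sketched as follows. This is an illustration only, not GiftBot's code (which the Tool entry says is written as sh and tcl plugins). The function name `is_confirmed_dead`, the input format, and the rule that a successful check resets the failure count are assumptions. Only the figures of five tests and a two-week interval come from the description above.

```python
from datetime import date, timedelta

REQUIRED_FAILURES = 5              # five tests, per the description above
MIN_INTERVAL = timedelta(weeks=2)  # two weeks between tests


def is_confirmed_dead(checks: list[tuple[date, bool]]) -> bool:
    """Return True once a link has failed five tests spaced two weeks apart.

    `checks` holds (check_date, link_was_reachable) pairs.
    Assumed rule: any successful check resets the failure count.
    """
    failures = 0
    last_counted = None
    for day, reachable in sorted(checks):
        if reachable:
            failures, last_counted = 0, None
            continue
        # Count a failure only if enough time has passed since the last one.
        if last_counted is None or day - last_counted >= MIN_INTERVAL:
            failures += 1
            last_counted = day
    return failures >= REQUIRED_FAILURES


start = date(2016, 1, 1)
all_failed = [(start + i * MIN_INTERVAL, False) for i in range(5)]
recovered = all_failed[:4] + [(start + 4 * MIN_INTERVAL, True)]
print(is_confirmed_dead(all_failed))  # True
print(is_confirmed_dead(recovered))   # False
```

Spacing the tests out keeps a temporary outage from being reported as a dead link.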
Italian WP
- Tool: Bottuzzu
- Active: Currently active (but does many other things as well)
- Creator/maintainer: User:Vituzzu
- Bot contributions: Contributi/Bottuzzu
- Identifies dead links:
- Source of archive: Internet Archive
- Fix: Changes url in the cite template to urlarchivio and adds https://web.archive.org/web/20160101000000 (the date is an example) in front of the URL.
- Notification: Explains what it did in the edit summary
- Citation templates affected: Cita news, Cita web, Cita pubblicazione
- Progress:
- Relevant links:
Examples
- Template:Cite news -- adds urlarchivio, dataarchivio -- diff
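
The URL rewrite described in the Fix entry amounts to prefixing the original address with the Wayback Machine host and a 14-digit timestamp (YYYYMMDDhhmmss). The sketch below is a hedged illustration, not Bottuzzu's code: the function names `wayback_url` and `archive_fields` are hypothetical, and the ISO format used for dataarchivio is an assumption. The parameter names urlarchivio and dataarchivio are taken from the example above.

```python
from datetime import datetime

WAYBACK_PREFIX = "https://web.archive.org/web/"


def wayback_url(original_url: str, snapshot: datetime) -> str:
    """Prefix a URL with the Wayback Machine address and a 14-digit timestamp."""
    return f"{WAYBACK_PREFIX}{snapshot:%Y%m%d%H%M%S}/{original_url}"


def archive_fields(original_url: str, snapshot: datetime) -> dict[str, str]:
    """Build the two template parameters named in the example above.

    The ISO date format for dataarchivio is an assumption made for this sketch.
    """
    return {
        "urlarchivio": wayback_url(original_url, snapshot),
        "dataarchivio": snapshot.date().isoformat(),
    }


fields = archive_fields("http://example.com/articolo", datetime(2016, 1, 1))
print(fields["urlarchivio"])
# https://web.archive.org/web/20160101000000/http://example.com/articolo
```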
Spanish WP
- Tool: Elvisor (documentation?)
- Active: Since May 2012 (authorization)
- Creator/maintainer: UA31
- Bot contributions: Contribuciones/Elvisor
- Identifies dead links:
- Source of archive: Internet Archive
- Fix: In existing citation templates, adds new fields: urlarchivo=(link with http://web.archive.org/ added), fechaarchivo=(date of archive) (example)
- Notification: Posts message on talk page: Enlaces rotos (Broken links), URL of broken link, signature (example)
- Citation templates affected: The bot looks indiscriminately at external URLs (see comment). These citation templates will probably be affected.
- Progress:
- Relevant links:
See also
- DeadlinkChecker - a PHP library for testing whether links are alive or dead