
InternetArchiveBot/FAQ

From Meta, a Wikimedia project coordination wiki


This page contains a list of common questions asked about InternetArchiveBot.

Q: How can I run the bot on a specific set of pages?

A: Use the bot queue submission tool. Enter the list of pages you want the bot to analyze, one page per line, and click submit. That is all you need to do. Your submission will be assigned a job ID and placed in the queue until a process is able to work on it. You can then track it in real time on the interface.

Q: The bot replaced the archive URL in a citation template with its original counterpart, but the original URL is dead. What happened?

A: IABot simply moved the archive URL to the correct parameter, the one designed for archive URLs, and left the original URL in the original URL field. This keeps the citation consistent with the other citation templates in use and conforms to the template's intended use. Where applicable, archives should be placed in the appropriate archive URL parameter, and the original URL should be placed in the appropriate URL parameter.

Q: The bot keeps mishandling a specific source on the page. How can I stop this?

A: That depends. If the bot persistently tries to tag the source as dead, or persistently adds a bad archive, tell it the archive is no good by using the URL management tool. This tool lets users look up URLs the bot has encountered and fix any errors associated with them, which makes the bot more reliable. If the bot is breaking the syntax or mis-formatting the source, report that with the bug reporting tool.

Q: The bot is making bad edits to all the sources on the page, and I don't think it's worth having the bot run on this specific page. How do I stop the bot from editing the page completely?

A: Be careful with this. Are you sure the bot is harming the page more than it is helping? Consider: is it breaking the formatting of the page by making disruptive edits, or is it simply mis-tagging some sources and providing bad archives for others? If you answered the former, please report this with the bug reporting tool, as it is a bug that needs fixing. Otherwise, if you are certain that the current sources on the page will not benefit from the bot's work, or that you will spend more time cleaning up after the bot than it saves, you can place {{bots|deny=InternetArchiveBot}} on the article to keep the bot away from the page, or use {{cbignore}} on the individual sources that the bot mishandles. Keep in mind that link rot will not be addressed on that article, or on those sources, until the tag is removed again. Also, please help the bot become more reliable by fixing bad data with the URL management tool and the domain management tool.

Q: What is {{cbignore}}? What is {{bots}}?

A: {{cbignore}} is a blank template some wikis use to signal IABot to completely ignore a reference or external link on a page. An example of its documentation and usage can be found at w:en:Template:Cbignore. {{bots}} is an exclusion template some wikis use to signal compliant bots to stay away from a page altogether. IABot is bots and nobots compliant. To keep away only IABot, place {{bots|deny=InternetArchiveBot}} anywhere on the page.
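To illustrate what exclusion compliance means in practice, here is a minimal Python sketch of how a bot might check a page's wikitext before editing. It is not IABot's actual code. The function name `bot_may_edit` is hypothetical, and the handling of `allow=`, `all`, and `none` follows the general {{bots}} convention rather than anything stated in this FAQ, so treat those parts as assumptions.

```python
import re


def bot_may_edit(wikitext: str, bot_name: str) -> bool:
    """Return False if the page excludes bot_name via {{nobots}} or {{bots|...}}.

    Simplified sketch: only the first {{bots}} template is inspected, and
    only its allow= / deny= parameter is understood.
    """
    # A bare {{nobots}} asks every compliant bot to stay away.
    if re.search(r"\{\{\s*nobots\s*\}\}", wikitext, re.IGNORECASE):
        return False

    match = re.search(
        r"\{\{\s*bots\s*\|\s*(allow|deny)\s*=\s*([^}]*)\}\}",
        wikitext,
        re.IGNORECASE,
    )
    if match is None:
        return True  # no exclusion template found

    mode = match.group(1).lower()
    names = {name.strip().lower() for name in match.group(2).split(",")}
    listed = bot_name.lower() in names or "all" in names

    if mode == "deny":
        return not listed  # denied only if named (or "all" is denied)
    return listed  # allow= mode: permitted only if named (or "all")


page = "Some article text. {{bots|deny=InternetArchiveBot}}"
print(bot_may_edit(page, "InternetArchiveBot"))  # False
print(bot_may_edit(page, "SomeOtherBot"))        # True
```

Because the check looks only at the `deny=` list, other compliant bots are unaffected by a tag that names InternetArchiveBot alone.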

Q: The bot tagged a source as dead, but the source isn't dead. What happened?

A: The bot considers a source dead when the site fails to validate as alive 3 times in a row, during 3 separately spaced-out checks. This can happen if the site was only temporarily down, but for long enough to fail all three checks, or if the site has blacklisted the bot from further access, leaving the bot unable to assess the heartbeat of the site. At this point the bot considers the source permanently dead, and you should report it with the false positive reporting tool. In most cases this tool will correct the issue in the bot on its own. If it can't, the report is passed on to the interface roots.
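The "3 times in a row" rule can be sketched as a small counter. This is an illustration under assumptions, not IABot's implementation: the class `UrlStatus` and its fields are hypothetical, and the sketch assumes that a successful check resets the failure streak, which is one reading of "in a row".

```python
from dataclasses import dataclass

FAILURES_BEFORE_DEAD = 3  # from the FAQ: 3 failed checks in a row


@dataclass
class UrlStatus:
    consecutive_failures: int = 0
    dead: bool = False

    def record_check(self, alive: bool) -> None:
        """Record the result of one spaced-out liveness check."""
        if self.dead:
            # Considered permanently dead; only an external correction
            # (such as a false positive report) would change this.
            return
        if alive:
            self.consecutive_failures = 0  # assumption: success resets the streak
        else:
            self.consecutive_failures += 1
            if self.consecutive_failures >= FAILURES_BEFORE_DEAD:
                self.dead = True


status = UrlStatus()
for result in [False, False, True, False, False, False]:
    status.record_check(result)
print(status.dead)  # True: the last three checks failed in a row
```

The sketch also shows why a site that recovers after the third failed check stays tagged as dead: once the flag is set, later checks no longer change it.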

Q: The bot mangled the page. What happened?

A: Please report it with the bug reporting tool, and do not apply {{cbignore}} or {{bots|deny=InternetArchiveBot}}. The page will be used to replicate the bug, and chances are that once the bug is fixed, the bot will prove useful to the page.

Q: The number of links reported as rescued differs from the number of links actually rescued. Is something wrong?

A: Don't be alarmed, this can happen. If the number actually fixed is lower than reported, something went wrong with a link and it was skipped. This is easily remedied by fixing the source manually using the provided source, if it works. If the number is higher, the page contains 2 sources that are identical character for character: they were picked up once internally, but replaced more than once externally. This is nothing to be worried about.
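The "higher than reported" case can be reproduced with a short Python sketch. It assumes, purely for illustration, that sources are tracked as unique strings and then replaced everywhere they occur in the page text. The function `rescue_sources` and the example URLs are hypothetical and do not reflect how IABot is actually written.

```python
def rescue_sources(wikitext: str, rescues: dict[str, str]) -> tuple[str, int, int]:
    """Apply each rescue and return (new_text, reported_count, actual_count).

    `rescues` maps an original source string to its rescued replacement.
    """
    reported = 0  # one per unique source string (the "internal" count)
    actual = 0    # one per occurrence replaced in the text (the "external" count)
    for original, replacement in rescues.items():
        occurrences = wikitext.count(original)
        if occurrences == 0:
            continue  # source not found on the page, nothing to rescue
        wikitext = wikitext.replace(original, replacement)
        reported += 1
        actual += occurrences
    return wikitext, reported, actual


page = "First claim.[http://example.com/a] Second claim.[http://example.com/a]"
fixes = {"[http://example.com/a]": "[http://archive.example/a]"}
new_page, reported, actual = rescue_sources(page, fixes)
print(reported, actual)  # 1 2
```

Two character-for-character identical sources count as one rescue internally, yet both copies are rewritten on the page, so the visible number of fixed links is higher than the reported one.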