Talk:Community Tech/Migrate dead external links to archives
- Oh, thanks for pointing that out. I'll go and reply to him. -- DannyH (WMF) (talk) 20:42, 17 May 2016 (UTC)
Adding query against alternative archive if not present in Wayback Machine due to robots.txt or other reasons
Websites with robots.txt restrictions will not be captured by the Internet Archive's global Wayback crawls, and even content captured in the past from a given host will not be displayed if/when robots.txt restrictions are added. For this project, how often do the dead links lack corresponding versions in the Wayback Machine? If this happens a non-trivial amount of the time, it could be good to subsequently check against Memento (http://timetravel.mementoweb.org/) and/or (more narrowly) Archive-It (wayback.archive-it.org); these archives may contain captures irrespective of robots.txt restrictions.
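To make the suggestion concrete, here is a rough sketch of the fallback order: ask the Wayback Machine first, and only query the Memento Time Travel aggregator if nothing comes back. This is not how the project's bot works, just an illustration. The function names are made up for this example, and the endpoint shapes and JSON field names (`archived_snapshots`, `mementos`, `closest`, `uri`) are my understanding of the two public APIs, so they should be checked against current documentation before anyone relies on them.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Assumed endpoint shapes; verify against each service's current docs.
WAYBACK_API = "https://archive.org/wayback/available?url={url}&timestamp={ts}"
MEMENTO_API = "http://timetravel.mementoweb.org/api/json/{ts}/{url}"


def fetch_json(api_url):
    """Return the parsed JSON body, or None on a network/HTTP/parse failure."""
    try:
        with urlopen(api_url, timeout=10) as resp:
            return json.load(resp)
    except (OSError, ValueError):
        # HTTP errors (e.g. "no capture found") and bad JSON both land here.
        return None


def wayback_snapshot(data):
    """Extract the closest snapshot URL from a Wayback availability response."""
    closest = (data or {}).get("archived_snapshots", {}).get("closest", {})
    return closest.get("url") if closest.get("available") else None


def memento_snapshot(data):
    """Extract the closest memento URL from a Time Travel API response."""
    uris = (data or {}).get("mementos", {}).get("closest", {}).get("uri", [])
    if isinstance(uris, str):
        return uris
    return uris[0] if uris else None


def find_archive(url, ts="20160517", fetch=fetch_json):
    """Try the Wayback Machine first, then fall back to Memento."""
    wayback_url = WAYBACK_API.format(url=quote(url, safe=""), ts=ts)
    hit = wayback_snapshot(fetch(wayback_url))
    if hit:
        return hit
    return memento_snapshot(fetch(MEMENTO_API.format(ts=ts, url=url)))
```

The `fetch` parameter is there so the lookup logic can be exercised without network access. In real use, `find_archive("http://example.com/some/dead/page")` would return an archive URL or `None`. An Archive-It lookup could be added as a third step in the same way.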