User talk:InternetArchiveBot

From Meta, a Wikimedia project coordination wiki

Connect with the developers and other users

Telegram, IRC (#iabot)

Operation status

For the most up-to-date information, see the run pages.

  • 🟢 InternetArchiveBot is currently running on 100+ Wikimedia wikis
  • 🔴 Bot is approved but disabled indefinitely pending software improvements on Arabic Wikipedia (ar), Bulgarian Wikipedia (bg), German Wikipedia (de), Finnish Wikipedia (fi), French Wikipedia (fr), Hebrew Wikipedia (he), Polish Wikipedia (pl), Portuguese Wikipedia (pt), and Rusyn Wikipedia (rue).
  • 🔴 Bot is blocked pending our application for reapproval on Japanese Wikipedia (ja)
  • 🔴 Bot is blocked pending software improvements on Welsh Wikipedia (cy) and Persian Wikipedia (fa)

Last updated: 18:56, 9 May 2022 (UTC)

SpBot archives all sections tagged with {{Section resolved|1=~~~~}} after 7 days.

Galician Wikipedia links

Hi InternetArchiveBot. In your latest edits on the Galician Wikipedia, I have observed problems with the archived links, which create redundant elements in {{cite web}} templates. Could you check it please? Thank you very much.--Breogan2008 (talk) 08:28, 1 April 2022 (UTC)

Can you please provide an example? —CYBERPOWER (Chat) 08:34, 13 April 2022 (UTC)
Breogan2008, making sure you have seen this. Harej (talk) 18:47, 28 April 2022 (UTC)
This section is resolved and can be archived. If you disagree, replace this template with your comment. —CYBERPOWER (Chat) 18:09, 11 May 2022 (UTC)

Archive to archive

Please check the bot (on pl wiki): [1], [2], etc. Adding an archive to templates whose url parameter already holds an archive link does not look correct (archive links can be in url). Elfhelm (talk) 17:04, 6 April 2022 (UTC) (Blocked for 3 days on pl.)

Hi, please check the bot (on ar wiki) too. Thanks.--جار الله (talk) 08:23, 9 April 2022 (UTC)
Hi - same problem still on pl wiki [3] (blocked for 1 month). Please check the bot to stop changing templates using url with / (today, ph, fo) links. Thanks! Elfhelm (talk) 17:49, 9 April 2022 (UTC)
This is actively being investigated. I just want to make a note here that this request isn't being ignored. —CYBERPOWER (Chat) 01:59, 17 April 2022 (UTC)

Please take a look at ZH WP

Nothing can be analyzed now. Please kindly check it. Thanks. Naiveandsilly (talk) 16:16, 10 April 2022 (UTC)

Can you please provide more details? The most common cause of what you are experiencing is that you may have inadvertently loaded a different wiki. Please check the upper right menu bar and make sure your wiki is selected in the dropdown menu. —CYBERPOWER (Chat) 02:00, 17 April 2022 (UTC)
Probably related to the below sections? Mako001 (talk) 05:40, 23 April 2022 (UTC)
Naiveandsilly, making sure you have seen this. Harej (talk) 18:16, 4 May 2022 (UTC)
This section is resolved and can be archived. If you disagree, replace this template with your comment. —CYBERPOWER (Chat) 18:10, 11 May 2022 (UTC)

Bot stalled (choking on job 9878?)

Batch job #9878 appears to have some sort of problem. It's currently listed as "Running", and the list of articles in the batch has just one entry; however, the progress bar, weirdly, shows it as having completed running on 2/1 articles (corresponding to what appears to be listed as a progress percentage of 200%, although this is mostly cut off by the right end of the progress bar). The progress bar is also red (indicating a stalled job).

Possibly coincidentally, or possibly not, the last batch job listed as completed is job #9872, which finished just a couple hours shy of twelve days ago. Of the five batch jobs currently listed as running (#9873, #9874, #9875, #9877, and the aforementioned #9878), only #9878 shows any work done (the anomalous results described in the first paragraph); #9873, #9874, #9875, and #9877 all appear to be stuck at straight zeros for pages modified and links analyzed/rescued/tagged. All five of these jobs have been in the pipeline since 6-7 April.

Could someone please take a look at this problem, figure out what's gone wrong, and unjam the bot? Whoop whoop pull up Bitching Betty ⚧️ Averted crashes 18:04, 17 April 2022 (UTC)

Please visit phab:T308097 to learn more.--Q28 (talk) 09:53, 14 May 2022 (UTC)

Stalled on job 9878

Hello, anyone here? It has been stalled at Job 9878 for a few weeks now. Can this job be remotely killed? @Itcouldbepossible: Please could you try to kill this job and see if the bot restarts? Mako001 (talk) 05:34, 23 April 2022 (UTC)

@Cyberpower678 @Itcouldbepossible, redoing pings Mako001 (talk) 05:39, 23 April 2022 (UTC)
Also pinging @Harej: Mako001 (talk) 12:02, 23 April 2022 (UTC)
@Mako001 Thanks for the ping. Yes, I remember the job that I created. I thought I had done something wrong while creating the job, and that was why the bot wasn't working. But it turns out that nothing was wrong on my side. Anyway, I killed the job. It was actually a test job I created to learn how the IA bot works. But how did you know about the job I created? Also pinging Whoop whoop pull up since he was also the one to bring up the issue. Can you please tell me how you figured out that the bot was stuck? My second question: the list of articles that I created has only 1 article. I have seen users like Abductive and BrownHairedGirl work on these for long periods of time. Their bot jobs don't get stuck. Then why did my 1-article job hang? Itcouldbepossible (talk) 13:21, 24 April 2022 (UTC)
That's she to you, and I found it by running through the queued jobs one-by-one until I found the one that'd stalled. Whoop whoop pull up Bitching Betty ⚧️ Averted crashes 14:05, 24 April 2022 (UTC)
I'm not sure why it got stuck, but at a guess it might be due to the sheer size of the individual article and number of refs, which might have caused some sort of overflow? I don't think that you actually did anything that might have resulted in this little incident, besides being the unlucky one to discover the bug. I'd suggest leaving this open, as this is probably going to need some investigation on why the bot had a brain fart. Mako001 (talk) 15:14, 24 April 2022 (UTC)
Unfortunately, it is still stuck. Mako001 (talk) 15:18, 24 April 2022 (UTC)
Should this section and the one above be merged? They discuss the same issue. Whoop whoop pull up Bitching Betty ⚧️ Averted crashes 21:59, 24 April 2022 (UTC)

This is actively being investigated. Sorry it’s taking so long. —CYBERPOWER (Chat) 15:51, 24 April 2022 (UTC)

Thanx! While you're at it, any clue why individual-page runs are still working just fine even though batch jobs are jammed up? Whoop whoop pull up Bitching Betty ⚧️ Averted crashes 22:00, 24 April 2022 (UTC)
@Whoop whoop pull up: Yes, those are completely separate processes. One runs on-demand, and the other uses a batch of workers to process the jobs. —CYBERPOWER (Chat) 18:20, 4 May 2022 (UTC)
Any idea of an ETA for when IABot'll be up and running again? Whoop whoop pull up Bitching Betty ⚧️ Averted crashes 20:01, 8 May 2022 (UTC)

Errors in your edits in Italian Wikipedia

Hi there. I checked some of the last editions of the Australian Open tennis tournament in Italian Wikipedia. The links you added all refer to the wrong edition of the tournament. As an example, have a look at my edit of it:Australian Open 2014 here. Would you please use your bot to fix the other editions from 2010 to 2021? Thanks. --Carlo58s (talk) 09:55, 24 April 2022 (UTC)

Your request is very unclear. Can you please elaborate? 2A02:810D:13C0:35DC:3DBA:EC5C:4A52:4E35 06:25, 29 April 2022 (UTC)
Carlo58s, making sure you have seen this. Harej (talk) 18:18, 4 May 2022 (UTC)
Harej, please take a look: with this edit I removed the links posted by InternetArchiveBot because they were archived in 2009 or 2010 for a tournament played in 2014, and I replaced them with correct links. --Carlo58s (talk) 18:50, 4 May 2022 (UTC)
@Carlo58s: I'm sorry, but, unfortunately, IABot has a limitation where it can only handle one snapshot of a URL, and one access time of a URL. If the access times of these URLs are updated to 2014 and subsequent snapshots are updated in the bot's DB, then the older snapshots will be removed. Addressing this requires a complete restructure of how the bot models its data internally. This is something that IABot 3 will seek to address down the road. —CYBERPOWER (Chat) 18:30, 11 May 2022 (UTC)
This section is resolved and can be archived. If you disagree, replace this template with your comment. —CYBERPOWER (Chat) 17:55, 18 May 2022 (UTC)

InternetArchiveBot adds archive links to the site, but the site appears to be down.--Yellow Horror (talk) 13:09, 28 April 2022 (UTC)

It's a bit of a tricky situation. We are considering invalidating the entire archive and replacing them with other working versions. —CYBERPOWER (Chat) 06:28, 29 April 2022 (UTC)
Pinging Yellow Horror.—CYBERPOWER (Chat) 18:22, 4 May 2022 (UTC)
This section is resolved and can be archived. If you disagree, replace this template with your comment. —CYBERPOWER (Chat) 17:56, 18 May 2022 (UTC)

Wishlist item: avoid causing named reference multiple cite errors

Tracked in Phabricator:
task T302917

As seen in this diff, when run on pages containing named references that are defined multiple times with identical content, IAbot only adds an archive link to the first appearance of each particular reference. This provokes angry bold red errors e.g. "Cite error: The named reference ":0" was defined multiple times with different content (see the help page)." My workaround is to manually deduplicate the reference content, replacing later appearances with <ref name="foo" />. It would be nice, however, if IAbot by itself simply copied the same change to all duplicates, or ran some sort of reference content deduplication as seen in AutoWikiBrowser. I apologize if this would be difficult to implement; I just think it would be a nice feature to have. – Anon423 (talk) 06:29, 4 May 2022 (UTC)
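
For illustration, the manual workaround described above can be sketched in Python. This is only a rough sketch under stated assumptions, not how IABot or AutoWikiBrowser work: the function name dedupe_named_refs and the regular expression are hypothetical, only double-quoted name attributes are handled, and a repeated reference whose content differs from the first definition is left untouched for a human to resolve.

```python
import re

# Hypothetical pattern: matches <ref name="...">content</ref>.
# Self-closing tags such as <ref name="foo" /> are deliberately not matched.
REF_RE = re.compile(r'<ref\s+name\s*=\s*"([^"]+)"\s*>(.*?)</ref>', re.DOTALL)


def dedupe_named_refs(wikitext: str) -> str:
    """Replace later identical definitions of a named ref with a self-closing tag."""
    first_content = {}  # ref name -> content of its first full definition

    def replace(match: re.Match) -> str:
        name = match.group(1)
        content = match.group(2).strip()
        if name not in first_content:
            # First definition: remember it and keep it as written.
            first_content[name] = content
            return match.group(0)
        if first_content[name] == content:
            # Identical repeat: collapse to a back-reference.
            return f'<ref name="{name}" />'
        # Same name but different content: leave it alone.
        return match.group(0)

    return REF_RE.sub(replace, wikitext)


sample = (
    'A<ref name="foo">{{cite web|url=http://example.com}}</ref> '
    'B<ref name="foo">{{cite web|url=http://example.com}}</ref>'
)
cleaned = dedupe_named_refs(sample)
```

Running a pass like this before the bot edits a page would leave only one full definition per name, so an archive link added to that definition could no longer make the copies diverge.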

@Anon423: This is actively being worked on. You can track the progress of this in the linked Phabricator ticket.—CYBERPOWER (Chat) 18:25, 4 May 2022 (UTC)
This section is resolved and can be archived. If you disagree, replace this template with your comment. —CYBERPOWER (Chat) 18:38, 11 May 2022 (UTC)

Thanks for the wonderful work you have done

Wow 🙏😍😍 Lethulake (talk) 05:19, 8 May 2022 (UTC)

@Lethulake: you are so very welcome. :-) —CYBERPOWER (Chat) 18:41, 11 May 2022 (UTC)

Errors in your robot's edits in Chinese Wikipedia

Recently, your robot has not been adding a dead-url=no parameter when it adds an archive to a live link. As a result, the original link is treated as a dead link by default.

For example: [4]

--Kcx36 (talk) 12:47, 9 May 2022 (UTC)

@Kcx36: Thank you for the report. I'm investigating a different, but related issue. I will look into this while I am at it. —CYBERPOWER (Chat) 18:52, 11 May 2022 (UTC)

again

Hi! In User_talk:InternetArchiveBot/Archive/ User:Harej said the domain has been taken off the list. When I check it today in it tells me it is (still) whitelisted/permanent live. Also is whitelisted and I can't change the status. I'm now an admin at so I thought I could change the status. Do I need further user rights? --MGA73 (talk) 12:58, 13 May 2022 (UTC)

@MGA73: Domain level whitelisting and delisting requires root level access. I can give you the specific permissions so you can modify the domain states as needed. Would you like to proceed? —CYBERPOWER (Chat) 13:03, 13 May 2022 (UTC)
@Cyberpower678: Yes please. As I understand it a reason to whitelist a domain could be if the bot marks a lot of good links as dead. --MGA73 (talk) 09:51, 14 May 2022 (UTC)
@MGA73: I have given you the permissions on dawiki. Give it a try. —CYBERPOWER (Chat) 16:41, 17 May 2022 (UTC)
This section is resolved and can be archived. If you disagree, replace this template with your comment. Seems to work fine. Thank you! --MGA73 (talk) 19:03, 18 May 2022 (UTC)

Duplication in nowp

Please see this diff, showing that an archive url is added to a reference that already has an archive url. Different parameter names are used (arkiv-dato vs. arkivdato, arkiv-url vs. arkiv_url and død-lenke vs. dødlenke). - 4ing (talk) 20:35, 14 May 2022 (UTC)
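
A rough way to spot such collisions is to reduce each parameter name to a canonical form and group the names that coincide. The Python sketch below assumes that aliases differ only in hyphens, underscores, spaces or letter case, which is a guess based on the three pairs named above; the real alias lists live in each wiki's template configuration, and the helper names canonical and find_alias_collisions are hypothetical.

```python
import re
from typing import Dict, List


def canonical(name: str) -> str:
    """Assumed normalisation: lower-case and drop hyphens, underscores and spaces."""
    return re.sub(r"[-_\s]", "", name.strip().lower())


def find_alias_collisions(params: List[str]) -> Dict[str, List[str]]:
    """Group parameter names by canonical form; keep only groups with 2+ spellings."""
    groups: Dict[str, List[str]] = {}
    for param in params:
        groups.setdefault(canonical(param), []).append(param)
    return {key: names for key, names in groups.items() if len(names) > 1}


# Parameter names as they might appear in one citation after a faulty edit.
collisions = find_alias_collisions(
    ["url", "arkiv-url", "arkiv_url", "arkiv-dato", "arkivdato", "død-lenke"]
)
```

Any non-empty result means the same logical parameter appears under two spellings in one template, which matches the kind of duplication reported in the diff.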

@4ing: Can you confirm this is still happening in v2.0.8.7? A fix was released to address that. —CYBERPOWER (Chat) 16:44, 17 May 2022 (UTC)
No, it seems that this occurred in early April. I find approx. 120 occurrences that haven't been fixed. - 4ing (talk) 16:40, 18 May 2022 (UTC)
@4ing: Check this no:Special:Diff/22598700. It adds "|død-lenke=no" == DeadURL or "nei". We had similar problems on and we decided to import a new copy from --MGA73 (talk) 17:09, 18 May 2022 (UTC)
Also, by "haven't been fixed", do you mean that the problem still exists, or that the bot "hasn't failed" (meaning it is now working correctly)? --MGA73 (talk) 17:11, 18 May 2022 (UTC)
Not fixed as in not rolled-back. - 4ing (talk) 17:56, 18 May 2022 (UTC)
Thanks 4ing. Are those all old errors or are any of those in v2.0.8.7? The problem with "DeadURL" is new. --MGA73 (talk) 19:15, 18 May 2022 (UTC)