Talk:Community Wishlist/Wishes/A way to filter edits by Citation bot, InternetArchiveBot and OAbot from diffs
This page is for discussions related to the Community Wishlist/Wishes/A way to filter edits by Citation bot, InternetArchiveBot and OAbot from diffs page.
Existing script
I'm aware of en:User:Nardog/RCMuter, does this cover your use case (at least for watch lists)? Sjoerd de Bruin (talk) 13:38, 7 August 2024 (UTC)
- Thanks, this is useful for hiding articles with only these edits from the Watchlist. It does not cover the case described here because it does not remove these users' edits from diffs. If the proposal is unclear, maybe the example diffs linked in it can help explain it.
- By the way, the usefulness of the feature proposed here is largest when there are many edits in a diff where it's already time-intensive and cluttered to go through all the changes even if bot edits were excluded. Prototyperspective (talk) 11:20, 12 August 2024 (UTC)
- I asked on the talk page of your link if the tool could be used / adjusted to hide bot edits from the Watchlist without hiding prior human edits (see section below) which is not what this proposal is about but would be useful in combination with it: it can't. Prototyperspective (talk) 12:48, 19 November 2024 (UTC)
Progress into focus area
Hello Prototyperspective, thank you for your submission, and thank you for joining the discussion of other submissions. You make the Community Wishlist better.
Please note we have added this wish to a focus area named “Make it easier for patrollers and other editors to prioritize tasks”. Please review the focus area and give feedback on the talkpage. If you are also ready to support this focus area, you may proceed to vote, remember to log in first and then use the support button. Thank you for submitting this wish once again. –– STei (WMF) (talk) 15:01, 19 August 2024 (UTC)
Excluding bot edits from the Watchlist doesn't work or does it?
I thought about adding the following to the proposal as a further problem this may address:
Another issue this may help address at some point is that bot edits can hide other edit summaries from the Watchlist. For example, when adding a category to an article and shortly thereafter a well-known bot edits the page, the edit summary of the cat addition does not show up on the Watchlist which reduces the visibility of edits that are constructive to check or read the edit summary thereof. People can even intentionally prompt bots like User:Citation bot to edit the page right after their edit so their change will go unnoticed (especially since there is no unseen edit count but only the latest edit summary on the Watchlist). This may also apply to Talk pages where posts are already barely read and get even less views if a bot like c:User:SpBot archives some older talk page posts shortly afterwards.
However, one can already exclude bot edits from the Watchlist. The problem is that it doesn't seem to work, and neither does filtering out AWB edits, which is another type of edit I'd like to filter out: instead of showing the latest non-bot non-AWB edit, the respective page is not shown on the Watchlist at all! This means people can just get a bot to edit a page to prevent their change from being checked by people who filter these edits out, alongside making it much less visible. Shouldn't the latest non-AWB edit be shown on the Watchlist instead? Is there a phabricator issue about this, and should the above text that I intended to add here be added with the info that filtering out bot edits is broken? For example, I filtered out AWB edits, then refreshed the Wikipedia Watchlist: it does not show the recent AWB edit on the Amino acid page, but it also doesn't show the latest normal edit from 6 August. The page "Amino acid" is not shown in my Watchlist anymore at all.
Could somebody please clarify?
@Keith D: this may also be relevant to your proposal about filtering out structured data edits, maybe the same problem exists there. Prototyperspective (talk) 14:26, 21 August 2024 (UTC)
- The change would probably cause the same problem, as the watchlist only shows the latest change that is relevant. Hiding any type of edit may not show a change to an item when in fact there is a change. Maybe the watchlist should show the latest change that is not excluded by the hide options since your last visit to the page. It could also indicate the number of changes between your last visit and the current state that are not excluded. Keith D (talk) 19:54, 2 September 2024 (UTC)
- "Maybe the watchlist should show the latest change that is not excluded by the hide options since your last visit to the page." Yes, that's what I thought it would do and what I meant would be shown on the Watchlist per this proposal (the hide options here being which bot edits are excluded from diffs). I didn't mean this would solve the problem; I'll probably make a separate proposal about that. I'm really very surprised this is currently the case, as people can do lots of problematic editing and then just prompt a bot to edit the page to make their changes hidden from the watchlist and go unchecked. It makes this whole hide-bot-edits feature useless except for Wikimedia projects (mainly WMC in particular) where files get edited so rarely that most of the time it's just one edit, and thus hiding bot edits from the watchlist really just hides bot diffs. The number of changes would be useful, but I think it needs a button to see a diff of all changes since last checked with one click, so this counter wouldn't be that useful when one can just click that anyway. Prototyperspective (talk) 22:48, 2 September 2024 (UTC)
- I submitted an issue about the problem of hiding bot edits also hiding human edits at phab:T375705. Prototyperspective (talk) 23:35, 25 September 2024 (UTC)
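The behaviour suggested in this thread (show the latest change that is not excluded by the hide options, plus a count of such changes since the last visit) can be sketched as follows. This is only an illustration under assumptions: the `Revision` class, its fields, and `latest_visible` are hypothetical names, not MediaWiki code, and the real watchlist query works differently.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class Revision:
    # Hypothetical record of one edit; field names are assumptions.
    rev_id: int
    user: str
    is_bot: bool
    summary: str


def latest_visible(
    revisions: list[Revision], last_seen_id: int, hide_bots: bool = True
) -> tuple[Optional[Revision], int]:
    """Return the newest unseen revision that passes the filter,
    and the number of unseen revisions that pass it."""
    visible = [
        r
        for r in revisions
        if r.rev_id > last_seen_id and not (hide_bots and r.is_bot)
    ]
    newest = max(visible, key=lambda r: r.rev_id, default=None)
    return newest, len(visible)


history = [
    Revision(101, "Alice", False, "Add category"),
    Revision(102, "Citation bot", True, "Alter: url"),
]
newest, count = latest_visible(history, last_seen_id=100)
```

With bot edits hidden, the page stays on the list and `newest` is Alice's edit rather than the page disappearing because its latest edit was made by a bot.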
Make more general
Maybe you could consider making this more general so that entries can be added or removed from the list of exclusions for the watchlist. Bots change over time and you may get a new one that you would like to exclude. Some users may want to see a specific bot, but not others. Keith D (talk) 20:49, 19 October 2024 (UTC)
- Thanks for this suggestion; it's a good point. However, I think the proposal already is as broad as you describe. See "There may be a couple additional bots one may consider adding to the list of bots whose changes can be excluded like GreenC bot, WikiCleanerBot, and maybe ClueBot."
- It would have these bots as the default setting and then allow users to specify bots or maybe to hide all bot edits. That could be done later because it may be best to keep it simple and extra efforts like allowing users to specify which bots to exclude could make this substantially more complex while the main value or use is excluding these bots. That is why it's not fully part of this proposal: to make it as easy as possible to implement and later improvements would go into subsequent issues. Maybe that part of the proposal could be clearer and something of what you suggested should be added to the quoted part. Prototyperspective (talk) 21:26, 19 October 2024 (UTC)
Implementation details ...
The diff engine shows the difference between 2 revisions - i.e. 2 strings of text. In order to do what you ask we'd need to change to diffing between individual revisions and then combining those diffs, which is a very big change
... and even if we did that I'm not sure this makes sense. If an edit is made by a bot, and then a subsequent edit is made to the bot-edited text by a human, how can we show an accurate diff without including the bot edit?
CParle (WMF) (talk) 11:28, 30 May 2025 (UTC)
- (replaced prior comment: seems like I mostly misunderstood and in any case it needs some other clarifications since a button to see the diff from the revision last seen to the latest revision is not needed for this wish)
- If the diff engine shows the difference between two revisions, one way could be to somehow add the changes made by bots to revision 1 so that when diffing to revision 2 it doesn't highlight these. Maybe some other revision control project has solved this already. If I can think of a more concrete way this could be implemented or find such a solution, I'll add it here.
- Eventually, it would show the change made by the bot as if it was done prior to the first selected diff. It would just not highlight these changes basically but only those made subsequently by humans. In other words, the diff by the bot is included in the resulting wikitext but not in the diff view.
- Maybe one could also subtract the bot changes from the diffs by checking the included diffs for an isBot variable and subtracting each of these from the diff view the user is viewing.
- --Prototyperspective (talk) 14:29, 30 May 2025 (UTC)
- For since-last-seen diffs there would be a clearer way to implement this to some extent: if the bot edit is the first unseen or the most recent unseen edit. In that case, one could compare from the revision after and/or before the bot edit. This would be useful to those who frequently check their watchlist, since for those who check it less frequently, bot edits would usually be somewhere in between. Prototyperspective (talk) 18:25, 30 May 2025 (UTC)
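The edge case described in the comment above (bot edits that are the first or the most recent unseen edits) can be sketched like this. It is a minimal illustration, not an existing MediaWiki function: `trim_bot_edges` and its input format are assumptions. Revision 0 stands for the last seen revision, and a diff from revision `base` to revision `target` covers the edits `base + 1` through `target`.

```python
from __future__ import annotations

from typing import Sequence


def trim_bot_edges(is_bot: Sequence[bool]) -> tuple[int, int]:
    """Pick the two revisions to compare so that bot edits at the start
    or end of the unseen range fall outside the diff.

    is_bot[i] tells whether revision i was made by a bot; index 0 is the
    last seen revision, whose own flag is irrelevant.
    Returns (base, target); base == target means no human edit remains.
    """
    base, target = 0, len(is_bot) - 1
    # Skip leading bot edits by moving the base revision forward.
    while base < target and is_bot[base + 1]:
        base += 1
    # Skip trailing bot edits by moving the target revision backward.
    while target > base and is_bot[target]:
        target -= 1
    return base, target


# Last seen revision, then: bot, human, human, bot.
base, target = trim_bot_edges([False, True, False, False, True])
```

Here the diff would run from revision 1 to revision 3, so only the two human edits are shown. Bot edits that sit between human edits are not handled by this approach.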
- I'm still struggling to see how to generalise this. Say I have the following series of edits:
- 1: One
- 2 (bot): Two
- 3: Too
- Would diff highlighting look like this?
One | Too
- I think that's kinda confusing tbh. And what if there was another bot edit?
- 4 (bot): Twelve
- What would the diff look like then? CParle (WMF) (talk) 15:07, 18 June 2025 (UTC)
- It would be
Two | Too
- Maybe it would make sense for diffs to just ignore any changes in ref templates like Cite Web (and later maybe extend the types of bot changes to ignore).
- Other example:
- 1: One<ref>{{Cite web|title=Bla|url=abc.com/123}}</ref>
- 2 (bot): One<ref>{{Cite web|title=Bla|url=abc.com/123|scid=f37nws2|archive-url=wb.com/ab12}}</ref>
- 3: On Two,<ref>{{Cite web|title=Bla|url=abc.com/123|scid=f37nws2|archive-url=wb.com/ab12}}</ref>
- This would show some diff like:
One | On Two,<ref>{{Cite web|title=Bla|url=abc.com/123|scid=f37nws2|archive-url=wb.com/ab12}}</ref>
- (where the bot's changes in the ref template are not included). All the bot changes would be excluded/subtracted. However, regarding my last comment, there is a clearer way for those diffs where the oldest and/or newest edit is a bot edit: for these, the diff would simply exclude that edit. That's not nearly as useful as also excluding bot edits that are in between. Prototyperspective (talk) 16:34, 18 June 2025 (UTC)
- Just wanna say up front - this would need a radical rewrite of the way we do diffs, and the end result would be much more complex and resource-intensive, so we'd need very widespread community support before we'd consider taking it on
- ... and that's assuming that it's possible at all. It's easy enough to imagine what to do about wikitext that is only modified by a bot, but I'm still struggling with changes that are made by a bot that are in turn modified by a human editor (and then maybe modified by a bot again, and a human again, etc)
- Can you tell me what the diff should look like for my example above between revisions 1 and 4? CParle (WMF) (talk) 10:32, 19 June 2025 (UTC)
- Yes I imagined it would be easier when I wrote the proposal so thanks for looking into it and pointing out the difficulty. Maybe somebody has a clever idea how this could be implemented. Because if it can be, it would be very very useful. Maybe some other version control project has the problem of excluding a particular type of changes in between already solved.
- I would imagine one way would be to check all the included diffs for a bot-flag and then somehow and I don't know how exactly remove the changes made in these diffs from the diff view which would already be useful if it worked in all cases (and here also note again that bots like those mentioned don't really edit the wikitext content but just the ref parameters – it could be implemented in some hacky way like virtually 'downmerging' the changes they did to the first revision so they don't show up highlighted in the diff view as if those parameters were already at that place in the first revision of the diff).
- I think the diff would look like
Two | Twelve
- Except if one implements the exclusion of the last and/or first diff if it's a bot edit, but that's not nearly as useful as any other way, as it's rare for many-changes diffs to have the bot edit last or first. Then it would be
Two | Too
- (keep in mind that this would be just some text in the ref template, not the wikitext content). Prototyperspective (talk) 12:29, 19 June 2025 (UTC)
- "Maybe some other version control project has the problem of excluding a particular type of changes in between already solved."
- If you find one that has then I'd love to hear about it!
- "Maybe it would make sense to in diffs just ignore any changes in ref templates like Cite Web"
- This would be even more difficult, as we'd need the diff engine to remove certain patterns from the "before" and "after" revisions before comparing them :/ CParle (WMF) (talk) 13:25, 19 June 2025 (UTC)
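The idea of removing certain patterns from the "before" and "after" revisions before comparing them can be sketched as below. This is a rough illustration using Python's `difflib` as a stand-in; it is not how MediaWiki's diff engine works. The names `REF_PATTERN`, `mask_refs` and `diff_ignoring_refs` are made up for the example, and the regular expression is a simplification that only handles non-nested `<ref>...</ref>` pairs.

```python
from __future__ import annotations

import difflib
import re

# Matches <ref ...>...</ref> but not self-closing <ref ... /> tags.
REF_PATTERN = re.compile(r"<ref\b[^>]*(?<!/)>.*?</ref>", re.DOTALL)


def mask_refs(wikitext: str) -> str:
    """Replace the content of every ref with a fixed placeholder."""
    return REF_PATTERN.sub("<ref/>", wikitext)


def diff_ignoring_refs(before: str, after: str) -> list[str]:
    """Word-level diff of two revisions after masking ref content.
    Returns only removed ('- ') and added ('+ ') words."""
    a = mask_refs(before).split()
    b = mask_refs(after).split()
    return [d for d in difflib.ndiff(a, b) if d[:2] in ("- ", "+ ")]


rev1 = "One<ref>{{Cite web|title=Bla|url=abc.com/123}}</ref>"
rev2 = (
    "One<ref>{{Cite web|title=Bla|url=abc.com/123"
    "|scid=f37nws2|archive-url=wb.com/ab12}}</ref>"
)
bot_only_changes = diff_ignoring_refs(rev1, rev2)
```

For the bot-only edit from the example in this thread, the masked revisions are identical, so nothing is reported. The cost is that human edits inside a ref become invisible too, which is one reason this is harder than it looks.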
- So in the meantime - what should we do with this wish? I feel like it's still unclear how to handle text that has been modified by users and bots, and show that in a way that makes sense, and then also this is technically difficult
- Maybe I could mark it as "blocked" while you have a look and see if you can find another version control project that solves this? And maybe we'll revisit in a month and archive if not? CParle (WMF) (talk) 13:30, 19 June 2025 (UTC)
- Setting the status to blocked sounds good. I think such wishes get a short clarification text that explains what the blocking issue is. I think it would be best to just keep it under the blocked status – maybe there could be some table filters / sorting that would allow filtering away blocked issues but it wouldn't be a problem if they show in the table so I don't think it should be archived but I'll try to find a way this could be implemented. If there's really no good/viable way to implement it, then the Archived status makes sense but I think it's quite possible that there is a way but it's not so easy to come up with it and develop it. (Another example: imagine if above every yellow/blue highlighting in the diff view the username(s) of those who edited it were displayed – changes could then be excluded via that except for segments that were edited by multiple where it's difficult to disentangle who edited which part.) Prototyperspective (talk) 20:41, 19 June 2025 (UTC)
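The last idea above (knowing which user edited each highlighted segment) resembles a per-token "blame". A minimal sketch, assuming word-level tokens and using Python's `difflib.SequenceMatcher` rather than any MediaWiki component, could look like this. The `blame` function and the sample revisions are hypothetical.

```python
from __future__ import annotations

from difflib import SequenceMatcher


def blame(revisions: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """revisions: (author, text) pairs, oldest first.
    Returns (token, author) pairs for the newest text, where author is
    whoever last introduced or changed that token."""
    first_author, first_text = revisions[0]
    tokens = first_text.split()
    owners = [first_author] * len(tokens)
    for author, text in revisions[1:]:
        new_tokens = text.split()
        new_owners: list[str] = []
        matcher = SequenceMatcher(a=tokens, b=new_tokens, autojunk=False)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag == "equal":
                # Unchanged tokens keep their previous owner.
                new_owners.extend(owners[i1:i2])
            else:
                # 'replace' and 'insert' add tokens owned by this author;
                # 'delete' adds none because j1 == j2.
                new_owners.extend([author] * (j2 - j1))
        tokens, owners = new_tokens, new_owners
    return list(zip(tokens, owners))


history = [
    ("base", "The cat sat"),
    ("Citation bot", "The cat sat quietly"),
    ("Alice", "The dog sat quietly"),
]
bots = {"Citation bot"}
human_changes = [
    tok for tok, who in blame(history) if who not in bots and who != "base"
]
```

In this sample only `dog` would be highlighted. The sketch only records who last touched each token. When a human changes text that a bot had just changed, the token is attributed to the human, and it does not settle what the "before" side of the diff should show, which is the difficulty raised in this thread.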