Community Wishlist Survey 2022/Anti-harassment/Expose more detailed diff information to the AbuseFilter

From Meta, a Wikimedia project coordination wiki

Expose more detailed diff information to the AbuseFilter

  • Problem: The level of "diff" information accessible in AbuseFilter is too crude. Therefore, many forms of vandalism cannot be correctly captured. An example is word-swapping vandalism where the same word may exist elsewhere in the same line or paragraph.
  • Proposed solution: A solution has already been proposed in phab:T220764.
  • Who would benefit: Wikis using AbuseFilter to fight vandalism
  • More comments: This is a rather specific proposal, so those who are unfamiliar with AbuseFilter and its limitations may not fully appreciate it. Fewer supporters may exist for this compared to some more generically defined proposals. I hope that is considered when comparing proposals.
  • Phabricator tickets: phab:T220764
  • Proposer: Huji (talk) 00:58, 11 January 2022 (UTC)

Discussion

  • I recently stumbled on an unusual interaction in the AbuseFilter extension which resulted in an accidental block. When moving a link around from one paragraph to another, the link is added to added_lines, but is not added to added_links. Improvements to AbuseFilter are most welcome! --Ivi104 02:21, 11 January 2022 (UTC)
    AbuseFilter on enwiki and some other wikis has the block option disabled, because community consensus holds that filter hits should be reviewed by a human for vandalism. However, this feature is sometimes too crude and can't be watched. I also think AbuseFilter should have more types of conditions. Thingofme (talk) 02:51, 11 January 2022 (UTC)
  • As an admin who reviews filter reports a lot at AIV, I would say that the best opportunity for improvement lies with those who write the individual filters, not the extension. Daniel Case (talk) 05:28, 11 January 2022 (UTC)
  • I just wrote Community_Wishlist_Survey_2022/Admins_and_patrollers/Expose_ORES_scores_in_AbuseFilter. While there is no overlap, I believe the two proposals are closely related. --Strainu (talk) 16:21, 11 January 2022 (UTC)
    Agreed; they are closely related, and I find that to be a good idea too. The distinction is that this is something for which we have a clear path forward, whereas the ORES proposal has some timeliness issues for which we don't have a good answer yet. Huji (talk) 01:40, 12 January 2022 (UTC)
  • @Huji: Thank you for proposing this. In order to better understand what you are saying I have a few questions I would like to ask you. Please keep in mind that I don't write abuse filters myself:
  1. Wouldn't word swapping be easily identifiable by looking at the diff size?
  2. If the problem you are trying to solve is to better identify what was changed that triggered the filter, wouldn't looking at the diff itself be more helpful instead of sifting through the variables table?
  3. The phab task proposes a solution by adding new variables (words added/removed). How will this be useful in edits that span multiple lines when trying to identify word swapping?
  4. Is there any other use case that is not covered by the current variables/functions that will benefit from these variables? If you can provide examples, that would be great.
Also, could you please update the phab task with a working link? The current one does not work anymore. DMaza (WMF) (talk) 16:55, 14 January 2022 (UTC)
@DMaza (WMF): great questions!
  1. Not necessarily. An example is the swapping of the words "Kurd" and "Turk" (well, actually, the Persian words کرد and ترک respectively), which happens a lot on fawiki. They are words of the same length, so the diff size is 0. Logic like added_lines rlike 'Kurd' & removed_lines rlike 'Turk' won't cut it either, because the rest of the paragraph could (and often does) include the words Kurd and Turk as well. We have vandals who specifically do word swaps. In a diff, we can see exactly which word was swapped (this diff shows that "a" was replaced with "some"), but in AbuseFilter we don't have a corresponding variable. What I am thinking of is something like added_words or added_characters, in addition to the line-level added_lines which we already have.
  2. Yes, except the diff-related variables in AbuseFilter are also at the line level, not the phrase level. Further below, I have pasted what edit_diff's value would look like for the diff I linked above; you will note that even in these variables, you still don't see the words "a" and "some" distinguished in any way that can be used programmatically in an AbuseFilter. Essentially, we have capabilities in MW diff which we don't expose in AbuseFilter at all.
  3. In the diff example above (or the actual Kurd/Turk example) you could look for added_words contains 'Kurd' & removed_words contains 'Turk' (here I assume added_words and removed_words are arrays).
  4. Many of the use cases that are currently using added_lines or removed_lines may benefit from using these new proposed variables in addition to or instead of the existing variables. Huji (talk) 19:17, 14 January 2022 (UTC)
; edit_diff
'@@ -1,1 +1,1 @@
-Here is a word in a sentence.
+Here is some word in a sentence.
'
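The word-level information missing from the edit_diff value above can be sketched as follows. This is an illustration in Python, not AbuseFilter code: word_diff is a hypothetical helper, added_words and removed_words are the variables proposed in this thread (they do not exist in AbuseFilter today), and splitting on whitespace is an assumption that will not suit every language.

```python
from difflib import SequenceMatcher


def word_diff(old_line, new_line):
    """Return (removed_words, added_words) for a pair of lines.

    Hypothetical helper: tokenizes on whitespace, which is an assumption.
    """
    old_words = old_line.split()
    new_words = new_line.split()
    removed, added = [], []
    matcher = SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            removed.extend(old_words[i1:i2])
        if tag in ("replace", "insert"):
            added.extend(new_words[j1:j2])
    return removed, added


# The two lines from the edit_diff value quoted above.
removed_words, added_words = word_diff(
    "Here is a word in a sentence.",
    "Here is some word in a sentence.",
)
print(removed_words, added_words)  # prints: ['a'] ['some']
```

With output like this, a rule along the lines of added_words contains 'Kurd' & removed_words contains 'Turk' would depend only on the words that actually changed, not on what else the line happens to contain.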
  • As I wrote on Community Wishlist Survey 2022/Archive/Purely adding keywards on Abusefilter, more detailed diffs (e.g. at the word level) would help us. However, splitting a sentence into words could be difficult in some languages, especially those that don't use spaces to separate words. So I worry about the quality of the algorithm. We wouldn't use the new variables for word-level diffs if they gave us many false positives/negatives.
    Here, I strongly recommend implementing an alternative method for accurate detection at the same time as the algorithm to extract word diffs. One idea is simply to make contains_any() accept an array of keywords as its argument (i.e. keywords := ["keywordA", "keywordB", "keywordC"]; contains_any(added_lines, keywords)); it currently supports only variadic arguments: contains_any(added_lines, "keywordA", "keywordB", "keywordC"). This simple method is enough to detect added words by comparing added_lines with removed_lines (i.e. contains_any(added_lines, "A") & !contains_any(removed_lines, "A") detects the addition of word "A"). --aokomoriuta (talk) 03:59, 29 January 2022 (UTC)
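The array-accepting contains_any() idea can be emulated roughly as below. This is a Python sketch, not AbuseFilter syntax: the list form is the extension being proposed, newly_added_keywords is a hypothetical helper name, and plain case-sensitive substring matching is assumed.

```python
def contains_any(haystack, *needles):
    """Emulate contains_any(); also accept a single list of keywords.

    Accepting a list is the proposed extension, not current behaviour.
    """
    if len(needles) == 1 and isinstance(needles[0], (list, tuple)):
        needles = tuple(needles[0])
    return any(needle in haystack for needle in needles)


def newly_added_keywords(added_lines, removed_lines, keywords):
    """Keywords found in added_lines but absent from removed_lines."""
    return [
        keyword
        for keyword in keywords
        if contains_any(added_lines, keyword)
        and not contains_any(removed_lines, keyword)
    ]


keywords = ["some", "word"]
hits = newly_added_keywords(
    "Here is some word in a sentence.",  # added_lines
    "Here is a word in a sentence.",     # removed_lines
    keywords,
)
print(hits)  # prints: ['some']
```

Because the comparison is still made at the line level, a keyword that also appears somewhere in the removed line is not reported, which is the Kurd/Turk situation described earlier in this discussion.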

Voting