Community Wishlist Survey 2022/Archive/Purely adding keywards on Abusefilter
Purely adding keywards on Abusefilter
Duplicate of Community Wishlist Survey 2022/Anti-harassment/Expose more detailed diff information to the AbuseFilter
- Problem: For anti-vandalism work, we want an AbuseFilter rule that fires only when a keyword is purely added to an article, that is, when the edit itself introduces keywordA. Currently we have to check both added_lines and removed_lines with contains_any(), e.g.
contains_any(added_lines, "keywordA", "keywordB", "keywordC") & !contains_any(removed_lines, "keywordA", "keywordB", "keywordC")
or rewrite the check with a regex, e.g.
keywords := "(keywordA|keywordB|keywordC)"; (added_lines regex keywords) & !(removed_lines regex keywords)
This is needed to avoid false positives: when "This keywordA is good" is edited to "This keywordA is good, I am added", added_lines contains "keywordA" even though the edit does not add "keywordA". Such workarounds make filters hard to maintain, especially for less technically skilled users. Since this kind of keyword check is very efficient and widely used for anti-vandalism, an easy-to-use AbuseFilter function for checking whether a keyword was purely added (or purely removed) would be helpful.
- Who would benefit: Users doing anti-vandalism work
- Proposed solution: I have two ideas. Other ideas are also welcome.
- The lighter one is that contains_any() accepts an array of keywords as its argument, i.e.
keywords := ["keywordA", "keywordB", "keywordC"]; contains_any(added_lines, keywords)
Currently it supports only variadic arguments: contains_any(added_lines, "keywordA", "keywordB", "keywordC").
- The other one is to implement a new variable containing only the purely added/removed words. Note that extracting words is somewhat difficult in languages that do not put spaces between words (e.g. CJK). This sample is one of the simplest.
- More comments:
- Phabricator tickets:
- Proposer: aokomoriuta (talk) 02:58, 12 January 2022 (UTC)
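The false positive and both ideas above can be illustrated with a small Python sketch. This is not AbuseFilter rule syntax and not how the extension is implemented. The helper names changed_lines and purely_added_words are hypothetical, the line diff from difflib only approximates added_lines and removed_lines, and contains_any here takes a single list of keywords to mirror the proposed array form rather than the current variadic one.

```python
import difflib
import re
from collections import Counter


def changed_lines(old_text, new_text):
    """Rough stand-in for AbuseFilter's added_lines / removed_lines."""
    added, removed = [], []
    for line in difflib.ndiff(old_text.splitlines(), new_text.splitlines()):
        if line.startswith("+ "):
            added.append(line[2:])
        elif line.startswith("- "):
            removed.append(line[2:])
        # "? " hint lines and unchanged lines are ignored
    return added, removed


def contains_any(lines, keywords):
    """Substring check that takes the keywords as one list (assumed form)."""
    haystack = "\n".join(lines)
    return any(keyword in haystack for keyword in keywords)


def purely_added_words(old_text, new_text):
    """Words that occur more often in the new text than in the old text.

    Splitting on \\w+ only works for space-separated languages; CJK text
    would need a real word segmenter.
    """
    old_counts = Counter(re.findall(r"\w+", old_text))
    new_counts = Counter(re.findall(r"\w+", new_text))
    return set(new_counts - old_counts)


keywords = ["keywordA", "keywordB", "keywordC"]
old = "This keywordA is good"
new = "This keywordA is good, I am added"
added, removed = changed_lines(old, new)

# Checking added lines alone flags the edit although no keyword was added.
naive_hit = contains_any(added, keywords)
# The workaround from the proposal also requires the removed lines to be clean.
workaround_hit = naive_hit and not contains_any(removed, keywords)
# A word-level variable would make the check a simple membership test.
word_hit = bool(purely_added_words(old, new) & set(keywords))
```

For this edit the naive check matches while the workaround and the word-level check do not, which is the behaviour the proposal asks to make easier to express.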
Discussion
- "The other one is to implement a new variable including only purely added/removed words." See Community Wishlist Survey 2022/Anti-harassment/Expose more detailed diff information to the AbuseFilter. --Matěj Suchánek (talk) 09:02, 12 January 2022 (UTC)
- @aokomoriuta: Thank you for participating. I'm gonna archive this proposal in favor of Community Wishlist Survey 2022/Anti-harassment/Expose more detailed diff information to the AbuseFilter which seems to cover what you are describing. Let me know if you have any concerns. DMaza (WMF) (talk) 17:36, 21 January 2022 (UTC)