Please do not post any new comments on this page. This is a discussion archive first created on 01 November 2017, although the comments contained were likely posted before and after this date. See current discussion.
Performance improvements to global abuse filters
While setting up filters for the 2017 Community Wishlist Survey (see discussion), I noticed on Meta we're regularly hitting the condition limit. It looks like some global filters could be improved, which I believe require steward intervention:
- Special:AbuseFilter/95 – This could be changed to:
old_size == 0 & user_editcount < 10 & article_namespace rlike "^(0|2)$" & contains_any(added_lines, "Goji", "Gochi")
old_size == 0 will rule out more edits than the edit count check, and we can use a regex (fast in this case) on the namespace to consume one condition instead of two.
- Special:AbuseFilter/88 – Similar situation to above. Consider:
old_size == 0 & user_age == 0 & article_namespace rlike "^(0|2)$" & contains_any(lcase(added_lines+article_text), ...)
- Special:AbuseFilter/82, 91 and 94 could probably be combined.
- Special:AbuseFilter/80 – Put the old_size == 0 check as the first condition, as it is more likely to evaluate to false.
- Special:AbuseFilter/78 – (private, so I'm not being descriptive) The new_pst variable is very slow and, in my experience, may not always work. You could probably just use added_lines. Also put the article_namespace condition first, as it is more likely to evaluate to false.
- Special:AbuseFilter/72 and 76 – Use article_namespace rlike "^(0|2)$"
- Special:AbuseFilter/69 – Put the
- Special:AbuseFilter/46 – You can probably get rid of the action = "edit" condition because the other variables used (added_links) only apply to edits. This filter would normally trip on deleting the page, but we are only checking for non-autoconfirmed users. The added_links conditions could also be combined into one condition, like added_links rlike "one|two|three".
- Special:AbuseFilter/19 and 132 – Shouldn't need action == "edit".
- Special:AbuseFilter/138 – Use user_age == 0 instead of !"user" in user_groups (marginally faster), and put the irlike as the second condition rather than the last. In this case, doing the regex earlier might slow things down a little, but it will definitely reduce the condition count. We have a new logstash dashboard that shows which filters are very slow, and we can check if this one shows up (but I don't think it will).
- Special:AbuseFilter/141 – Similar to above, use user_age == 0 as the first condition, then I'd probably put the edit_delta condition as the second, followed by article_text, then the rest.
- Special:AbuseFilter/144 – I think you could use action contains "upload" instead of the two conditions checking the action.
- I changed some of them. I may do the rest later. Ruslik (talk) 18:55, 6 November 2017 (UTC)
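For anyone reading the archive later, the reordering argument in this thread can be illustrated with a rough Python sketch. It is only a toy model, not the AbuseFilter implementation: it assumes, as the suggestions above do, that conditions joined by & are evaluated left to right, that evaluation stops at the first false one, and that only evaluated conditions count toward the limit. The count_conditions helper, the check functions, and the sample edit are all hypothetical names made up for illustration.

```python
import re


def count_conditions(checks, edit):
    """Evaluate checks left to right, stopping at the first one that fails.

    Returns (matched, number_of_checks_evaluated). This is a toy stand-in
    for a short-circuiting chain of & conditions, not real AbuseFilter code.
    """
    used = 0
    for check in checks:
        used += 1
        if not check(edit):
            return False, used
    return True, used


def is_new_page(edit):
    return edit["old_size"] == 0


def is_new_user(edit):
    return edit["user_editcount"] < 10


def ns_is_main_or_user(edit):
    # One regex test in place of two separate namespace comparisons.
    return re.search(r"^(0|2)$", str(edit["namespace"])) is not None


def has_keyword(edit):
    return any(word in edit["added_lines"] for word in ("Goji", "Gochi"))


# Hypothetical edit: a newcomer changing an existing main-namespace page.
existing_page_edit = {
    "old_size": 5120,
    "user_editcount": 4,
    "namespace": 0,
    "added_lines": "hello",
}

# Hypothetical edit that should match the whole filter.
spam_page_edit = {
    "old_size": 0,
    "user_editcount": 0,
    "namespace": 2,
    "added_lines": "Buy Goji berries",
}

selective_last = [is_new_user, ns_is_main_or_user, is_new_page, has_keyword]
selective_first = [is_new_page, is_new_user, ns_is_main_or_user, has_keyword]

print(count_conditions(selective_last, existing_page_edit))
print(count_conditions(selective_first, existing_page_edit))
print(count_conditions(selective_first, spam_page_edit))
```

Under these assumptions, the ordinary edit to an existing page costs three checks when the page-size test comes third and only one when it comes first, while an edit that matches every condition costs the same either way. That is the whole case for putting the condition most likely to be false at the front.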
Spam pages on kiwiki
- Done Deleted. Ruslik (talk) 13:02, 11 November 2017 (UTC)
Page deletion in pt.wiki