Edit filters of cross-wiki interest

From Meta, a Wikimedia project coordination wiki
This is an essay. It expresses the opinions and ideas of some Wikimedians but may not have wide support. This is not policy on Meta, but it may be a policy or guideline on other Wikimedia projects. Feel free to update this page as needed, or use the discussion page to propose major changes.
  • This is a collaborative essay, and you are welcome to edit it, keeping in mind the objectives of the essay stated below.
  • This page is not about Global AbuseFilter, which is a different concept. This page covers publicly visible non-global filters from various projects that may also prove useful to other projects.

Please do not rename or move this page to "Abuse filter", since these filters do more than just prevent abuse. The purpose of this page and its talk page is limited to sharing constructive uses and benefits of publicly visible filters. Please do not engage in criticism, debates, or controversies on this page; there are ample other venues for those on Meta and other Wikimedia sites. If you feel that information about an essentially private filter is being disclosed, please have that information deleted. This page prefers that only those words and limited pieces of syntax that are critical to a filter's effectiveness be withheld from disclosure. Features of, or information about, private filters are not needed here, since effective practices from public filters can always be reused in private filters.

Edit filters have been in use on various Wikimedia projects since March 2009, almost six years at the time of writing. The purpose of this collaborative essay is to record the benefits of edit filters and to collect public edit filter code that has proved effective on some project, so that other projects can benefit from this knowledge and improve their own filters.

Benefits of edit filters

  • Edit filters can collect and provide the right information at the right time, which benefits users and patrollers alike.
  • Edit filters are a good management tool for smart and effective patrolling, and they save valuable time for patrollers and sysops.
  • Edit filters act in a supporting role as an effective deterrent against insults and abuse.
  • Because misuse is stopped at the point of entry, edit filters reduce the need to ban users permanently, to protect pages, or to intrude on individuals' privacy through checkuser (provided the filter does not interfere with the freedom to contribute constructively).

Prudence

  • There is no need to import every filter; the needs of each wiki project may differ.
  • Sometimes a filter is needed only as a deterrent; once the cause (for example, spam) stops, the filter can be deactivated.
  • In certain cases a group or cluster of filters is needed, each with its own specifics for target namespace, user type, and flag/tag/warn/disallow action.
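
The idea of a cluster can be sketched in a few lines of Python. This is only a model of the concept, not AbuseFilter rule syntax: the class `ClusterFilter`, the function `strongest_action`, the filter names, and the example cluster are all hypothetical, and the ordering tag < warn < disallow is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterFilter:
    """One member of a hypothetical filter cluster (not AbuseFilter syntax)."""
    name: str
    namespaces: frozenset     # namespace numbers the filter watches (0 = main)
    exempt_groups: frozenset  # user groups that bypass this filter
    action: str               # "tag", "warn" or "disallow"


# Assumed ordering of actions, from mildest to strictest.
SEVERITY = {"tag": 0, "warn": 1, "disallow": 2}


def strongest_action(cluster, namespace, user_groups, matched_names):
    """Return the strictest action among the filters that apply, or None."""
    applicable = [
        f.action
        for f in cluster
        if f.name in matched_names
        and namespace in f.namespaces
        and not (f.exempt_groups & set(user_groups))
    ]
    return max(applicable, key=SEVERITY.__getitem__, default=None)


# Example cluster: three filters that differ in namespace, exemptions and action.
EXAMPLE_CLUSTER = [
    ClusterFilter("ambiguous-word", frozenset({0}), frozenset({"sysop"}), "tag"),
    ClusterFilter("rarely-valid-word", frozenset({0}),
                  frozenset({"sysop", "autoconfirmed"}), "warn"),
    ClusterFilter("unique-abusive-word", frozenset({0, 1}), frozenset(), "disallow"),
]
```

In this sketch, a new user who trips both the "ambiguous-word" and "rarely-valid-word" filters in the main namespace gets a warning, while an autoconfirmed user making the same edit is only tagged.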

List of effective public filters

  • This section may prove to be useful to other small projects too.
Number in this table Filter name Used on wikis and links Purpose and benefits of the filter
1 Large deletion from article by new editors :en:EditFilter/30 The filter's message helps reduce large deletions made by mistake by new users
(right information at the right time)
2 Creating very small articles in main namespace (example) Many small wikis are overburdened with very small or empty articles. The response to this filter from new users is usually cooperative.

Unfortunately, in reality the bulk of small articles is created by experienced users (who may refuse to accept this fact unless shown the evidence). If small articles are a serious issue for the wiki, it can have a separate filter for edits below 4000 bytes (roughly two paragraphs) that trips after 6-10 small articles by autoconfirmed users within a period the wiki community agrees on.

3 Removal of Category :en:EditFilter/132 :en:EditFilter/117 Removal of categories by good-faith editors often comes with other mistakes as well as some good edits, so it is always better for a patroller to check such edits individually.
4 Removal of Templates
5 Spam in other script or language Two types of filter are deployed: script-based and word-based. It is usually better practice to exempt autoconfirmed users and to provide exceptions for legitimate use of other languages. In most cases permanent filters are not needed, since a spammer who has been deterred will not return immediately; one only needs to keep the filter ready to reactivate if the spammer returns.
Language- and script-related deterrence through a filter also automatically takes care of abusive words in other languages. Some of these filters may be private.
6 Filters to deter abusive words or abusive language constructs This case needs a cluster of three types of filter. Certain word formations are so distinctive that they can be disallowed altogether; certain word formations may, in rare cases, have a legitimate use, so only a warning is used; and certain word formations have double meanings, so they are either flagged or tagged.
7 Targeting Copyright abuse

First level) Verbatim copy-pasting tends to bring in many unencyclopedic peacock terms, weasel words, and avoidable adjectives. Excessive use of such terms, repeated across multiple articles, can be tagged and/or the user notified once filters trip with a rate limit of 3-4 instances in a certain period. This approach needs the support of patrollers or semi-automated bots to remove the easily avoidable terms.
Second level) Large unreferenced edits that exceed what a person can naturally write in a given period can be calculated and targeted through a cluster. (The basic arithmetic: writing two paragraphs, around 4000 bytes, in one's own language with proper source research, referencing, and wikification tends to take from a minimum of roughly 75 minutes to an average of 120 minutes.)

First in second level) Edits bigger than 7000 bytes without references (tag only): tag for manual inspection by patrollers. (In fact, anything above 4000 bytes that has no reference and contains the words mentioned in the first level may be copyrighted text, but a threshold below 7000 bytes may conflict with valid edits, such as tables of usual size and some other types of edit.)
Second in second level) Writing 4000 bytes four times (that is, more than 16000 bytes, approximately 8 paragraphs, without any reference at all, and in a short time) should need a minimum of 5 to 8 hours; to be safe, allow more time, something like up to 14 hours (this is the rate limit). Give exceptions to user groups such as sysops and experienced users, and to transfers from a sandbox.
In effect, this filter keeps a close watch on large unreferenced edits between 16000 bytes and 112,000 bytes made in 4 batches within something like 14 hours. The single-stroke large edit is covered by the next filter.
Third in second level) Single-stroke edits bigger than 28000 bytes without any referencing whatsoever. (Writing 28000 bytes of non-copyrighted text should take roughly 14 hours, so it is highly unusual for this amount of copyright-free text to arrive without references in the given time, unless it was transferred from a sandbox or mistakenly translated from another Wikipedia without its references.) Proper exceptions can be provided, based on the edit summary, for reverts, redirects, translations, and transfers from a sandbox.
If the wiki can support occasional but regular patrolling to inspect disallowed edits and restore the genuine ones among them, the second and third filters can safely be set to disallow on Wikipedias. If there is no time available for such restoration, simply provide a warning and leave it to the user to decide what to do next.
8 Wikipedia values awareness filter for socio-politically sensitive subject articles These articles tend to become targets of edit wars, and many users are not aware of wiki values. This filter can be the first in the range to reduce edit wars: it builds awareness by showing random messages with reminders about specific wiki values. The response can be good from cooperative users, but determined users are found to ignore these messages.
9 Filters to deter edit wars This needs a cluster of edit filters. On articles that are regular targets of edit warring, referencing can be made mandatory through filters, with a few exceptions, and a few more filters can be kept ready for serious edit warring.
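
The size thresholds described under filter 7 can be modelled roughly in Python. The arithmetic behind them: at 75 to 120 minutes per 4000 bytes, 16000 bytes takes 4 × 75 = 300 to 4 × 120 = 480 minutes (5 to 8 hours), and 28000 bytes takes 7 × 120 = 840 minutes (14 hours) at the slower rate. The sketch below is a model of the decision logic only, not AbuseFilter rule syntax; the names `Edit` and `classify` and the check labels are hypothetical, and reading "four batches" as four unreferenced edits of at least 4000 bytes each is an assumption.

```python
from dataclasses import dataclass

# Thresholds taken from the text of filter 7.
TAG_BYTES = 7_000      # tag-only threshold for unreferenced edits
BATCH_BYTES = 4_000    # size of one "batch" (about two paragraphs)
BATCH_COUNT = 4        # number of batches that trips the rate limit
WINDOW_HOURS = 14      # rate-limit window
SINGLE_BYTES = 28_000  # single-stroke threshold


@dataclass
class Edit:
    hour: float          # time of the edit, in hours since an arbitrary start
    added_bytes: int     # bytes added by the edit
    has_reference: bool  # whether the added text carries any reference


def classify(edit, earlier_edits, exempt=False):
    """Return the set of (hypothetical) second-level checks this edit trips."""
    tripped = set()
    if exempt or edit.has_reference:
        return tripped  # exempt users and referenced edits are never flagged
    if edit.added_bytes > TAG_BYTES:
        tripped.add("tag-large-unreferenced")
    if edit.added_bytes > SINGLE_BYTES:
        tripped.add("single-stroke")
    # Earlier unreferenced batches by the same user inside the window.
    recent = [
        e for e in earlier_edits
        if not e.has_reference
        and e.added_bytes >= BATCH_BYTES
        and 0 <= edit.hour - e.hour <= WINDOW_HOURS
    ]
    if edit.added_bytes >= BATCH_BYTES and len(recent) + 1 >= BATCH_COUNT:
        tripped.add("rate-limit")
    return tripped
```

Here `earlier_edits` stands for the same user's previous edits, and `exempt` stands for the exceptions mentioned above (sysops, experienced users, transfers from a sandbox); how a wiki detects those cases is left open.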

Useful tips for increasing effectiveness of public filters

For technical help, please refer to mw:Extension:AbuseFilter/Rules format or discuss on the talk page. Use this section for tips that are not yet listed on the help pages.

Edit_filter#Feature_request wish list

This is a simple wish list of feature requests (not necessarily technically correct or possible) that are expected to help frontline patrollers and edit filter managers/admins (who are not necessarily technical people). Its purpose is to make their work easier:

Number Feature request Benefit or purpose Phabricator no.
1 Special:AbuseLog should be filterable by action taken, e.g., "disallow" Disallowing filters and disallowed edits could be monitored more effectively (for false positives etc.) T50961
2 Special:Tags – Tagged changes to show total of only currently visible logged numbers making Special:Tags user-friendly for patrolling purposes T50369
3 Option to group filters & their logs Building and using a group of filters as a deterrent as and when needed; easier monitoring and improvement of a specific group of filters; easier for patrollers to patrol a specific group of tasks of their choice T49531
4 Additional options for the rate limit trigger More effective filter management; smarter messaging, which will reduce the feeling of harassment among unconfirmed editors; will help improve common editors' perception of edit filters as a friendly facility. T49493
5 Ability to match text based on a negative lookbehind/lookahead regex Edit filters work to the point and become more effective in controlling abuse T49495
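
To illustrate what request 5 asks for, here is a minimal sketch of negative lookbehind and lookahead using Python's `re` module. It shows the regex technique in general, not how AbuseFilter behaves; the word "spam" and the allowed contexts "anti-" and "-filter" are made-up examples, and `is_flagged` is a hypothetical helper.

```python
import re

# Match "spam" unless it is preceded by "anti-" (negative lookbehind)
# or followed by "-filter" (negative lookahead).
PATTERN = re.compile(r"(?<!anti-)spam(?!-filter)", re.IGNORECASE)


def is_flagged(text):
    """Return True if the text contains the word outside the allowed contexts."""
    return PATTERN.search(text) is not None
```

With such a pattern, a text like "buy spam now" would be flagged, while "our anti-spam policy" and "the spam-filter page" would pass, which is the kind of to-the-point matching the request describes.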