Spam blacklist/Help

From Meta, a Wikimedia project coordination wiki

Welcome to the spam blacklist, the most exciting project on Meta. Everybody can help, and any help from anybody is greatly needed and appreciated. This blacklist exists to prevent excessive placement of external links that are not seen as being of value to the community, often placed disruptively and with no regard for the wishes of local editors. This includes sites that may well not be commercial in any sense (and could even be charitable).

Here are some hints to new blacklisters.

Checking requests

Everybody can help! This is the most time-consuming and backlogged phase. Simply check the data and use your good sense to decide whether the proposed site has been excessively linked against the wishes of the community. If so, you can request blacklisting or simply revert the edits. In any case, add your comments below the proposals, even if you take no action or decide to wait; such comments are very helpful.

  • Please use Luxo's tool to try and check the extent of the link placement. If it's placed on one or two wikis it may not be appropriate to list on Meta as local measures can be taken such as page protection or blocking.
  • Check that the link placement is current. The list exists to prevent disruption to wikis; if the links were placed some time ago, there is no ongoing disruption.
  • Please use MER-C's tool to review the number of links currently on wikis.

Bot reports

  • At the very least, review the link placements on the affected wikis to see whether the link is relevant and wanted by the community.
    • If the links have been removed or the user/IP placing the link has been warned, you may assume for the moment that it is an unwanted link. You can use Erwin's tool to check whether the reported links have been removed. You then need to judge whether the link is sufficiently unwanted or disruptive to warrant reverting or blacklisting.
    • If the links are not removed and appear to be relevant then "close" the report.
    • In general, where one or a few IPs (especially when they are open proxies or share a range) are pushing the link it is safe to revert the additions immediately. Similarly for new users where their only substantive edits are adding links.
  • Change the status of the report to "closed" after you have decided that it is not spam or you've added it to the blacklist. If you're not a sysop and you think it's spam please post a comment (for the next sysop) at the end of the report and leave the report open.

See also #COIBot reports below.

Adding new entries

  • Add your regex entries at the bottom of spam blacklist. If you are not sure what to do, look at the work of others, ask someone who knows, learn about regex, or check /regex.
  • Log your action; see #Logging below.
  • Archiving (see #Archiving below) needs to be done after a few days (appeals often arrive fairly soon after a site is blacklisted, so leaving it a few days is fine).
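
Before saving a new entry, it can help to check what the regex will and will not match. The following is a minimal sketch using Python's re module as a stand-in; the blacklist itself is evaluated by MediaWiki rather than Python, so treat the result as an approximation. The sample URLs are made up for illustration.

```python
import re

# Hypothetical sample URLs, not taken from any real report.
urls = [
    "http://spam.com/page",
    "http://www.spam.com/",
    "http://notspam.com/",
    "http://spam.community.example/",
]

def matches(pattern, url):
    """Return True if the pattern is found anywhere in the URL (case-insensitive)."""
    return re.search(pattern, url, re.IGNORECASE) is not None

loose = r"spam\.com"        # no word boundaries
strict = r"\bspam\.com\b"   # word boundaries on both sides

loose_hits = [u for u in urls if matches(loose, u)]
strict_hits = [u for u in urls if matches(strict, u)]

print("loose: ", loose_hits)
print("strict:", strict_hits)
```

The loose pattern also catches notspam.com and spam.community.example, which is why entries are usually wrapped in \b.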

Logging

Adding entries

Please use exactly the following format. The log file is read by machines, so please don't use comments like "see above" or "see following entry".

For requests on Talk:Spam blacklist

  1. Take the logging snippets from both Talk:Spam blacklist and Spam blacklist after your edits. (Mark the request as done, then take the logging snippet provided there; add your regex(es) to the blacklist and then take the snippet there.)
    • For additions, use {{sbl-diff|1161261}}; see {{sbl-log|1161258#{{subst:anchorencode:Example}}}}, which produces addition; see request
    • For removals, use {{sbl-diff|1161261|removal}}; see {{sbl-log|1161258#{{subst:anchorencode:Example}}}}, which produces removal; see request
    • For changes, use {{sbl-diff|1161261|change}}; see {{sbl-log|1161258#{{subst:anchorencode:Example}}}}, which produces change; see request
  2. For single entries, write the entry and log data on the same line.
  3. For multiple entries, put the log data on a line beginning with " #:", then list the entries on new lines below indented by three spaces to set the regexes apart.

For bot reports

  • The log entry is provided to you; simply complete the {{sbl-diff|#}}, using the number taken after your edit to the blacklist.
  • Put your username in place of ADMINNAME unless you use "COIBot log tool".
  • If you use "SBHandler" (on the Gadgets tab of my preferences), this is all done for you.

Example:
entry                          # Admin          # log data
----------------------------------------------------------
spam\.com                      # Mike.lifeguard # addition; see User:COIBot/XWiki/spam.com
#:                             # A.B.           # addition; see request
  \buuf\.com\b
  \bufu\.com\b
  \bexample\.com\b
  \b[fu]{3}\.ca\b
\buuf\.com\b                   # Herbythyme     # removal; see request
spam\.com → \bspam\.com\b      # Beetstra       # change; see request
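
Since the log is read by machines, it may help to see how a reader could split these lines. The sketch below is only an illustration of the format shown above, not the parser any actual tool uses; the function name parse_log and the record layout are assumptions.

```python
def parse_log(lines):
    """Parse log lines into records with 'entries', 'admin' and 'log' keys."""
    records = []
    open_group = None  # record started by a "#:" line that is still collecting entries
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#:"):
            # Multi-entry header: no regex on this line, the entries follow below.
            _, admin, log = line[2:].split("#", 2)
            open_group = {"entries": [], "admin": admin.strip(), "log": log.strip()}
            records.append(open_group)
        elif "#" in line:
            # Single entry: regex, admin and log data share one line.
            # maxsplit=2 keeps any later "#" (e.g. a section anchor) inside the log data.
            entry, admin, log = line.split("#", 2)
            records.append(
                {"entries": [entry.strip()], "admin": admin.strip(), "log": log.strip()}
            )
            open_group = None
        elif open_group is not None:
            # Indented regex belonging to the preceding "#:" line.
            open_group["entries"].append(line)
        else:
            raise ValueError(f"entry without log data: {line!r}")
    return records


sample = r"""
spam\.com                      # Mike.lifeguard # addition; see User:COIBot/XWiki/spam.com
#:                             # A.B.           # addition; see request
  \buuf\.com\b
  \bufu\.com\b
\buuf\.com\b                   # Herbythyme     # removal; see request
""".splitlines()

records = parse_log(sample)
for record in records:
    print(record["admin"], record["entries"], record["log"])
```

A comment such as "see above" gives a reader like this nothing to resolve, which is why each line needs its own complete log data.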

Using SBHandler

  • Enable the script on the Gadgets tab of my preferences.
  • Just hit the [add] link on the discussion section of COIBot reports & follow the instructions - the addition & log entry will be made for you. For COIBot reports, you're also provided a [reverted] option, for use when you've reverted the link additions, but blacklisting isn't necessary.
  • For additions/removals from Talk:Spam blacklist, use the [add] or [remove] links & follow the instructions - the edit & log entry will be made for you.

Appeals & removing entries

  • When appeals arrive the first thing to do is review the original listing.
    • Was a mistake made?
    • Were warnings given? In the case of links placed by changing IPs, it may well be decided that the behaviour was intentional in order to avoid blocks/warnings. However, for unchanging IPs or named users, warnings should have been given.
  • Please delete entries if you wish to remove them (after approval of the community in some form). It is possible to "de-activate" a listing by putting "#" at the start of the line; however, this should only be used as a temporary measure, in order to ensure the log does not get any larger than necessary.
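
To illustrate why a leading "#" de-activates a line, here is a rough sketch of how a list of this kind could be turned into one combined regex. It assumes that everything from "#" to the end of a line is treated as a comment and that the remaining fragments are joined as alternatives; the real processing is done by MediaWiki and may differ in detail. The function name active_patterns and the sample list are hypothetical.

```python
import re

def active_patterns(blacklist_text):
    """Return the regex fragments that remain after comments are stripped."""
    patterns = []
    for line in blacklist_text.splitlines():
        fragment = line.split("#", 1)[0].strip()  # drop everything from "#" onwards
        if fragment:
            patterns.append(fragment)
    return patterns

# Hypothetical list: the second entry is de-activated with a leading "#".
blacklist = r"""
\bspam\.com\b
#\bexample\.org\b
\bexample\.net\b
"""

patterns = active_patterns(blacklist)
combined = re.compile("|".join(patterns), re.IGNORECASE)

print(patterns)
print(bool(combined.search("http://example.org/")))
print(bool(combined.search("http://spam.com/")))
```

The commented line still takes up space in the list even though it blocks nothing, so deleting it is the cleaner option once removal is agreed.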

Archiving

COIBot reports

Below is a manual on how to handle the COIBot reports. See also COIBot/help for the IRC command manual.

Consider

  • When fewer than 4 wikis have been spammed by one user, consider just reverting and closing the report, unless the domain is really spammy. If there are new additions, COIBot will open the report again.
  • When more than 4 wikis have been spammed, look at how much the link is used. If it is used widely, consider its value, and if it is considered spam, clean up before blacklisting.
  • If a single editor suddenly starts using the link, consider asking the editor to contribute to the discussion.

The database

The database started around the end of February 2008. That means that a lot of old stuff is not in the database the bots are using. It also means that domains which are not used much may now suddenly appear as a cross-wiki problem.

What to do when several reports involve the same user or similar domains

When one cross-wiki (cw) report is linked to others, e.g. via the same username/IP, this is what I do:

1. I choose one main report, and link all the others there.

2. I put the username in a UserSummary template. COIBot does not do anything with this, but the template contains some useful links to other tools.




When I see this template, I instruct COIBot to generate the user report ('COIBot report user John Doe', for those who are active in the IRC channels).

3. For all the others, I put the LinkSummary template here:






COIBot picks up the diffs of these pages, parses the LinkSummary templates that were added, and from then on monitors these links (with a link to the diff you are going to save). If one of the related domains gets spammed, reports will appear quickly.

4. Then I do a 'show preview' and follow the link to domaintools for the server data, which states for example.com: Server Data, IP-address 208.77.188.166. The same data can also be found in a COIBot report; at the top it says: "example.com resolves to 208.77.188.166". I do that for all links (I actually use my IRC tool built into COIBot to get the data, but the result is the same):

  • example.com: 208.77.188.166
  • example.org: 208.77.188.166
  • example.net: 208.77.188.166

5. If the domains resolve to the same IP address:

  • {{LinkSummary|ip-address}}

can be added here. When these reports (or certain other on-wiki pages) change, COIBot parses the diff for added LinkSummary templates and starts monitoring the data in the template. If our spammer now changes username, or starts editing from an IP, COIBot will notice.
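
Steps 4 and 5 amount to grouping domains by the IP they resolve to. A minimal sketch, assuming the IPs have already been looked up by hand (via domaintools or a COIBot report); the function names are made up for illustration, and example.invalid with its address is a hypothetical extra entry added to show a non-shared IP.

```python
from collections import defaultdict

def group_by_ip(resolved):
    """Map each IP address to the list of domains that resolve to it."""
    groups = defaultdict(list)
    for domain, ip in resolved.items():
        groups[ip].append(domain)
    return dict(groups)

def link_summaries(resolved):
    """Return a LinkSummary line for every IP shared by two or more domains."""
    return [
        "{{LinkSummary|%s}}" % ip
        for ip, domains in sorted(group_by_ip(resolved).items())
        if len(domains) > 1
    ]

# Lookup results as listed above, plus one hypothetical unrelated domain.
resolved = {
    "example.com": "208.77.188.166",
    "example.org": "208.77.188.166",
    "example.net": "208.77.188.166",
    "example.invalid": "192.0.2.1",
}

groups = group_by_ip(resolved)
summaries = link_summaries(resolved)
print(summaries)
```

Only the shared IP yields a template line, since an IP used by a single domain adds nothing beyond the LinkSummary for that domain.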

6. The COIBot reports have a live URL of the IP of the server (e.g. [http://208.77.188.166 208.77.188.166]). That means that these reports can be found via Special:Linksearch. I also put the IP, and a linksearch for the IP, on the COIBot reports:

This is quite handy for finding all reports on 'spammed' domains on this server. E.g. you may get very scared when you follow Special:Linksearch/72.14.207.191 (the IP of blogspot.com; most are picked up by COIBot because <username>.blogspot.com is the same as their on-wiki username). I have asked nixeagle to have the resolved IP added automatically to these reports, but have not heard back yet.

7. Where there are multiple domains with one user, it is better to close them all and start one centralised discussion on Talk:Spam blacklist. This cleans up the cw list and makes the problem a bit more visible.

Sample cases

See also