Talk:Spam blacklist

From Meta, a Wikimedia project coordination wiki
The associated page is used by the MediaWiki Spam Blacklist extension, and lists regular expressions which cannot be used in URLs in any page in Wikimedia Foundation projects (as well as many external wikis). Any meta administrator can edit the spam blacklist, either manually or with SBHandler. For more information on what the spam blacklist is for, and the processes used here, please see Spam blacklist/About.

Proposed additions
Please provide evidence of spamming on several wikis. Spam that only affects a single project should go to that project's local blacklist. Exceptions include malicious domains and URL redirector/shortener services. Please follow this format. Please check back after submitting your report, as there could be questions regarding your request.
Proposed removals
Please check our list of requests which repeatedly get declined. Typically, we do not remove domains from the spam blacklist in response to site-owners' requests. Instead, we de-blacklist sites when trusted, high-volume editors request the use of blacklisted links because of their value in support of our projects. Please consider whether requesting whitelisting on a specific wiki for a specific use is more appropriate - that is very often the case.
Other discussion
Troubleshooting and problems - If there is an error in the blacklist (i.e. a regex error) which is causing problems, please raise the issue here.
Discussion - Meta-discussion concerning the operation of the blacklist and related pages, and communication among the spam blacklist team.
#wikimedia-external-links - Real-time IRC chat for co-ordination of activities related to maintenance of the blacklist.
Whitelists
There is no global whitelist, so if you are seeking a whitelisting of a url at a wiki then please address such matters via use of the respective Mediawiki talk:Spam-whitelist page at that wiki, and you should consider the use of the template {{edit protected}} or its local equivalent to get attention to your edit.

Please sign your posts with ~~~~ after your comment. This leaves a signature and timestamp so conversations are easier to follow.


Completed requests are marked as {{added}}/{{removed}} or {{declined}}, and are generally archived quickly. Additions and removals are logged · current log 2021/01.

Projects

snippet for logging
{{sbl-log|20979722#{{subst:anchorencode:SectionNameHere}}}}

Proposed additions

This section is for proposing that a website be blacklisted; add new entries at the bottom of the section, using the basic URL so that there is no link (example.com, not http://www.example.com). Provide links demonstrating widespread spamming by multiple users on multiple wikis. Completed requests will be marked as {{added}} or {{declined}} and archived.

spammed search url



Regex requested to be blacklisted: \brt\.com/search. This search URL is being link-spammed by spambots.  — billinghurst sDrewth 08:38, 11 January 2021 (UTC)
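
To illustrate which links the requested entry would catch, here is a minimal sketch in Python. It is not the extension's code: the URL_PREFIX wrapper is an assumption made for this example (the idea being that entries are only tested against URLs), Python's re module stands in for the regex engine the extension actually uses, and the test URLs are hypothetical.

```python
import re

# Blacklist entry quoted in the request above.
ENTRY = r"\brt\.com/search"

# Assumption for this sketch: an entry is only tested against URLs, so a
# scheme-and-host prefix is prepended. The real extension builds its own wrapper.
URL_PREFIX = r"https?://[a-z0-9_\-.]*"


def is_blocked(url: str, entry: str = ENTRY) -> bool:
    """Return True if the URL matches the wrapped blacklist entry."""
    pattern = re.compile(URL_PREFIX + "(?:" + entry + ")", re.IGNORECASE)
    return pattern.search(url) is not None


if __name__ == "__main__":
    # Hypothetical URLs, chosen to show the effect of the leading \b.
    for url in (
        "https://www.rt.com/search?q=example",
        "https://rt.com/search",
        "https://www.rt.com/news/",
        "https://art.com/search",
    ):
        print(url, is_blocked(url))
```

The leading \b is what keeps a host such as art.com out of scope: there is no word boundary between "a" and "r", so the entry cannot start matching there.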

qr.ae



URL shortener for Quora. There are some usages on English wiki, and looks like a handful in the top 20, enough to stop it early. Ravensfire (talk) 19:02, 14 January 2021 (UTC)

@Ravensfire: Question, is it being abused, or is it a domain where we have had abuse? Typically where they are non-dangerous and dedicated redirects they have been left alone, e.g. Washington Post. Is there a requirement to focus people through the main domain name for continuity or consistency?  — billinghurst sDrewth 22:06, 14 January 2021 (UTC)
@Billinghurst: I haven't seen anything significant. About the only potentially harmful use would be if specific Quora pages were blacklisted, this would be a work-around, similar to youtu.be. Actual harm right now, I don't see any. I guess my habit is to list shorteners when I see them, especially if they are generic. Ravensfire (talk) 22:36, 14 January 2021 (UTC)
I hear you, finding the balance is always the fun thing. I am not averse to doing it as required, though I would want to see a consensus to do so, rather than acting on a single request. I will leave it open and see what appears over the next while.  — billinghurst sDrewth 22:46, 14 January 2021 (UTC)

Proposed additions (Bot reported)

This section is for domains which have been added to multiple wikis as observed by a bot.

These are automated reports. Please check the records and the link thoroughly, as the bot may report good links! For some more info, see Spam blacklist/Help#COIBot_reports. Reports will automatically be archived by the bot when they get stale (fewer than 5 links reported, which have not been edited in the last 7 days, and where the last editor is COIBot).

Sysops
  • If the report contains links to fewer than 5 wikis, then only add it when it is really spam
  • Otherwise just revert the link-additions, and close the report; closed reports will be reopened when spamming continues
  • To close a report, change the LinkStatus template to closed ({{LinkStatus|closed}})
  • Please place any notes in the discussion section below the HTML comment

COIBot

The LinkWatchers report domains meeting the following criteria:

  • When a user mainly adds this link, and the link has not been used too much, and this user adds the link to more than 2 wikis
  • When a user mainly adds links on one server, and links on the server have not been used too much, and this user adds the links to more than 2 wikis
  • If ALL links are added by IPs, and the link is added to more than 1 wiki
  • If a small range of IPs have a preference for this link (but it may also have been added by other users), and the link is added to more than 1 wiki.
COIBot's currently open XWiki reports
List | Last update | By | Site IP | R | Last user | Last link addition | User | Link | User - Link | User - Link - Wikis | Link - Wikis
aka.ms | 2021-01-21 19:00:06 | COIBot | 23.199.10.244 | R | Agusbou2015, Darrenjefford, Mhamilton723 | 2070-01-01 05:00:00 | 221 23
androidicto.com | 2021-01-21 17:12:23 | COIBot | 198.20.125.109 | | 31.4.245.175, 83.34.16.214 | 2070-01-01 05:00:00 | 5 2
anne-marie.eu | 2021-01-21 18:26:28 | COIBot | 141.138.169.228 | | 83.128.26.42 | 2070-01-01 05:00:00 | 7 6 0 0 2
ante-rokov-jadrijevic.blogspot.hr | 2021-01-21 14:42:16 | COIBot | 172.217.9.193 | R | Dvanaesti Igrač, François-Ávila | 2070-01-01 05:00:00 | 4 2
ante-rokov-prometeus.blogspot.hr | 2021-01-21 14:41:01 | COIBot | 172.217.9.193 | R | Dvanaesti Igrač, François-Ávila | 2070-01-01 05:00:00 | 4 2
betadarou.com | 2021-01-21 15:02:19 | COIBot | 49.12.129.143 | | Nakhand2 | 2070-01-01 05:00:00 | 8 7 0 0 2
daliteratura.blogspot.pt | 2021-01-21 11:10:30 | COIBot | 172.217.9.193 | R | Mikceo | 2070-01-01 05:00:00 | 2507 3 0 0 2
das-bild-des-orients.info | 2021-01-21 12:39:15 | COIBot | 80.67.17.109 | | DBdO | 2070-01-01 05:00:00 | 12 11 0 0 2
isg-luxury.fr | 2021-01-21 10:42:06 | COIBot | 51.83.109.163 | | 2A01:CB00:B51:3E00:6C4B:F2A:7A05:5CFC | 2070-01-01 05:00:00 | 1 5 0 0 3
newsaffinity.com | 2021-01-21 17:08:07 | COIBot | 104.21.5.3 | | Apoorvgupta1619, Fijigirl333, Macr22222, Nklepper, Tomilku, 185.248.12.253, 185.248.13.25, 27.34.26.187 | 2070-01-01 05:00:00 | 20 3
slynt.blogspot.se | 2021-01-21 12:09:51 | COIBot | 172.217.9.193 | R | ArmSirius | 2070-01-01 05:00:00 | 2521 3 0 0 2
zolotivorota.com.ua | 2021-01-21 11:59:32 | COIBot | 185.224.138.137 | | 176.38.62.117, 185.181.246.95, 185.181.247.130, 46.8.110.4 | 2070-01-01 05:00:00 | 6 6

Proposed removals

This section is for proposing that a website be unlisted; please add new entries at the bottom of the section.

Remember to provide the specific domain blacklisted, links to the articles they are used in or useful to, and arguments in favour of unlisting. Completed requests will be marked as {{removed}} or {{declined}} and archived.

See also recurring requests for repeatedly proposed (and refused) removals.

Notes:

  • The addition or removal of a domain from the blacklist is not a vote; please do not bold the first words in statements.
  • This page is for the removal of domains from the global blacklist, not for removal of domains from the blacklists of individual wikis. For those requests please take your discussion to the pertinent wiki, where such requests would be made at Mediawiki talk:Spam-blacklist at that wiki.

casino.ru



As mentioned in Talk:Spam blacklist, this domain was blacklisted because: ‘Spammed by numerous IPs on Russian and Ukrainian Wikipedias. --Mercy 15:33, 23 December 2009 (UTC)’. I found this site useful for gambling articles. There are original interviews, articles and news. For example, Gambling in Ukraine: the article in English is poor and needs a lot of work, and references from native speakers' sites would be good. https://en.wikipedia.org/wiki/Craps https://en.wikipedia.org/wiki/Casino_token https://en.wikipedia.org/wiki/Gambling_in_Ukraine https://en.wikipedia.org/wiki/Gambling_in_Macau https://en.wikipedia.org/wiki/Gambling_age https://en.wikipedia.org/wiki/Gambling https://en.wikipedia.org/wiki/Gaming_law

Sort of an EP per User talk:SmurFF2020. Camouflaged Mirage (talk) 12:04, 28 December 2020 (UTC)
Comment: Keen to defer to en for local whitelisting though, as globally there is still some undesired impact. @Ohnoitsjamie: is it possible to whitelist locally? Yes, it cannot be removed from the en blacklist as it's on meta, but there is a possibility of using the local whitelist. I am uncomfortable removing it globally just because some pages on one wiki need it, as it's clearly spammy (and that is true on ru/uk wp). Camouflaged Mirage (talk) 12:07, 28 December 2020 (UTC)
@SmurFF2020: Best that you ask at w:mediawiki talk:spam-whitelist and ask for local whitelisting. Asking there will create a local record and enable a local conversation.  — billinghurst sDrewth 12:10, 28 December 2020 (UTC)
I see no reason to remove it globally. The user who requested removing it from the en blacklist has no edits there; we rarely whitelist for new users, as it usually suggests the strong possibility of a WP:COI. Ohnoitsjamie (talk) 19:14, 28 December 2020 (UTC)

1000fragen.de



imho the blacklisting was done in error, because there is no indication for spamming.
1000fragen.de is a redirect now. but it wasn't a redirect some time ago. as the domain is blacklisted, this now leads to problems, if somebody wants to add an archived url.

so i'm going to unblacklist that domain, now. -- seth (talk) 16:21, 29 December 2020 (UTC)

@Lustiger seth: I disagree with the action, as it was spammed, and will now only be spammed elsewhere. That will not have been the only instance at that time. I would suggest that you whitelist at your wiki instead.  — billinghurst sDrewth 21:27, 29 December 2020 (UTC)
hi!
where is/was the spam? at User:COIBot/XWiki/1000fragen.de i see only one link addition.
if there is (at least) a moderate risk of further spamming, then we can go that way of whitelisting the domain at dewiki. but what are the indications for spamming in this case? -- seth (talk) 22:58, 29 December 2020 (UTC)
It was +++ months ago, so remembering the specifics of one case among many won't be happening. It is just not my practice to blacklist based on one edit, unless a site is truly problematic. COIBot never shows them all, unless it meets the formula, and global-search becomes the check. I have set the domain to monitor and will come back if we are having problems.  — billinghurst sDrewth 01:26, 30 December 2020 (UTC)
ok, thanks. let's wait and see. (next time it would be great if you could be a bit more precise when writing the reason for blacklisting at the corresponding XWiki page.) -- seth (talk) 09:39, 30 December 2020 (UTC)

goodtherapy.org



This blacklisting is quite unfortunate for Wikipedia. I have started creating articles in psychology for the French Wikipédia, and this site would be immensely useful. I would, for example, need to use references from goodtherapy for articles about Murray Bowen, Virginia Axline and Family Systems Therapy. Their articles appear clear, concise and thorough, which are invaluable advantages for our work on Wikipedia. Thank you for considering my humble request. --Liberlogos (talk) 20:16, 2 January 2021 (UTC)

@Liberlogos: It was blacklisted years ago at the request of the Wikipedias. I suggest that you talk to the admins at both sites to see if they will whitelist it. frwp: Defer to w:fr:Mediawiki talk:spam-whitelist, and enwp: Defer to w:en:Mediawiki talk:spam-whitelist. We are unlikely to remove it without other positive comment.  — billinghurst sDrewth 13:56, 3 January 2021 (UTC)

Discussion

This section is for discussion of Spam blacklist issues among other users.

New script

Since I recently came across a situation where I had to track down which line in a spam blacklist caused a hit, and since it was fairly tedious even with mediawiki.org's short local blacklist, I wrote a script to automate checking all of the lines of the blacklist to find a match: User:DannyS712/FindBlacklistEntry.js. I'll probably move it to meta and write documentation at some point, but I hope this is helpful to anyone else that deals with unexpected hits - it checks both the local and global blacklists. Let me know if there are any questions. Thanks, --DannyS712 (talk) 19:31, 16 December 2020 (UTC)

Thanks for letting us know. I shall take a look at it :-) —MarcoAurelio (talk) 16:33, 17 December 2020 (UTC)
Documentation now at User:DannyS712/FindBlacklistEntry DannyS712 (talk) 03:00, 18 December 2020 (UTC)
hi!
i guess, you could have also just used https://searchsbl.toolforge.org/ to find the corresponding entry.
however, it might be nice to have a more internal tool (such as yours). -- seth (talk) 16:30, 29 December 2020 (UTC) 22:59, 29 December 2020 (UTC)
👍👍👍 It is my go-to tool Lustiger seth. Coupled with COI's IRC "findrules" and "wherelisted".  — billinghurst sDrewth 21:21, 29 December 2020 (UTC)
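
The kind of check described at the start of this thread, testing each blacklist line in turn to find the one that causes a hit, can be sketched roughly as follows. This is not the code of User:DannyS712/FindBlacklistEntry.js or of searchsbl; it is a minimal Python illustration under stated assumptions: the blacklist is plain text with one regex per line, everything after a # is a comment, and Python's re module is close enough to the extension's regex engine for the entries involved. The function name and the sample list are hypothetical.

```python
import re
from typing import List, Tuple

# Hypothetical sample list, using the assumed format described above.
SAMPLE_BLACKLIST = r"""# hypothetical sample list
\brt\.com/search
\bexample\.org\b   # hypothetical entry
"""


def find_matching_entries(url: str, blacklist_text: str) -> List[Tuple[int, str]]:
    """Return (line number, entry) for every blacklist line that matches url."""
    hits = []
    for line_no, raw in enumerate(blacklist_text.splitlines(), start=1):
        # Drop the comment part; this naive split assumes no entry contains '#'.
        entry = raw.split("#", 1)[0].strip()
        if not entry:
            continue
        try:
            pattern = re.compile(entry, re.IGNORECASE)
        except re.error:
            # Skip lines that are not valid in Python's regex dialect.
            continue
        if pattern.search(url):
            hits.append((line_no, entry))
    return hits


if __name__ == "__main__":
    print(find_matching_entries("https://www.rt.com/search?q=x", SAMPLE_BLACKLIST))
```

Testing line by line is slower than one combined regex, but it reports the line number of each hit, which is the information needed when tracking down an unexpected block.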

wiki pages about spammed urls (e.g. aliexpress.com)



as the domain is fully blocked, even at de:AliExpress there is no link. possible solutions:

  1. use the edit filter instead -> disadvantage: more complicated maintenance
  2. use \baliexpress\.com\b. (i.e. require a char after domain name for blocking, s.t. only deep links are blocked) -> disadvantage: spammers could still use links to the main page https://aliexpress.com
  3. variation of 2. by local whitelisting \baliexpress.com(?!.) -> same disadvantage as 2., but only locally.
  4. temp unblock aliexpress.com, add the link to the wiki article, then reblock the domain -> disadvantage: if anybody deletes the url, the re-addition might get too complicated for normal users.
  5. variation of 4. by temp local whitelisting -> disadvantage: this might confuse admins, cause it looks like the domain was blacklisted and yet had been added.

the second option is fine in some cases, but i'm not sure whether aliexpress.com is such a case. so at the moment i tend to 4., i.e. temp unblocking here.
other opinions? other options? -- seth (talk) 08:38, 5 January 2021 (UTC)
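
To make options 2 and 3 concrete, the sketch below compares the regexes quoted in the list above on a root link and a deep link. It is an illustration only: Python's re module stands in for the extension's regex engine, the rule that a whitelist match overrides a blacklist match is an assumption of this sketch, and the deep link and function names are hypothetical.

```python
import re

# Regexes as quoted in the options above.
FULL_BLOCK = re.compile(r"\baliexpress\.com\b", re.IGNORECASE)          # current entry
OPTION_2 = re.compile(r"\baliexpress\.com\b.", re.IGNORECASE)           # needs a char after the domain
OPTION_3_WHITELIST = re.compile(r"\baliexpress.com(?!.)", re.IGNORECASE)  # nothing may follow the domain


def blocked_option_2(url: str) -> bool:
    """Option 2: the blacklist entry itself only hits deeper links."""
    return OPTION_2.search(url) is not None


def blocked_option_3(url: str) -> bool:
    """Option 3: full block stays, a local whitelist entry exempts the bare root.

    Assumption of this sketch: a whitelist match overrides a blacklist match.
    """
    if OPTION_3_WHITELIST.search(url) is not None:
        return False
    return FULL_BLOCK.search(url) is not None


if __name__ == "__main__":
    root = "https://aliexpress.com"
    deep = "https://aliexpress.com/item/1.html"  # hypothetical deep link
    for url in (root, deep):
        print(url, blocked_option_2(url), blocked_option_3(url))
```

In this sketch both variants let the bare root through and block the deep link, which is the shared disadvantage named in the list. A root written with a trailing slash has a character after the domain, so both variants would block it here.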

@Lustiger seth: You have to take into account that option 1 is an enormous strain on the server given the humongous number of regexes we have. Options 2 and 3 are indeed a problem if the main page gets spammed (pornhub.com is the prime example there: schoolkids have fun changing the webpage of their school to that domain, which happens regularly, see en:Special:Log/spamblacklist/153.33.4.30), and they don't block the 'buy my product [https://aliexpress.com here]' type of spam (spammers spam to make money, and they go to great lengths to get it). Options 4 and 5 are indeed a solution, but difficult to maintain (repair becomes frustrating).
The option we have chosen on en.wikipedia is to find a neutral landing page and whitelist that. Generally we ask for the about page, or similar. It would be better to do the reverse: exclude that page from the global blacklist, so that the exception is global. That avoids the root-spam, and most spammers and schoolkids will not figure out that they can use pornhub.com/about, which is less effective/offensive than the front page, and if they do, that is more reason for harsher sanctions. This is one of the frustrations that led me to write this. --Dirk Beetstra T C (en: U, T) 10:44, 5 January 2021 (UTC)
yes, an about page would be good. unfortunately i don't find such a page at aliexpress.com. afaics they don't even have an imprint. :-/
concerning pornhub: i excluded pornhub.com/information globally now from being blocked. -- seth (talk) 12:52, 6 January 2021 (UTC)
@Lustiger seth: Me neither, now that I look for it. But 'aliexpress.com/index.html' would serve the purpose here: not very obvious, and it takes you where you want. Note, for the tech-savvy among us, if you exclude the root, I can use every single document on aliexpress.com that I could possibly want to link to (and would be blocked in minutes by those admins who know where to look, but one could stay under the radar for a bit). --Dirk Beetstra T C (en: U, T) 13:01, 6 January 2021 (UTC)
gudn tach!
index.html -> good idea. i used that now.
but i don't understand your last sentence. -- seth (talk) 23:07, 6 January 2021 (UTC)
@Lustiger seth: there are ways around the blacklist, and with a rule like \baliexpress\.com\b. that becomes easier (and it allows, as you say, the spamming of the root). I will not go further per en:WP:BEANS. --Dirk Beetstra T C (en: U, T) 12:19, 7 January 2021 (UTC)
ok, i guess, i know what you mean. -- seth (talk) 20:39, 7 January 2021 (UTC)