Talk:Spam blacklist

From Meta, a Wikimedia project coordination wiki

{{autotranslate|base=Template:Spam blacklist header}}

<!-- {{none}} comment this out when false :) -->

==Proposed additions==
{{messagebox
|text=This section is for proposing that a website be blacklisted; add new entries at the ''bottom'' of the section, using the basic URL so that there is no link (example.com, not http://www.example.com). '''Provide links demonstrating widespread spamming by multiple users on multiple wikis.''' Completed requests will be marked as {{tl|added}} or {{tl|declined}} and archived.
}}
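As a quick illustration of the "basic URL" form requested above, here is a minimal sketch; the helper name and the www-stripping choice are our own assumptions, not part of the instructions:

```python
from urllib.parse import urlparse

def bare_domain(url: str) -> str:
    """Reduce a URL to the bare domain form used in blacklist requests.

    Illustrative helper only; the www-stripping rule is an assumption.
    """
    # urlparse only fills netloc when a scheme/netloc separator is present
    if "//" not in url:
        url = "//" + url
    host = urlparse(url, scheme="http").netloc.lower()
    # drop a leading "www." so the entry covers the whole domain
    if host.startswith("www."):
        host = host[4:]
    return host
```

For example, `bare_domain("http://www.example.com")` yields `example.com`, the unlinked form asked for above.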

===several shocksites/screamers===
* {{LinkSummary|1man1jar.com}}
* {{LinkSummary|findminecraft.com}}

1man1jar, the 1man1jar mirror (jarsquatter), and findminecraft (which is a screamer): these links are inappropriate for use. [[Special:Contributions/96.237.20.248|96.237.20.248]] 19:12, 17 January 2016 (UTC)
:{{rto|96.237.20.248}} Can you please paste the domain names in a LinkSummary template, so for the domain 'example.com' you would put them here (each on its own line) as {{tl|LinkSummary|example.com}}. Note that just being inappropriate is not necessarily sufficient reason to blacklist them; we'd need evidence of abuse as well. Thanks! --[[User:Beetstra|Dirk Beetstra]] <sup>[[User_Talk:Beetstra|<span style="color:#0000FF;">T</span>]] [[Special:Contributions/Beetstra|<span style="color:#0000FF;">C</span>]]</sup> (en: [[:en:User:Beetstra|U]], [[:en:User talk:Beetstra|T]]) 03:31, 18 January 2016 (UTC)

1man1jar.com and findminecraft.com (DO NOT CLICK); evidence of abuse is [https://en.wikipedia.org/wiki/Wikipedia:WikiProject_Spam/LinkReports/1man1jar.com here], and pretty much just that. Cross out findminecraft.

=== netflix spammer ===
* {{LinkSummary|facebook.com}}
* facebook.com/Netflixsteraming100

It may even be worth considering 'steraming' typo varieties. [[User:MER-C|Ping MER-C]]. --[[User:Beetstra|Dirk Beetstra]] <sup>[[User_Talk:Beetstra|<span style="color:#0000FF;">T</span>]] [[Special:Contributions/Beetstra|<span style="color:#0000FF;">C</span>]]</sup> (en: [[:en:User:Beetstra|U]], [[:en:User talk:Beetstra|T]]) 03:43, 13 March 2016 (UTC)
:I don't particularly care about spam on Facebook, even if it appears on Wikimedia sites. Please report it if you have an account (I don't) as links to pirated TV shows are likely against their TOS. [[User:MER-C|MER-C]] ([[User talk:MER-C|talk]]) 08:35, 14 March 2016 (UTC)
:{{rto|MER-C}} I don't care about what is on Facebook either (I do have an account), but if links to Facebook get spammed here, like with this link, maybe the whole path could go on the blacklist. I pinged you at first because of the 'typosquatting': steraming. Is it worth just blocking that whole word and taking out more than just this? --[[User:Beetstra|Dirk Beetstra]] <sup>[[User_Talk:Beetstra|<span style="color:#0000FF;">T</span>]] [[Special:Contributions/Beetstra|<span style="color:#0000FF;">C</span>]]</sup> (en: [[:en:User:Beetstra|U]], [[:en:User talk:Beetstra|T]]) 08:39, 14 March 2016 (UTC)

* {{IPSummary|206.54.180.4}}
* {{IPSummary|180.253.255.68}}
* {{IPSummary|103.27.222.89}}
* {{IPSummary|36.84.67.241}}

Some IPs that XLinkBot caught. --[[User:Beetstra|Dirk Beetstra]] <sup>[[User_Talk:Beetstra|<span style="color:#0000FF;">T</span>]] [[Special:Contributions/Beetstra|<span style="color:#0000FF;">C</span>]]</sup> (en: [[:en:User:Beetstra|U]], [[:en:User talk:Beetstra|T]]) 08:43, 14 March 2016 (UTC)

Hmm .. also permalinks into facebook .. https://en.wikipedia.org/w/index.php?title=Ex_on_the_Beach_(series_4)&diff=prev&oldid=709006958 .. difficult to weed out and block all of those. --[[User:Beetstra|Dirk Beetstra]] <sup>[[User_Talk:Beetstra|<span style="color:#0000FF;">T</span>]] [[Special:Contributions/Beetstra|<span style="color:#0000FF;">C</span>]]</sup> (en: [[:en:User:Beetstra|U]], [[:en:User talk:Beetstra|T]]) 08:47, 14 March 2016 (UTC)

=== browse-read.com ===
* {{LinkSummary|browse-read.com}}

Stumbled across this in an OTRS complaint (Ticket#2016030310023685), as someone had removed a link to a seemingly valid page. Searching a little to see if the site was legit, I found that only some pages at <code>site:browse-read.com</code> were available, and searching some more I found that some of the pages were the usual "hot pics" and not only "hot vegetables". The hot pics seem to have disappeared since. I removed the link at [https://no.wikipedia.org/w/index.php?title=Carolina_Reaper&type=revision&diff=15875136&oldid=15834410 nowiki], but was reverted at [https://en.wikipedia.org/w/index.php?title=Naga_Viper_pepper&type=revision&diff=709619226&oldid=708240427 enwiki]. I removed one too many links at enwiki, sorry.

Looking at the site, I wonder if this is an up-and-coming link farm that has a legit front page and only lets users navigate to other legit pages. It could also be an old domain that is being reused for a new site, which would explain the strange hits in search engines. It could be interesting to identify what's really going on with this site, but I don't have the time for further investigation. — ''[[User:Jeblad|Jeblad]]'' 14:49, 13 March 2016 (UTC)

===seosprint.net===
{{spamlink|seosprint.net}}

An SEO site whose working method relies on bringing in referrals. The site does not contain any useful information, but there is a risk of spam links being added. --[[User:Максим Підліснюк|Максим Підліснюк]] ([[User talk:Максим Підліснюк|talk]]) 01:30, 11 May 2016 (UTC)

:{{Ping|Максим Підліснюк}} Hi, as far as I can see this is a single-wiki issue, so please [[w:ru:Википедия:Изменение спам-листа|request local blacklisting]] first. Regards.--[[User:Syum90|Syum90]] ([[User talk:Syum90|talk]]) 16:23, 10 August 2016 (UTC)

=== bilder-hamburg.info ===

{{linksummary|bilder-hamburg.info}}
* Spammed on: many different (also smaller) Wikipedias (maybe also WikiVoyage?)
* Topic: tourist features in Germany (and especially Hamburg)
* no user name
* many different dynamic IP addresses. Examples:
{{ipsummary|2003:62:5F45:DF11:591E:FB77:FCB4:AD02}}
{{ipsummary|2003:62:5F45:DF11:A0F0:ED88:4B18:C5C6}}
{{ipsummary|2003:62:5F5F:264C:5894:F19A:737B:1B84}}
There are plenty of pictures of those POIs at Wikimedia Commons. We do not need to link extensively to a single website for such common images. -- [[Special:Contributions/77.6.13.233|77.6.13.233]] 21:14, 28 April 2016 (UTC)

:While I understand why this has been brought here, I am loath to act when the individual wikis have not. The domain has not been added to any blacklists, and the cross-wiki linking is along the lines of <code>279 records; Top 10 wikis where bilder-hamburg.info has been added: w:en (23), w:zh (9), w:sv (8), w:ru (8), w:ja (7), w:pt (6), w:ko (6), w:nl (5), w:no (5), w:tr (5).</code> I would like to see a larger commentary, or a clear indication that there is abuse, by seeing the wikis removing these links. &nbsp;— [[user:billinghurst|billinghurst]] ''<span style="font-size:smaller">[[user talk:billinghurst|sDrewth]]</span>'' 15:05, 23 May 2016 (UTC)

=== translate.google.[a-z]{2,5}/translate und translate.googleusercontent.com/translate ===
* {{LinkSummary|translate.google.com}}
* {{LinkSummary|translate.googleusercontent.com}}

Hi!<br />
There have already been several discussions on this topic (see [[Talk:Spam_blacklist/Archives/2008-02#WebWarper]], [[Talk:Spam_blacklist/Archives/2013-04#Google_Translate_as_a_universal_URL_redirector]]), but I think it won't do any harm to discuss it again. :-)<br />
As [[w:de:MediaWiki_Diskussion:Spam-blacklist#translate.google..5B.5E.5C.2F.5D.7B2.2C5.7D.2Ftranslate_und_translate.googleusercontent.com.2Ftranslate|demonstrated at dewiki]], it is possible to use the Google translator to circumvent the SBL. user:Boshomi stated that in at least 2 cases the Google translator was used with blacklisted URLs. Still, I'm not sure whether the usefulness of such translations on talk pages outweighs the risk of SBL circumvention. I guess that at least in the dewiki main namespace the Google translations are unwanted.<br />
Maybe in the future a MediaWiki extension for automatic translations should be developed that does not depend on just one (Google) translator. -- [[User:Lustiger seth|seth]] ([[User talk:Lustiger seth|talk]]) 06:40, 15 June 2016 (UTC)

:{{rto|Lustiger seth}} Our general rule on URL shorteners is to blacklist on sight (even before use occurs). IMHO, there is no reason ever to link to these services in mainspace (you link to the data, and then click 'translate' yourself if that is necessary). I would get a bit cross if I were automatically redirected to an English translation of a document in Dutch, German, Frisian, Italian, etc. It is not up to the editor to decide which translation I should use, and if they used a translation to reference an article, then that is a dangerous practice in itself. And actually that is similar outside of mainspace. I think this could be blacklisted similarly to the /url? link of Google search results pages. --[[User:Beetstra|Dirk Beetstra]] <sup>[[User_Talk:Beetstra|<span style="color:#0000FF;">T</span>]] [[Special:Contributions/Beetstra|<span style="color:#0000FF;">C</span>]]</sup> (en: [[:en:User:Beetstra|U]], [[:en:User talk:Beetstra|T]]) 11:05, 15 June 2016 (UTC)

=== state.gift ===
* {{LinkSummary|state.gift}}
A copy of Wikipedia, but it ignores the terms of the licences. [[:de:MediaWiki_Diskussion:Spam-blacklist#state.gift]]. There is no useful usage for any article in any language. [[User:Boshomi|Boshomi]] ([[User talk:Boshomi|talk]]) 20:56, 1 August 2016 (UTC)

=== qoo.by ===
: {{LinkSummary|qoo.by}}
: Url shortener. [[User:Track13|Track13]] <sup>[[:ru:User Talk:Track13|0_o]] </sup> 15:33, 8 August 2016 (UTC)

:{{rto|Track13}} {{Added}} to [[Spam blacklist]]. --[[User:Syum90|Syum90]] ([[User talk:Syum90|talk]]) 15:52, 8 August 2016 (UTC)

===testosteronesboosterweb.com===
{{spamlink|testosteronesboosterweb.com}}

Spambot ([[w:Special:Undelete/User:Lanayica/sandbox]]). [[User:MER-C|MER-C]] ([[User talk:MER-C|talk]]) 10:48, 10 August 2016 (UTC)

:{{rto|MER-C}} {{Added}} to [[Spam blacklist]]. [[User:Syum90|Syum90]] ([[User talk:Syum90|talk]]) 10:56, 10 August 2016 (UTC)

===url.org===
{{spamlink|url.org}}
Abused directory-type listing and url shortener &nbsp;— [[user:billinghurst|billinghurst]] ''<span style="font-size:smaller">[[user talk:billinghurst|sDrewth]]</span>'' 04:36, 11 August 2016 (UTC)

:{{Added}} &nbsp;— [[user:billinghurst|billinghurst]] ''<span style="font-size:smaller">[[user talk:billinghurst|sDrewth]]</span>'' 04:40, 11 August 2016 (UTC)

===eyeluminoushelps.com===
{{spamlink|eyeluminoushelps.com}}

Spambot ([[w:Special:Undelete/User:Bonwasdfita]]). [[User:MER-C|MER-C]] ([[User talk:MER-C|talk]]) 07:11, 11 August 2016 (UTC)

:{{rto|MER-C}} {{Added}} to [[Spam blacklist]]. --[[User:Syum90|Syum90]] ([[User talk:Syum90|talk]]) 08:23, 11 August 2016 (UTC)

== Proposed additions (Bot reported) ==
{{messagebox
|text='''This section is for domains which have been added to multiple wikis as observed by a bot.'''

These are automated reports, please check the records and the link thoroughly, it may report good links! For some more info, see [[Spam blacklist/Help#COIBot_reports]]. Reports will automatically be archived by the bot when they get [[:Category:Stale XWiki reports|stale]] (less than 5 links reported, which have not been edited in the last 7 days, and where the last editor is [[User:COIBot|COIBot]]).

;Sysops
*If the report contains links to ''less than 5'' wikis, then only add it when it is really spam
*Otherwise just revert the link-additions, and close the report; closed reports will be reopened when spamming continues
*To close a report, change the LinkStatus template to closed ({{tlx|LinkStatus|closed}})
*Please place any notes in the discussion section ''below the HTML comment''
}}

===[[User:COIBot/XWiki|COIBot]]===
The LinkWatchers report domains meeting the following criteria:
* When a user mainly adds this link, and the link has not been used too much, and this user adds the link to more than 2 wikis
* When a user mainly adds links on one server, and links on the server have not been used too much, and this user adds the links to more than 2 wikis
* If ALL links are added by IPs, and the link is added to more than 1 wiki
* If a small range of IPs have a preference for this link (but it may also have been added by other users), and the link is added to more than 1 wiki.
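The criteria above can be sketched as a simple predicate. All field names and numeric thresholds below are illustrative assumptions, not COIBot's actual internals, and the server-level variant of the second criterion is omitted for brevity:

```python
def should_report(stats: dict) -> bool:
    """Illustrative sketch of the XWiki reporting criteria above.

    The dict keys and the 0.5/100 thresholds are assumptions made
    for this sketch; COIBot's real schema is not documented here.
    """
    mainly_one_user = stats["top_user_share"] > 0.5  # one user "mainly" adds it
    lightly_used = stats["total_additions"] < 100    # "not been used too much"

    # Criterion 1: one main adder, lightly used link, added to more than 2 wikis
    if mainly_one_user and lightly_used and stats["wikis_by_top_user"] > 2:
        return True
    # Criterion 3: ALL additions made by IPs, link added to more than 1 wiki
    if stats["all_additions_by_ips"] and stats["wiki_count"] > 1:
        return True
    # Criterion 4: a small IP range prefers this link, added to more than 1 wiki
    if stats["ip_range_share"] > 0.5 and stats["wiki_count"] > 1:
        return True
    return False
```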

{{-}}{{hidden|1=COIBot's currently open XWiki reports|2={{User:COIBot/XWiki}}}}{{-}}

== Proposed removals ==
{{messagebox
|text=This section is for proposing that a website be ''un''listed; '''please add new entries at the ''bottom'' of the section'''.

Remember to provide the specific domain blacklisted, links to the articles they are used in or useful to, and arguments in favour of unlisting. Completed requests will be marked as {{tlx|removed}} or {{tlx|declined}} and archived.

See also [[Spam blacklist/recurring requests|/recurring requests]] for repeatedly proposed (and refused) removals.

Notes:
* The addition or removal of a domain from the blacklist is '''not a vote'''; please do not bold the first words in statements.
* This page is for the removal of domains from the global blacklist, not for removal of domains from the blacklists of individual wikis. For those requests please take your discussion to the pertinent wiki, where such requests would be made at '''Mediawiki talk:Spam-blacklist''' at that wiki. [[toollabs:searchsbl|Search spamlists]] {{smaller|— remember to enter any relevant language code}}
}}
<!--NEW REQUESTS GO AT THE BOTTOM-->
=== tollyspice.com ===
{{LinkSummary|tollyspice.com}}
I don't know why I am not able to add this domain to Wikipedia references, while I am able to add links from all other websites. Please remove the block, I request!
:Not blacklisted on Meta - it would appear to be a local (en wp) issue. You should request removal there. Thanks --[[User:Herbythyme|<font color="green">Herby</font>]] <b><sup><small><span style="color:#90F">[[User talk:Herbythyme|talk thyme]]</span></small></sup></b> 14:02, 8 April 2016 (UTC)

:{{Declined|Nothing to do}}, not globally blacklisted. It is blacklisted at en.wp per [https://en.wikipedia.org/w/index.php?title=MediaWiki_talk:Spam-blacklist&oldid=712354436#tollyspice.com this request]. As commented by Herby, you should request removal there. [[User:Syum90|Syum90]] ([[User talk:Syum90|talk]]) 09:37, 26 April 2016 (UTC)

=== mpoc.org.my ===
* {{LinkSummary|mpoc.org.my}}

The link is the official website of the Malaysian Palm Oil Council. The council actually falls under the [[:w:Ministry of Plantation Industries and Commodities (Malaysia)]]. May I know why mpoc.org.my was blacklisted? I kindly hope the link can be removed from the blacklist. [[User:Alexander Iskandar|Alexander Iskandar]] ([[User talk:Alexander Iskandar|talk]]) 12:27, 21 March 2016 (UTC)

:This is not blacklisted here but on local Wikis (presumably they had reason) so cannot be removed from here. Local Wiki requests would be needed. --[[User:Herbythyme|<font color="green">Herby</font>]] <b><sup><small><span style="color:#90F">[[User talk:Herbythyme|talk thyme]]</span></small></sup></b> 14:05, 8 April 2016 (UTC)

:{{Declined|Nothing to do}}, not globally blacklisted. It is blacklisted at several wikis as you can see [[User:COIBot/LinkReports/mpoc.org.my#Links|here]]. As commented by Herby, you should request removal there. [[User:Syum90|Syum90]] ([[User talk:Syum90|talk]]) 09:46, 26 April 2016 (UTC)

=== bitcointalk.org ===
{{LinkSummary|bitcointalk.org}}
The most popular English-language bitcoin forum. It appears to have been added to the blacklist due to spam, but it contains much information about bitcoin and altcoin history, and I would consider it a good source because it is the main publishing venue for many of the official developments of alternative cryptocurrencies. Links have been requested for whitelisting multiple times on the English Wikipedia (e.g. [[:w:Dogecoin]], [[:w:Litecoin]]).{{Unsigned|Liance|01:19 26 March 2016 (UTC)}}

I see no reason for this website to be blocked. A partial match was also made for http://web.archive.org/web/20131016000457/https://bitcointalk(dot)org/index.php?topic=822.msg9519. The page concerned is History of Bitcoin on en.wiki, which says the domain is on the global blacklist.{{Unsigned|Kernosky|13:58 24 April 2016 (UTC)}}

: I too was confused by this. Apparently all forum links are blacklisted by default, from what I see in previous discussions of bitcointalk(dot)org... I think there were also spam problems from that site. I am trying to get a specific link removed from the blacklist here: [https://en.wikipedia.org/wiki/MediaWiki_talk:Spam-whitelist#Quote_by_Andreas_Antonopoulos_on_Bitcoin_Talk Quote by Andreas Antonopoulos on Bitcoin Talk]. Will see how that goes. –[[User:JonathanCross|JonathanCross]] ([[User talk:JonathanCross|talk]]) 15:18, 25 April 2016 (UTC)
::You might be better off seeking whitelisting of individual or component URLs. Generally forums are not authoritative, and spammed forum URLs are quite problematic. Whitelists at local wikis are provided for exactly this reason: to allow exceptions to blocked domains. &nbsp;— [[user:billinghurst|billinghurst]] ''<span style="font-size:smaller">[[user talk:billinghurst|sDrewth]]</span>'' 05:21, 23 May 2016 (UTC)

=== tinapa.com.ng ===
{{LinkSummary|tinapa.com.ng}}
I don't understand why this website was blocked, but it is the official website of the [[:w:Tinapa Resort|Tinapa project]] and the block prevents me from adding the website as an official website to one of the pages that I'm working on: [[:w:Tinapa Shopping Complex|Tinapa Shopping Complex]].--[[User:Jamie Tubers|Jamie Tubers]] ([[User talk:Jamie Tubers|talk]]) 00:53, 20 April 2016 (UTC)
:It is blocked as it was being spammed by spambots. If you think that it should be added at English Wikipedia, then please apply for a whitelist, or partial whitelist at [[w:Mediawiki talk:Spam-whitelist]] &nbsp;— [[user:billinghurst|billinghurst]] ''<span style="font-size:smaller">[[user talk:billinghurst|sDrewth]]</span>'' 05:09, 23 May 2016 (UTC)

=== borgenproject.org ===
{{LinkSummary|borgenproject.org}}
This is a respected nonprofit organization that has been in operation for 14 years. They work at the political level combating global poverty, and I'm guessing they were flagged by someone with differing views. The organization has thousands of supporters, and there has been lots of positive media coverage of the organization in The Seattle Times, the Huffington Post and other media outlets.<small>—The preceding [[Help:Signature|unsigned]] comment was added by [[User:Madisonkoene|Madisonkoene]] ([[User talk:Madisonkoene|{{int:Talkpagelinktext}}]]) 20:02, 19 July 2016‎</small><!--added with Template:Unsigned-->
:{{rto|Madisonkoene}} wrong, please do not suggest inappropriate behaviour by requesters and/or blacklisting admins if you are not aware of the history. This was plainly spammed using different accounts and IPs on mainly en.wikipedia (but with a small cross-wiki aspect to it):
:*{{IPSummary|71.231.50.116}}
:*{{IPSummary|71.35.167.60}}
:*{{IPSummary|160.69.1.135}}
:*{{IPSummary|63.231.24.172}}
:*{{IPSummary|67.168.41.152}}
:*{{IPSummary|74.61.9.114}}
:*{{IPSummary|74.61.43.100}}
:*{{UserSummary|Bada12}}
:It is from the edits quite clear that someone was trying to give this organisation more exposure.
:Now, that being said, this has been on the list for 8 1/2 years.... can you indicate what the ''use'' of this link is to Wikipedia, and whether that use would be sufficient to consider de-blacklisting over selective whitelisting of a few links/domains? --[[User:Beetstra|Dirk Beetstra]] <sup>[[User_Talk:Beetstra|<span style="color:#0000FF;">T</span>]] [[Special:Contributions/Beetstra|<span style="color:#0000FF;">C</span>]]</sup> (en: [[:en:User:Beetstra|U]], [[:en:User talk:Beetstra|T]]) 05:33, 20 July 2016 (UTC)

=== rocketlanguages.com ===
{{LinkSummary|rocketlanguages.com}}
We are the owners of RocketLanguages.com, which has been globally blacklisted. Please see our Wikipedia page: https://en.wikipedia.org/wiki/Rocket_Languages_(software). We would like to request removal from this list, as we are a reliable source for language learning. We seem to have been blacklisted because of the site "url9.de", with which we do not have any current affiliation.

== Troubleshooting and problems ==
{{messagebox
|text=This section is for comments related to problems with the blacklist (such as incorrect syntax or entries not being blocked), or problems saving a page because of a blacklisted link. This is ''not'' the section to request that an entry be unlisted (see '''Proposed removals''' above).
}}
=== derefer.unbubble.eu deblock ===
* {{LinkSummary|unbubble.eu}}
* {{LinkSummary|derefer.unbubble.eu}}

This domain is used '''[https://tools.wmflabs.org/giftbot/weblinksuche.fcgi?target=http://derefer.unbubble%25&namespace=0 24,923 times in the main namespace]''' on dewiki! It is used to clean up Special:Linksearch from known dead links by redirecting them through this service. It is hard to find a better solution for this task. --[[User:Boshomi|Boshomi]] ([[User talk:Boshomi|talk]]) 16:38, 24 July 2015 (UTC)
Ping: [[User:Billinghurst]] --[[User:Boshomi|Boshomi]] ([[User talk:Boshomi|talk]]) 16:49, 24 July 2015 (UTC)
: Please note [[Phab:T89586]]: while it is not fixed, it is not possible to find these links with the standard Special:LinkSearch. On dewiki we can use [[Wikipedia:LT/giftbot/weblinksuche|giftbot/Weblinksuche]] instead.--[[User:Boshomi|Boshomi]] ([[User talk:Boshomi|talk]]) 18:04, 24 July 2015 (UTC)
::afaics derefer.unbubble.eu could be used to circumvent the SBL, is that correct? -- [[User:Lustiger seth|seth]] ([[User talk:Lustiger seth|talk]]) 21:30, 24 July 2015 (UTC)
:::I don't think so; the redirected URL is unchanged, so the SBL works on it just like it does on archive URLs pointing to the Internet Archive. --[[User:Boshomi|Boshomi]] ([[User talk:Boshomi|talk]]) 07:44, 25 July 2015 (UTC)
::::It is not a stored/archived page at archive.org, it is a redirect service as clearly stated at the URL and in that it obfuscates links. To describe it in any other way misrepresents the case, whether deWP uses it for good or not. We prevent abuseable redirects from other services due to the potential for abuse. You can consider whitelisting the URL in [[w:de:MediaWiki:spam-whitelist]] if it is a specific issue for your wiki. &nbsp;— [[user:billinghurst|billinghurst]] ''<span style="font-size:smaller">[[user talk:billinghurst|sDrewth]]</span>'' 10:09, 25 July 2015 (UTC)
::::: What I wanted to say is that the SBL mechanism works in the same way as with web.archive.org/web: a URL that is blocked will still be blocked when the unbubble prefix is put in front of it.--[[User:Boshomi|Boshomi]] ([[User talk:Boshomi|talk]]) 12:54, 25 July 2015 (UTC)

=== Unblocking YouTube's redirection and nocookie domains ===
* {{LinkSummary|youtube.com ...}}
<code>\byoutube\.com/.*(?:tqedszqxxzs|XePjp-H3TBI|khM48EQyVdc|A4jgXQQns8A|oVBOnv\-xrEY)\b</code>

Just out of curiosity I checked this list, and the entries seem to be pretty obsolete:

https://www.voutube.com/watch?v=tqedszqxxzs Dieses Video ist nicht verfügbar.

https://www.voutube.com/watch?v=XePjp-H3TBI vitruvian man 1 - Leo's it text.wmv

https://www.voutube.com/watch?v=khM48EQyVdc Dieses Video ist nicht verfügbar. (2011 Hunter Mariner)

https://www.voutube.com/watch?v=A4jgXQQns8A Unknown and 44" Hunter Baker Street ceiling fans

https://www.voutube.com/watch?v=oVBOnv-xrEY 48" Hunter Summer Breeze ceiling fan
''(^-replace voutube with youtube)''
So could someone with write access please remove the whole line (as well as this entry here)?
It is also pretty strange how these came to be on the list. Are these just example entries?
[[User:Djamana|Djamana]] ([[User talk:Djamana|talk]]) 23:52, 2 February 2016 (UTC)
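For reference, the entry quoted at the top of this section behaves as follows. This is a rough sketch that compiles the entry on its own; the actual SpamBlacklist extension embeds entries in a larger combined, case-insensitive pattern matched against URLs:

```python
import re

# The blacklist entry from this section, compiled standalone for illustration;
# the real extension wraps entries in a combined case-insensitive regex.
rule = re.compile(
    r"\byoutube\.com/.*"
    r"(?:tqedszqxxzs|XePjp-H3TBI|khM48EQyVdc|A4jgXQQns8A|oVBOnv\-xrEY)\b",
    re.IGNORECASE,
)

def blocked(url: str) -> bool:
    """True if the URL matches the blacklist entry above."""
    return rule.search(url) is not None
```

So only those five video IDs are caught; any other youtube.com link passes this particular rule.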
* {{LinkSummary|youtu.be}}
* {{LinkSummary|youtube-nocookie.com}}
* --- Note past closure - a vandal kept posting pictures and linking to videos of ceiling fans - that's why the bottom two are blocked. [[User:Kernosky|Kernosky]] ([[User talk:Kernosky|talk]]) 13:58, 24 April 2016 (UTC)
Apparently youtu(dot)be and youtube-nocookie(dot)com, both of which are official YouTube domains owned by Google, are on this blacklist. [https://phabricator.wikimedia.org/rESPB31e30af1b21531f38509122c666d06a99f171a27 For over ten years], the [[mw:Extension:SpamBlacklist|SpamBlacklist MediaWiki extension]] has loaded this blacklist on third-party wikis, big and small. This is quite an issue for third-party sites such as [http://www.shoutwiki.com/ ShoutWiki], a wiki farm, since SpamBlacklist doesn't currently have the concept of "shared" whitelists — blacklists can be shared (loaded from a remote wiki), whitelists cannot. Given that the main YouTube domain isn't blocked, and also that YouTube itself hands out youtu(dot)be links, I don't think that "[https://meta.wikimedia.org/w/index.php?title=Talk:Spam_blacklist&diff=next&oldid=4502570 but it's a redirecting service]" is a valid argument against it, and therefore I'd like to propose removing these two entries from the blacklist. --[[User:Jack Phoenix|Jack Phoenix]] <sub>([[User talk:Jack Phoenix|Contact]])</sub> 23:17, 29 August 2015 (UTC)

::There are several YouTube links blacklisted here on Meta, as well as many, many on local wikis. YouTube has videos that get spammed, and there are videos that should simply not be linked to. Leaving the redirects open means that not only the youtube.com link needs to be blacklisted, but also all redirects to those links. That gives either extra work to the blacklisting editors, or leaves an easy back door open. On wikis it leaves more material to check. Add to that the fact that redirect services are simply never needed; there is always an alternative. Additionally, Wikipedia has its built-in redirect service which also works (I mean templates, like {{tl|youtube}}).
::The fact that there is no Meta analogue of the whitelist is a good argument to push through that years-old request to revamp the spam blacklist system and have the developers focus on features that the community wants; it is certainly not an argument for me to consider not blacklisting something. Moreover, I do not think the argument that it hampers third-party wikis holds either: they choose to use this blacklist, and they could alternatively set up their own 'meta blacklist', copy-pasting this blacklist and removing what they do not want/need.
::The problem exists internally as well: certain of our wiki farms do allow certain spam which is inappropriate on the rest of the wiki farms, and on the majority by far (in wiki volume) of the wikis. That also needs a rewriting of the spam blacklist system, which is crude and too difficult. A lightweight edit-filter variety, specialised for this, would be far more suitable. --[[User:Beetstra|Dirk Beetstra]] <sup>[[User_Talk:Beetstra|<span style="color:#0000FF;">T</span>]] [[Special:Contributions/Beetstra|<span style="color:#0000FF;">C</span>]]</sup> (en: [[:en:User:Beetstra|U]], [[:en:User talk:Beetstra|T]]) 04:05, 30 August 2015 (UTC)
:*'''Oppose''' unblocking for the reasons given above. [[User:Stifle|Stifle]] ([[User talk:Stifle|talk]]) 08:32, 21 October 2015 (UTC)

::{{decline}} &nbsp;— [[user:billinghurst|billinghurst]] ''<span style="font-size:smaller">[[user talk:billinghurst|sDrewth]]</span>'' 06:22, 24 January 2016 (UTC)
::::youtu.be can only be used for youtube.com, so it is not a general redirecting service; please remove it from the blacklist. If you need to block a certain YouTube video (which I btw consider a little stupid), just update that rule to include both youtube.com and youtu.be, and that's it.
::::[[User:Djamana|Djamana]] ([[User talk:Djamana|talk]]) 20:08, 2 February 2016 (UTC)
:::::{{rto|Djamana}} Why do you consider that blocking of a specific YouTube video a little stupid? --[[User:Beetstra|Dirk Beetstra]] <sup>[[User_Talk:Beetstra|<span style="color:#0000FF;">T</span>]] [[Special:Contributions/Beetstra|<span style="color:#0000FF;">C</span>]]</sup> (en: [[:en:User:Beetstra|U]], [[:en:User talk:Beetstra|T]]) 04:11, 3 February 2016 (UTC)
::::::{{rto|Djamana}} I did not see the above list earlier. Of the 5 on the Meta spam blacklist there are still 3 active videos. Those 5 were abused, and for the one case I was involved in (still active), it was pretty persistent promotion. I doubt that these need to be removed. The two that are specifically no longer available could indeed be removed (or maybe they need to be corrected), which still leaves 3. Moreover, these are not the only rules blocking YouTube; many individual wikis also have specific YouTube videos blacklisted. YouTube ''can'' be used to earn money (and such spammers are known to circumvent the blacklist; even regulars do!), and there is information there that simply should NEVER be linked to. Again {{declined}}. --[[User:Beetstra|Dirk Beetstra]] <sup>[[User_Talk:Beetstra|<span style="color:#0000FF;">T</span>]] [[Special:Contributions/Beetstra|<span style="color:#0000FF;">C</span>]]</sup> (en: [[:en:User:Beetstra|U]], [[:en:User talk:Beetstra|T]]) 06:28, 14 April 2016 (UTC)


=== Partial matches: <change.org> blocks <time-to-change.org.uk> ===
* {{LinkSummary|change.org}}
* {{LinkSummary|time-to-change.org.uk}}

I tried to add a link to <time-to-change.org.uk>, and was told that I couldn't add the link, as <change.org> was blacklisted. Is this partial-match blacklisting (based, I guess, on an incorrect interpretation of URL specifications) a known bug? Cheers. --<span style="text-shadow:grey 0.15em 0.15em 0.1em">[[User:Yodin|Yodin]]</span><span style="text-shadow:grey 0.25em 0.25em 0.12em"><sup>[[User talk:Yodin|T]]</sup></span> 15:46, 21 October 2015 (UTC)
:This is more of a limitation of the regex; we tend to blacklist '\bchange\.org\b', but a '-' also counts as a 'word end' (the \b). I'll see if I can adapt the rule. --[[User:Beetstra|Dirk Beetstra]] <sup>[[User_Talk:Beetstra|<span style="color:#0000FF;">T</span>]] [[Special:Contributions/Beetstra|<span style="color:#0000FF;">C</span>]]</sup> (en: [[:en:User:Beetstra|U]], [[:en:User talk:Beetstra|T]]) 07:46, 22 October 2015 (UTC)
:change.org is not here, it is on en.wikipedia. That needs to be requested locally and then resolved there. --[[User:Beetstra|Dirk Beetstra]] <sup>[[User_Talk:Beetstra|<span style="color:#0000FF;">T</span>]] [[Special:Contributions/Beetstra|<span style="color:#0000FF;">C</span>]]</sup> (en: [[:en:User:Beetstra|U]], [[:en:User talk:Beetstra|T]]) 07:48, 22 October 2015 (UTC)
::Thanks for looking into this; is it worth replacing the regexes globally to fit URL specs? I'm sure I'm not the only one who will ever be/have been affected. --<span style="text-shadow:grey 0.15em 0.15em 0.1em">[[User:Yodin|Yodin]]</span><span style="text-shadow:grey 0.25em 0.25em 0.12em"><sup>[[User talk:Yodin|T]]</sup></span> 11:27, 22 October 2015 (UTC)
:::{{rto|Yodin}} Sorry, but there are no global regexes to replace, change.org is only blacklisted on en.wikipedia. You'll have to request a change on [[:en:MediaWiki talk:Spam-blacklist]] (so there is a local request to do the change, then I or another en.wikipedia admin will implement it there). --[[User:Beetstra|Dirk Beetstra]] <sup>[[User_Talk:Beetstra|<span style="color:#0000FF;">T</span>]] [[Special:Contributions/Beetstra|<span style="color:#0000FF;">C</span>]]</sup> (en: [[:en:User:Beetstra|U]], [[:en:User talk:Beetstra|T]]) 11:38, 22 October 2015 (UTC)
::::Thanks Dirk; just read this (sorry for the repeat on regexes there!). Isn't the main blacklist here also using '\bexample\.com\b'? I can come up with the general case regex if you like! --<span style="text-shadow:grey 0.15em 0.15em 0.1em">[[User:Yodin|Yodin]]</span><span style="text-shadow:grey 0.25em 0.25em 0.12em"><sup>[[User talk:Yodin|T]]</sup></span> 11:44, 22 October 2015 (UTC)
:::::You mean excluding the '<prefix>-' case for every rule (i.e. putting '(?<!-)' before every rule in the list)? Well, some of them are meant to catch all '<blah>-something.com' sites, so that is difficult. And then there are other combinations which sometimes catch as well. It is practically impossible to rule out every false positive. --[[User:Beetstra|Dirk Beetstra]] <sup>[[User_Talk:Beetstra|<span style="color:#0000FF;">T</span>]] [[Special:Contributions/Beetstra|<span style="color:#0000FF;">C</span>]]</sup> (en: [[:en:User:Beetstra|U]], [[:en:User talk:Beetstra|T]]) 12:01, 22 October 2015 (UTC)
::::::I see... much more complicated in practice than I thought. My idea was to apply it to a wider class of false positives, including the '<prefix>-' rule and more, by replacing "\b" with a regex rule which covers all and only the [https://tools.ietf.org/html/rfc3986#section-2.3 unreserved URI characters] (upper & lowercase letters, decimal digits, hyphen, underscore, and tilde; with "dots" used in practice as delimiters). But this wouldn't cover the '<blah>-something.com' examples you gave, and having read some of the maintenance thread below which covers false positives, I won't try to press the issue! Maybe one day? Until then, I hope this goes well! Cheers for your work! --<span style="text-shadow:grey 0.15em 0.15em 0.1em">[[User:Yodin|Yodin]]</span><span style="text-shadow:grey 0.25em 0.25em 0.12em"><sup>[[User talk:Yodin|T]]</sup></span> 12:26, 22 October 2015 (UTC)
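A minimal Python sketch of both halves of this exchange (the rule and hostnames are only illustrative): why '\b' fires right after a hyphen, and how the proposed "no adjacent RFC 3986 unreserved character" condition would avoid that false positive.

```python
import re

# '-' is not a "word" character, so \b matches directly after a hyphen:
# a hypothetical rule '\bchange\.org\b' therefore also hits "name-change.org".
assert re.search(r"\bchange\.org\b", "http://name-change.org/")

# The idea above: require that no RFC 3986 "unreserved" character
# (A-Z a-z 0-9 - . _ ~) directly precedes or follows the match.
UNRESERVED = "A-Za-z0-9._~-"
tighter = re.compile(rf"(?<![{UNRESERVED}])change\.org(?![{UNRESERVED}])")

assert tighter.search("http://name-change.org/") is None  # false positive gone
assert tighter.search("http://change.org/petitions")      # real link still caught
```

As noted in the thread, this would not help with rules that are deliberately wide ('<blah>-something.com'), so it is a per-rule tightening, not a drop-in replacement for every '\b'.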
:::::::{{rto|Yodin}} If the foundation finally decides that it is time to solve some old bugzilla requests (over other developments which sometimes find fierce opposition), and among those the ones regarding overhaul of the spam-blacklist system, then this would be nice 'feature requests' of that overhaul. In a way, stripping down the edit-filter to pure regex matching 'per rule', with some other options added (having a regex being applied to one page or set of pages; having the regex being excluded on one page only, having the whitelist requests being added to the blacklist rule they affect, whitelisting on one page or set of pages, etc. etc.) would be a great improvement to this system. --[[User:Beetstra|Dirk Beetstra]] <sup>[[User_Talk:Beetstra|<span style="color:#0000FF;">T</span>]] [[Special:Contributions/Beetstra|<span style="color:#0000FF;">C</span>]]</sup> (en: [[:en:User:Beetstra|U]], [[:en:User talk:Beetstra|T]]) 14:25, 23 October 2015 (UTC)
:{{closing}} nothing to do, a block at enWP, nothing global. &nbsp;— [[user:billinghurst|billinghurst]] ''<span style="font-size:smaller">[[user talk:billinghurst|sDrewth]]</span>'' 09:45, 22 November 2015 (UTC)

=== non-ascii are not blocked? ===
* {{LinkSummary|казино-форум.рф}}

I saw <code>\bказино-форум\.рф\b</code> in the page, so it is supposed to be blocked. However, I can still link to it: http://казино-форум.рф
It seems that all non-ASCII links can evade the blacklist.

On the Thai Wikipedia (where I am an admin), there are a lot of Thai URLs that we want to put on the local blacklist, but we cannot for the very same reason. --[[User:Nullzero|Nullzero]] ([[User talk:Nullzero|talk]]) 17:42, 18 February 2016 (UTC)
:This should go to [[Phab:]] quickly - that is a real issue. --[[User:Beetstra|Dirk Beetstra]] <sup>[[User_Talk:Beetstra|<span style="color:#0000FF;">T</span>]] [[Special:Contributions/Beetstra|<span style="color:#0000FF;">C</span>]]</sup> (en: [[:en:User:Beetstra|U]], [[:en:User talk:Beetstra|T]]) 05:52, 21 February 2016 (UTC)

:: {{ping|Beetstra}} Please see [[Phab:T28332]]. It seems that you need to put <code>\xd0\xba\xd0\xb0\xd0\xb7\xd0\xb8\xd0\xbd\xd0\xbe-\xd1\x84\xd0\xbe\xd1\x80\xd1\x83\xd0\xbc\.\xd1\x80\xd1\x84</code> ('''without <code>\b</code>''') instead of <code>\bказино-форум\.рф\b</code> --[[User:Nullzero|Nullzero]] ([[User talk:Nullzero|talk]]) 20:00, 21 February 2016 (UTC)
:::: *sigh* somehow the workaround doesn't work with Thai characters, so I don't know if <code>\xd0\xba\xd0\xb0\xd0\xb7\xd0\xb8\xd0\xbd\xd0\xbe-\xd1\x84\xd0\xbe\xd1\x80\xd1\x83\xd0\xbc\.\xd1\x80\xd1\x84</code> will actually work or not. Please try it anyway... --[[User:Nullzero|Nullzero]] ([[User talk:Nullzero|talk]]) 20:24, 21 February 2016 (UTC)
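A hedged Python sketch of what seems to be going on here, assuming the blacklist regex runs over the UTF-8 bytes of the URL: <code>\b</code> only recognises ASCII word characters, so it can never match next to multi-byte Cyrillic text, while the byte-escaped pattern suggested above (without <code>\b</code>) does match.

```python
import re

domain = "казино-форум.рф".encode("utf-8")
url = b"http://" + domain

# The byte-escaped pattern suggested above, without \b:
pattern = (rb"\xd0\xba\xd0\xb0\xd0\xb7\xd0\xb8\xd0\xbd\xd0\xbe-"
           rb"\xd1\x84\xd0\xbe\xd1\x80\xd1\x83\xd0\xbc\.\xd1\x80\xd1\x84")
assert re.search(pattern, url)

# With \b the match fails: \xd0 etc. are not ASCII word bytes, so no
# word/non-word transition exists where \b expects one.
assert re.search(rb"\b" + pattern + rb"\b", url) is None
```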

===Free domain names===
* {{LinkSummary|ml}}
* {{LinkSummary|ga}}
* {{LinkSummary|cf}}
* {{LinkSummary|gq}}

.ml, .ga, .cf, and .gq offer free domain names [http://www.freenom.com/en/index.html?lang=en]. I'm sick of playing whack-a-mole with the TV show spam; is there anything else we can do? [[User:MER-C|MER-C]] ([[User talk:MER-C|talk]]) 13:38, 8 April 2016 (UTC)

:{{rto|MER-C}} could easily be blacklisted, provided that there is not too much regular material that needs to be linked on those sites. What countries do these belong to? --[[User:Beetstra|Dirk Beetstra]] <sup>[[User_Talk:Beetstra|<span style="color:#0000FF;">T</span>]] [[Special:Contributions/Beetstra|<span style="color:#0000FF;">C</span>]]</sup> (en: [[:en:User:Beetstra|U]], [[:en:User talk:Beetstra|T]]) 06:38, 20 April 2016 (UTC)
:For .ml we have 1581 links on en.wikipedia. It looks like the majority of those are .org.ml and .gov.ml and similar (many used as references). --[[User:Beetstra|Dirk Beetstra]] <sup>[[User_Talk:Beetstra|<span style="color:#0000FF;">T</span>]] [[Special:Contributions/Beetstra|<span style="color:#0000FF;">C</span>]]</sup> (en: [[:en:User:Beetstra|U]], [[:en:User talk:Beetstra|T]]) 06:41, 20 April 2016 (UTC)

::.gq looks like low hanging fruit: 21 links on en.wp, only half of which are in mainspace. [[User:MER-C|MER-C]] ([[User talk:MER-C|talk]]) 12:29, 2 May 2016 (UTC)
:::Less than half I would say, however some of those are 'official' (I see the registrar itself, and a university). Moreover, this blocks more than only en.wikipedia (though a quick check on other wikis does not enlarge the set of genuine links too much). If we write the rules so that the (large majority of the) currently used 'good' subdomains (on, say, the 5 major wikis) are excluded, I'll pull the trigger. --[[User:Beetstra|Dirk Beetstra]] <sup>[[User_Talk:Beetstra|<span style="color:#0000FF;">T</span>]] [[Special:Contributions/Beetstra|<span style="color:#0000FF;">C</span>]]</sup> (en: [[:en:User:Beetstra|U]], [[:en:User talk:Beetstra|T]]) 13:11, 2 May 2016 (UTC)
::::{{ping|MER-C}} are we getting much spam outside of enWP? If the spam is centred on enWP, can we try the local blacklist there initially? &nbsp;— [[user:billinghurst|billinghurst]] ''<span style="font-size:smaller">[[user talk:billinghurst|sDrewth]]</span>'' 05:11, 23 May 2016 (UTC)
:::::The situation is mostly under control on enWP -- we now have a private abuse filter in front of the spam blacklist which works fairly well. They've now turned to spamming via facebook, which is something I struggle to care about. Blocking these domains isn't necessary at this moment, but one sees parallels with the .tk and .co.nr situation. [[User:MER-C|MER-C]] ([[User talk:MER-C|talk]]) 07:10, 27 May 2016 (UTC)
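If TLD-wide rules with carve-outs were ever written, they could look like this sketch (all hostnames here are invented; note that Python lookbehinds must be fixed-width, hence one assertion per excluded host):

```python
import re

# Hypothetical sketch: block every .gq host except two invented "good"
# ones, using fixed-width negative lookbehinds as carve-outs.
rule = re.compile(r"(?<!registrar-site)(?<!example-uni)\.gq\b")

assert rule.search("http://buy-tv-shows.gq/page")        # spam domain: blocked
assert rule.search("http://registrar-site.gq/") is None  # carved out
assert rule.search("http://example-uni.gq/") is None     # carved out
```

Each new good subdomain needs its own lookbehind, which is part of why such rules are maintenance-heavy.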

===google and springer together===
* {{LinkSummary|google.com}}
I do not understand regular expressions at all. On the Finnish Wikipedia, the following link was blacklisted:
<pre>
https://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=2&ved=0ahUKEwj2y_u_p4zMAhWHKCwKHXfCDAgQFggkMAE
&url=http://www.springer.com/cda/content/document/cda_downloaddocument/9781441983893-c1.pdf
?SGWID%3D0-0-45-1149139-p174086675&usg=AFQjCNHKA4W5amgbAZGZXCgD5ZQy5tplQw&cad=rja
</pre>
Any ideas why? The IP in question has reported the problem on our [[:fi:Wikipedia:Ylläpitäjien_ilmoitustaulu#Roskapostisuodatin_esti_muokkaukseni|local admin noticeboard]], but I cannot help them. --[[User:Pxos|Pxos]] ([[User talk:Pxos|talk]]) 20:17, 13 April 2016 (UTC)

:{{user|Pxos}}, they should link directly to the Springer website. Google redirect links include link tracking. [[User:John Vandenberg|John Vandenberg]] ([[User talk:John Vandenberg|talk]]) 10:11, 19 April 2016 (UTC)

:{{rto|Pxos}} - use http://www.springer.com/cda/content/document/cda_downloaddocument/9781441983893-c1.pdf - this link is copied from the search-result page of google, it is not the actual link to the document. --[[User:Beetstra|Dirk Beetstra]] <sup>[[User_Talk:Beetstra|<span style="color:#0000FF;">T</span>]] [[Special:Contributions/Beetstra|<span style="color:#0000FF;">C</span>]]</sup> (en: [[:en:User:Beetstra|U]], [[:en:User talk:Beetstra|T]]) 06:37, 20 April 2016 (UTC)
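As an illustration of why the blocked link is only a Google redirect wrapper, the real target can be recovered from its <code>url=</code> query parameter:

```python
from urllib.parse import urlparse, parse_qs

# The blocked link from the report above; the actual document sits in the
# "url=" query parameter of the Google redirect.
blocked = (
    "https://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=2"
    "&ved=0ahUKEwj2y_u_p4zMAhWHKCwKHXfCDAgQFggkMAE"
    "&url=http://www.springer.com/cda/content/document/cda_downloaddocument/9781441983893-c1.pdf"
    "?SGWID%3D0-0-45-1149139-p174086675&usg=AFQjCNHKA4W5amgbAZGZXCgD5ZQy5tplQw&cad=rja"
)
target = parse_qs(urlparse(blocked).query)["url"][0]
print(target)  # the direct springer.com link that should be cited instead
```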
::{{closing}} direct links usable, redirecting links blocked. &nbsp;— [[user:billinghurst|billinghurst]] ''<span style="font-size:smaller">[[user talk:billinghurst|sDrewth]]</span>'' 15:08, 23 May 2016 (UTC)

=== Escaping dot in myrtlebeach.com regex ===
* <code>[a-z]myrtlebeach.com\b</code>

Should the <code>.</code> at the beginning of <code>.com</code> be escaped? – [[User:JonathanCross|JonathanCross]] ([[User talk:JonathanCross|talk]]) 15:01, 25 April 2016 (UTC)
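For reference, a quick Python check of what the unescaped dot does (the second hostname is invented to show the false positive):

```python
import re

loose  = re.compile(r"[a-z]myrtlebeach.com\b")   # the rule as listed
strict = re.compile(r"[a-z]myrtlebeach\.com\b")  # with the dot escaped

assert loose.search("http://websitedesigninmyrtlebeach.com/")
assert loose.search("inmyrtlebeachxcom")          # invented: the '.' matched 'x'
assert strict.search("inmyrtlebeachxcom") is None # escaping removes the false hit
```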

{{ping|Billinghurst}} looks like you added the regex in [[User:COIBot/XWiki/websitedesigninmyrtlebeach.com#Entry|this revision]]. It was based on [[User:COIBot/XWiki/websitedesigninmyrtlebeach.com]] which suggests escaping the dot. – [[User:JonathanCross|JonathanCross]] ([[User talk:JonathanCross|talk]]) 14:49, 14 May 2016 (UTC)
:The regex has been removed. Not sure there is any ongoing issue unless we get spam again. &nbsp;— [[user:billinghurst|billinghurst]] ''<span style="font-size:smaller">[[user talk:billinghurst|sDrewth]]</span>'' 03:45, 15 May 2016 (UTC)
:: Ah, great, thanks! – [[User:JonathanCross|JonathanCross]] ([[User talk:JonathanCross|talk]]) 15:31, 15 May 2016 (UTC)

==Discussion==
{{messagebox
|text= This section is for discussion of Spam blacklist issues among other users.
}}
===Expert maintenance===
One (soon) archived and rejected removal suggestion was about '''jxlalk.com''' matched by a filter intended to block '''xlalk.com'''. One user suggested that this side-effect might be as it should be, another user suggested that regular expressions are unable to distinguish these cases, and nobody has a clue when and why '''xlalk.com''' was blocked. I suggest finding an expert maintainer for this list, and removing all blocks older than 2010. The bots identifying abuse will restore still-needed ancient blocks soon enough, hopefully without any '''oogle''' matching '''google''' cases. &ndash;[[User:Be..anyone|Be..anyone]] ([[User talk:Be..anyone|talk]]) 00:50, 20 January 2015 (UTC)
:No, removing some of the old rules, before 2010 or even before 2007, will result in further abuse, some of the rules are intentionally wide as to stop a wide range of spamming behaviour, and as I have argued as well, I have 2 cases on my en.wikipedia list where companies have been spamming for over 7 years, have some of their domains blacklisted, and are ''still'' actively spamming related domains. Every single removal should be considered on a case-by-case basis. --[[User:Beetstra|Dirk Beetstra]] <sup>[[User_Talk:Beetstra|<span style="color:#0000FF;">T</span>]] [[Special:Contributions/Beetstra|<span style="color:#0000FF;">C</span>]]</sup> (en: [[:en:User:Beetstra|U]], [[:en:User talk:Beetstra|T]]) 03:42, 20 January 2015 (UTC)
::Just to give an example to this - redirect sites have been, and are, actively abused to circumvent the blacklist. Some of those were added before the arbitrary date of 2010. We are not going to remove those under the blanket of 'having been added before 2010', they will stay blacklisted. Some other domains are of similar gravity that they should never be removed. How are you, reasonably, going to filter out the rules that ''never'' should be removed. --[[User:Beetstra|Dirk Beetstra]] <sup>[[User_Talk:Beetstra|<span style="color:#0000FF;">T</span>]] [[Special:Contributions/Beetstra|<span style="color:#0000FF;">C</span>]]</sup> (en: [[:en:User:Beetstra|U]], [[:en:User talk:Beetstra|T]]) 03:52, 20 January 2015 (UTC)
:By the way, you say ".. intended to block '''xlalk.com''' .." .. how do you know? --[[User:Beetstra|Dirk Beetstra]] <sup>[[User_Talk:Beetstra|<span style="color:#0000FF;">T</span>]] [[Special:Contributions/Beetstra|<span style="color:#0000FF;">C</span>]]</sup> (en: [[:en:User:Beetstra|U]], [[:en:User talk:Beetstra|T]]) 03:46, 20 January 2015 (UTC)
::I know that nobody would block <code>icrosoft.com</code> if what they mean is <code>microsoft.com</code>, or '''vice versa'''. It's no shame to have no clue about regular expressions, a deficit we apparently share.{{=P}} &ndash;[[User:Be..anyone|Be..anyone]] ([[User talk:Be..anyone|talk]]) 06:14, 20 January 2015 (UTC)
:::I am not sure what you are referring to - I am not native in regex, but proficient enough. The rule was added to block, at least, xlale.com and xlalu.com (if it were ONLY these two, \bxlal(u|e)\.com\b or \bxlal[ue]\.com\b would have been sufficient), but it is impossible to find this far back what all was spammed; possibly xlali.com, xlalabc.com and abcxlale.com were abused by these proxy-spammers. --[[User:Beetstra|Dirk Beetstra]] <sup>[[User_Talk:Beetstra|<span style="color:#0000FF;">T</span>]] [[Special:Contributions/Beetstra|<span style="color:#0000FF;">C</span>]]</sup> (en: [[:en:User:Beetstra|U]], [[:en:User talk:Beetstra|T]]) 08:50, 20 January 2015 (UTC)
:xlalk.com may have been one of the cases, but one rule that was blacklisted ''before'' this blanket was imposed was 'xlale.com' (xlale.com rule was ''removed'' in a cleanout-session, after the blanket was added). --[[User:Beetstra|Dirk Beetstra]] <sup>[[User_Talk:Beetstra|<span style="color:#0000FF;">T</span>]] [[Special:Contributions/Beetstra|<span style="color:#0000FF;">C</span>]]</sup> (en: [[:en:User:Beetstra|U]], [[:en:User talk:Beetstra|T]]) 04:45, 20 January 2015 (UTC)
::The dots in administrative domains and DNS mean something, notably <code>foo.bar.example</code> is typically related to an administrative <code>bar.example</code> domain (ignoring well-known exceptions like <code>co.uk</code> etc., Mozilla+SURBL have lists for this), while <code>foobar.example</code> has nothing to do with <code>bar.example</code>. &ndash;[[User:Be..anyone|Be..anyone]] ([[User talk:Be..anyone|talk]]) 06:23, 20 January 2015 (UTC)
:::I know, but I am not sure how this relates to this suggested cleanup. --[[User:Beetstra|Dirk Beetstra]] <sup>[[User_Talk:Beetstra|<span style="color:#0000FF;">T</span>]] [[Special:Contributions/Beetstra|<span style="color:#0000FF;">C</span>]]</sup> (en: [[:en:User:Beetstra|U]], [[:en:User talk:Beetstra|T]]) 08:50, 20 January 2015 (UTC)
::::If your suggested clean-ups at some point don't match '''j'''xlalk.com, the request by a Chinese user would be satisfied&mdash;as noted, all I found out is a VirusTotal "clean"; it could still be a spam site if it ever was one.
::::The regexp could begin with "optionally any string ending with a dot" or similar before '''xlalk'''. There are "host name" RFCs (LDH: letter digit hyphen) up to IDNAbis (i18n domains), they might contain recipes. &ndash;[[User:Be..anyone|Be..anyone]] ([[User talk:Be..anyone|talk]]) 16:56, 20 January 2015 (UTC)
:::::What suggested cleanups? I am not suggesting any cleanup or blanket removal of old rules. --[[User:Beetstra|Dirk Beetstra]] <sup>[[User_Talk:Beetstra|<span style="color:#0000FF;">T</span>]] [[Special:Contributions/Beetstra|<span style="color:#0000FF;">C</span>]]</sup> (en: [[:en:User:Beetstra|U]], [[:en:User talk:Beetstra|T]]) 03:50, 21 January 2015 (UTC)
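To make the width of the disputed rule concrete, here is a hedged sketch assuming the entry is essentially <code>xlal[0-9a-z-]*\.com</code> (the expression seth used to search the logs below), against a narrow rule covering only the two sites actually cited:

```python
import re

wide   = re.compile(r"xlal[0-9a-z-]*\.com")  # shotgun rule: any xlal*.com
narrow = re.compile(r"\bxlal[ue]\.com\b")    # only the two sites actually cited

assert wide.search("http://xlale.com/") and wide.search("http://xlalu.com/")
assert wide.search("http://www.jxlalk.com/")           # collateral: jxlalk.com
assert narrow.search("http://www.jxlalk.com/") is None # narrow rule spares it
```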
*I have supported delisting above, having researched the history, posted at [[Talk:Spam_blacklist/About#Old_blacklisting_with_scanty_history]]. If it desired to keep xlale.com and xlalu.com on the blacklist (though it's useless at this point), the shotgun regex could be replaced with two listings, easy peasy. --[[User:Abd|Abd]] ([[User talk:Abd|talk]]) 01:42, 21 January 2015 (UTC)
*:As I said earlier, are you sure that it is only xlale and xlalu, those were the two I found quickly, there may have been more, I do AGF that the admin who added the rule had reason to blanket it like this. --[[User:Beetstra|Dirk Beetstra]] <sup>[[User_Talk:Beetstra|<span style="color:#0000FF;">T</span>]] [[Special:Contributions/Beetstra|<span style="color:#0000FF;">C</span>]]</sup> (en: [[:en:User:Beetstra|U]], [[:en:User talk:Beetstra|T]]) 03:50, 21 January 2015 (UTC)
:::Of course I'm not sure. There is no issue of bad faith. He had reason to use regex, for two sites, and possibly suspected additional minor changes would be made. But he only cited two sites. One of the pages was deleted, and has IP evidence on it, apparently, which might lead to other evidence from other pages, including cross-wiki. But the blacklistings themselves were clearly based on enwiki spam and nothing else was mentioned. This blacklist was the enwiki blacklist at that time. After enwiki got its own blacklist, the admin who blacklisted here attempted to remove all his listings. This is really old and likely obsolete stuff. --[[User:Abd|Abd]] ([[User talk:Abd|talk]]) 20:07, 21 January 2015 (UTC)
::::3 at least. And we do not have to present a full case for blacklisting (we often don't, per [[:en:WP:BEANS]] and sometimes privacy concerns), we have to show sufficient abuse that needs to be stopped. And if that deleted page was mentioned, then certainly there was reason to believe that there were cross-wiki concerns.
::::Obsolete, how do ''you'' know? Did you go through the cross-wiki logs of what was attempted to be spammed? Do you know how often some of the people active here are still blacklisting spambots using open proxies? Please stop with these sweeping statements until you have fully searched for all evidence. 'After enwiki got its own blacklist, the admin who blacklisted here attempted to remove all his listings.' - no, that was not what happened. --[[User:Beetstra|Dirk Beetstra]] <sup>[[User_Talk:Beetstra|<span style="color:#0000FF;">T</span>]] [[Special:Contributions/Beetstra|<span style="color:#0000FF;">C</span>]]</sup> (en: [[:en:User:Beetstra|U]], [[:en:User talk:Beetstra|T]]) 03:16, 22 January 2015 (UTC)
:::::Hi!
:::::I searched all the logs (Special:Log/spamblacklist) of several wikis using the regexp entry /xlal[0-9a-z-]*\.com/.
:::::There were almost no hits:
w:ca: 0
w:ceb: 0
w:de: 0
w:en: 1: 20131030185954, xlalliance.com
w:es: 1: 20140917232510, xlalibre.com
w:fr: 0
w:it: 0
w:ja: 0
w:nl: 0
w:no: 0
w:pl: 0
w:pt: 0
w:ru: 0
w:sv: 0
w:uk: 0
w:vi: 0
w:war: 0
w:zh: 1: 20150107083744, www.jxlalk.com
:::::So there was just one single hit at w:en (not even in the main namespace, but in the user namespace), one in w:es, and one in w:zh (probably a false positive). So I agree with [[user:Abd]] that removing of this entry from the sbl would be the best solution. -- [[User:Lustiger seth|seth]] ([[User talk:Lustiger seth|talk]]) 18:47, 21 February 2015 (UTC)
::::::Finally an argument based on evidence (these logs should be public, not admin-only - can we have something like this in a search-engine, this may come in handy in some cases!). Consider removed. --[[User:Beetstra|Dirk Beetstra]] <sup>[[User_Talk:Beetstra|<span style="color:#0000FF;">T</span>]] [[Special:Contributions/Beetstra|<span style="color:#0000FF;">C</span>]]</sup> (en: [[:en:User:Beetstra|U]], [[:en:User talk:Beetstra|T]]) 06:59, 22 February 2015 (UTC)
::::::By the way, [[User:Lustiger seth|Seth]], this is actually no hits - all three you show here are collateral. Thanks for this evidence, this information would be useful on more occasions to make an informed decision (also, ''vide infra''). --[[User:Beetstra|Dirk Beetstra]] <sup>[[User_Talk:Beetstra|<span style="color:#0000FF;">T</span>]] [[Special:Contributions/Beetstra|<span style="color:#0000FF;">C</span>]]</sup> (en: [[:en:User:Beetstra|U]], [[:en:User talk:Beetstra|T]]) 07:25, 22 February 2015 (UTC)
:::::::I am not sure that we want the Special page to be public, though I can see some value in being able to have something at ToolLabs to be available to run queries, or something available to be run through [http://quarry.wmflabs.org quarry]. &nbsp;— [[user:billinghurst|billinghurst]] ''<span style="font-size:smaller">[[user talk:billinghurst|sDrewth]]</span>'' 10:57, 22 February 2015 (UTC)
::::::::Why not public? There is no reason to hide this, this is not BLP or COPYVIO sensitive information in 99.99% of the hits. The chance that this is non-public information is just as big as for certain blocks to be BLP violations (and those ''are'' visible) ... --[[User:Beetstra|Dirk Beetstra]] <sup>[[User_Talk:Beetstra|<span style="color:#0000FF;">T</span>]] [[Special:Contributions/Beetstra|<span style="color:#0000FF;">C</span>]]</sup> (en: [[:en:User:Beetstra|U]], [[:en:User talk:Beetstra|T]]) 04:40, 23 February 2015 (UTC)

==== Now restarting the original debate ====
As the blacklist ''is'' long, and likely contains rules that are too wide a net and which are so old that they are utterly obsolete (or even, may be giving collateral damage on a regular basis), can we see whether we can set up some criteria (that can be 'bot tested'):
# Rule added > 5 years ago.
# All hits (determined on a significant number of wikis), over the last 2 years (for now: since the beginning of the log = ~1.5 years) are collateral damage - NO real hits.
# Site is not a redirect site (should not be removed, even if not abused), is not a known phishing/malware site (to protect others), or a true copyright violating ''site''. (This is hard to bot-test; we may need someone to look over the list and take out the obvious ones.)
We can make some mistakes on old rules if they are not abused (remove some that actually fail #3) - if they become a nuisance/problem again, we will see them again, and they can be speedily re-added .. thoughts? --[[User:Beetstra|Dirk Beetstra]] <sup>[[User_Talk:Beetstra|<span style="color:#0000FF;">T</span>]] [[Special:Contributions/Beetstra|<span style="color:#0000FF;">C</span>]]</sup> (en: [[:en:User:Beetstra|U]], [[:en:User talk:Beetstra|T]]) 07:25, 22 February 2015 (UTC)
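The bot-testable part of these criteria could be sketched like this (entry data, field names and dates are all hypothetical; criterion 3 still needs the manual pass described above):

```python
from datetime import datetime, timezone

# Hypothetical sketch of the proposed bot test: each entry carries the date
# it was added and its logged hit count; redirect/shortener sites are kept
# regardless of hits.
entries = [
    {"rule": r"\bexample-spam\.com\b", "added": "2007-03-01", "hits": 0,  "redirect": False},
    {"rule": r"\btinyurl\.com\b",      "added": "2006-01-01", "hits": 0,  "redirect": True},
    {"rule": r"\bfresh-spam\.net\b",   "added": "2014-06-01", "hits": 12, "redirect": False},
]

def removal_candidates(entries, now=None, min_age_years=5):
    """Return rules that are old, never hit anything, and are not redirects."""
    now = now or datetime.now(timezone.utc)
    out = []
    for e in entries:
        added = datetime.fromisoformat(e["added"]).replace(tzinfo=timezone.utc)
        age_years = (now - added).days / 365.25
        if age_years > min_age_years and e["hits"] == 0 and not e["redirect"]:
            out.append(e["rule"])
    return out

print(removal_candidates(entries))  # only the old, zero-hit, non-redirect rule
```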
:@{{ping|hoo man}} you have worked on clean up before, some of your thoughts would be welcomed. &nbsp;— [[user:billinghurst|billinghurst]] ''<span style="font-size:smaller">[[user talk:billinghurst|sDrewth]]</span>'' 10:53, 22 February 2015 (UTC)
::Doing this kind of clean up is rather hard to automate. What might work better for starters could be removing rules that didn't match anything since we started logging hits. That would presumably cut down the whole blacklist considerably. After that we could re-evaluate the rest of the blacklist, maybe following the steps outlined above. - [[user:Hoo man|Hoo man]] <small>([[user talk:Hoo man|talk]])</small> 13:33, 22 February 2015 (UTC)

:::Not hitting anything is dangerous .. there are likely some somewhat obscure redirect sites on it which may not have been attempted to be abused (though, also those could be re-added). But we could do test-runs easily - just save a cleaned up copy of the blacklist elsewhere, and diff them against the current list, and see what would get removed.
:::Man, I want this showing up in the RC-feeds, then LiWa3 could store them in the database (and follow redirects to show what people wanted to link to ..). --[[User:Beetstra|Dirk Beetstra]] <sup>[[User_Talk:Beetstra|<span style="color:#0000FF;">T</span>]] [[Special:Contributions/Beetstra|<span style="color:#0000FF;">C</span>]]</sup> (en: [[:en:User:Beetstra|U]], [[:en:User talk:Beetstra|T]]) 03:30, 23 February 2015 (UTC)

:Hi!
:I created a table of hits of blocked link additions. Maybe it's of use for the discussion: [[User:lustiger_seth/sbl_log_stats]] (1.8 MB wiki table).
:I'd appreciate, if we deleted old entries. -- [[User:Lustiger seth|seth]] ([[User talk:Lustiger seth|talk]]) 22:12, 26 February 2015 (UTC)
::Hi, thank you for this, it gives a reasonable idea. Do you know if the rule-hits were all 'correct' (for those that do show that they were hit) or mainly/all false positives (if they are false-positive hitting, we could based on this also decide to tighten the rule to avoid the false positives). Rules with all-0 (can you include a 'total' score?) would certainly be candidates for removal (though still determine first whether they are 'old' and/or are no-no sites before removal). I am also concerned that this is not including other wikifarms - some sites may be problematic on other wikifarms, or hitting a large number of smaller wikis (which have less control due to low admin numbers). --[[User:Beetstra|Dirk Beetstra]] <sup>[[User_Talk:Beetstra|<span style="color:#0000FF;">T</span>]] [[Special:Contributions/Beetstra|<span style="color:#0000FF;">C</span>]]</sup> (en: [[:en:User:Beetstra|U]], [[:en:User talk:Beetstra|T]]) 03:36, 8 March 2015 (UTC)
:::Hi!
:::We probably can't get information of false positives automatically. I added a 'sum' column.
:::Small wikis: If you give me a list of the relevant ones, I can create another list. -- [[User:Lustiger seth|seth]] ([[User talk:Lustiger seth|talk]]) 10:57, 8 March 2015 (UTC)
::::Thanks for the sum-column. Regarding the false-positives, it would be nice to be able to quickly see what actually got blocked by a certain rule, I agree that that then needs a manual inspection, but the actual number of rules with zero hits on the intended stuff to be blocked is likely way bigger than what we see.
::::How would you define the relevant small wikis - that is depending on the link that was spammed? Probably the best is to parse all ~750 wiki's, make a list of rules with 0 hits, and a separate list of rules with <10 hits (and including there the links that were blocked), and exclude everything above that. Then these resulting rules should be filtered by those which were added >5 years ago. That narrows down the list for now, and after a check for obvious no-no links, those could almost be blanket-removed (just excluding the ones with real hits, the obvious redirect sites and others - which needs a manual check). --[[User:Beetstra|Dirk Beetstra]] <sup>[[User_Talk:Beetstra|<span style="color:#0000FF;">T</span>]] [[Special:Contributions/Beetstra|<span style="color:#0000FF;">C</span>]]</sup> (en: [[:en:User:Beetstra|U]], [[:en:User talk:Beetstra|T]]) 06:59, 9 March 2015 (UTC)
:::::Hi!
:::::At [[User:Lustiger_seth/sbl_log_stats/all_wikis_no_hits]] there's a list containing ~10k entries that never triggered the sbl during 2013-sep and 2015-feb anywhere (if my algorithm is correct).
:::::If you want to get all entries older than 5 years, then it should be sufficient to use only the entries in that list until (and including) <code>\bbudgetgardening\.co\.uk\b</code>.
:::::So we could delete ~5766 entries. What do you think? Shall we give it a try? -- [[User:Lustiger seth|seth]] ([[User talk:Lustiger seth|talk]]) 17:06, 18 April 2015 (UTC)
::::::The question is, how many of those are still existing redirect sites etc. Checking 5800 is quite a job. On the other hand, with LiWa3/COIBot detecting - it is quite easy to re-add them. --[[User:Beetstra|Dirk Beetstra]] <sup>[[User_Talk:Beetstra|<span style="color:#0000FF;">T</span>]] [[Special:Contributions/Beetstra|<span style="color:#0000FF;">C</span>]]</sup> (en: [[:en:User:Beetstra|U]], [[:en:User talk:Beetstra|T]]) 19:28, 21 April 2015 (UTC)
:::::::According to the last few lines, I've removed 124kB of non-hitting entries now. I did not remove all of them, because some were url shorteners and I guess, that they are a special case, even if not used yet. -- [[User:Lustiger seth|seth]] ([[User talk:Lustiger seth|talk]]) 22:25, 16 September 2015 (UTC)

==== Blacklisting spam URLs used in references ====
Looks like a site is using the "references" section as a spam farm. If a site is added to this list, can the blacklist block the spam site? [[User:Raysonho|Raysonho]] ([[User talk:Raysonho|talk]]) 17:45, 5 September 2015 (UTC)
:Yes they can.--<font style="font-weight: bold; background-color: #FF0000; color: #ffffff;">[[User:Aldnonymous|AldNonymous]]</font><sup>[[User_talk:Aldnonymous|Bicara?]]</sup> 21:56, 5 September 2015 (UTC)
::Thanks, Aldnonymous! [[User:Raysonho|Raysonho]] ([[User talk:Raysonho|talk]]) 00:07, 6 September 2015 (UTC)

=== url shorteners ===
Hi!<br />
IMHO the url shorteners should be grouped in one section, because they are a special group of urls that need a special treatment. A url shortener should not be removed from sbl unless the domain is dead, even if it has not been used for spamming, right? -- [[User:Lustiger seth|seth]] ([[User talk:Lustiger seth|talk]]) 22:11, 28 September 2015 (UTC)
:That would be beneficial to have them in a section. Problem is, most of them are added by script, and are hence just put at the bottom. --[[User:Beetstra|Dirk Beetstra]] <sup>[[User_Talk:Beetstra|<span style="color:#0000FF;">T</span>]] [[Special:Contributions/Beetstra|<span style="color:#0000FF;">C</span>]]</sup> (en: [[:en:User:Beetstra|U]], [[:en:User talk:Beetstra|T]]) 04:51, 4 October 2015 (UTC)
::Maybe it would seem more preferable to have "spam blacklist" be a compilation file, made of files one of which would be "spam blacklist.shorteners" &nbsp;— [[user:billinghurst|billinghurst]] ''<span style="font-size:smaller">[[user talk:billinghurst|sDrewth]]</span>'' 12:15, 24 December 2015 (UTC)
:::This seems like a nice idea. Would certainly help with cleaning up of it (which we don't do nowadays). IIRC, it is technically possible to have different spam blacklist pages so this is technically possible, just needs an agreement among us and someone to do it. --[[User:Glaisher|Glaisher]] ([[User talk:Glaisher#top|talk]]) 12:17, 24 December 2015 (UTC)
{{ping|Beetstra|Lustiger seth|Glaisher|Vituzzu|MarcoAurelio|Hoo man|Legoktm}} and others. What are your thoughts on a concatenation of files as described above. If we have a level of agreement, then we can work out the means to an outcome. &nbsp;— [[user:billinghurst|billinghurst]] ''<span style="font-size:smaller">[[user talk:billinghurst|sDrewth]]</span>'' 12:39, 25 January 2016 (UTC)
:* I am somewhat in favour of this - split the list into a couple of sublists - one for url-shorteners, one for 'general terms' (mainly at the top of the list currently), and the regular list. It would however need an adaptation of the blacklist script (I've done something similar for en.wikipedia (a choice of blacklisting or revertlisting for each link), I could give that hack a try here, time permitting). AFAIK the extension in the software is capable of handling this. Also, it would be beneficial for the cleanout work, that the blacklist itself is 'sectioned' into years. Although being 8 years old is by no means a reason to expect that the spammers are not here anymore (I have two cases on en.wikipedia that are older than that), we do tend to be more lenient with the old stuff. (on the other hand .. why bother .. the benefits are mostly on our side so we don't accidentally remove stuff that should be solved by other means). --[[User:Beetstra|Dirk Beetstra]] <sup>[[User_Talk:Beetstra|<span style="color:#0000FF;">T</span>]] [[Special:Contributions/Beetstra|<span style="color:#0000FF;">C</span>]]</sup> (en: [[:en:User:Beetstra|U]], [[:en:User talk:Beetstra|T]]) 13:05, 25 January 2016 (UTC)
:: Is it really possible to have different spam blacklist pages? What would happen to the sites that use this very list to block unwanted spam? &mdash;[[User:MarcoAurelio|Marco]][[User talk:MarcoAurelio|Aurelio]] 14:23, 25 January 2016 (UTC)
:::It is technically possible. But this would mean that if we move all the URL shortener entries to a new page, all sites using it currently would have to update the extension or explicitly add the new blacklist to their config, or these links would be allowed on their sites (and notifying all these wikis about this breaking change is next to impossible). Another issue I see is that a new blacklist file means there would be a separate network request on cache miss, so there might be a little delay in page saves (but I'm not sure whether this delay would be noticeable). --[[User:Glaisher|Glaisher]] ([[User talk:Glaisher#top|talk]]) 15:38, 25 January 2016 (UTC)

::::Hi!
::::Before we activate such a feature, we should update some scripts that don't know anything about sbl subpages yet.
::::Apart from that I don't think that a sectioning into years would be of much use. One can use the (manual) log for this. A subject-oriented sectioning could be of more use, but this would also be more difficult for us. -- [[User:Lustiger seth|seth]] ([[User talk:Lustiger seth|talk]]) 20:49, 27 January 2016 (UTC)

=== Unreadable ===
Why is the list not alphabetical, so I can look up whether a certain site is listed and then also look up when it was added? --[[User:Corriebertus|Corriebertus]] ([[User talk:Corriebertus|talk]]) 08:55, 21 October 2015 (UTC)
:hi!
:there are advantages and disadvantages to an alphabetical list. for example it would be very helpful to group all url shorteners in one place (see discussion thread above). sometimes it's better to have a chronological list. additionally, regexps can't really be sorted domain-alphabetically.
:if you want to search the blacklist, you can use a tool like https://tools.wmflabs.org/searchsbl/. -- [[User:Lustiger seth|seth]] ([[User talk:Lustiger seth|talk]]) 17:16, 30 October 2015 (UTC)

Revision as of 15:40, 11 August 2016

Shortcuts: WM:SPAM · WM:SBL
The associated page is used by the MediaWiki Spam Blacklist extension, and lists regular expressions which cannot be used in URLs in any page in Wikimedia Foundation projects (as well as many external wikis). Any Meta administrator can edit the spam blacklist, either manually or with SBHandler. For more information on what the spam blacklist is for, and the processes used here, please see Spam blacklist/About.

Proposed additions
Please provide evidence of spamming on several wikis. Spam that only affects a single project should go to that project's local blacklist. Exceptions include malicious domains and URL redirector/shortener services. Please follow this format. Please check back after submitting your report, as there could be questions regarding your request.
Proposed removals
Please check our list of requests which repeatedly get declined. Typically, we do not remove domains from the spam blacklist in response to site-owners' requests. Instead, we de-blacklist sites when trusted, high-volume editors request the use of blacklisted links because of their value in support of our projects. Please consider whether requesting whitelisting on a specific wiki for a specific use is more appropriate - that is very often the case.
Other discussion
Troubleshooting and problems - If there is an error in the blacklist (e.g. a regex error) which is causing problems, please raise the issue here.
Discussion - Meta-discussion concerning the operation of the blacklist and related pages, and communication among the spam blacklist team.
#wikimedia-external-links (connect) - Real-time IRC chat for co-ordination of activities related to maintenance of the blacklist.
Whitelists
There is no global whitelist, so if you are seeking a whitelisting of a url at a wiki then please address such matters via use of the respective Mediawiki talk:Spam-whitelist page at that wiki, and you should consider the use of the template {{edit protected}} or its local equivalent to get attention to your edit.

Please sign your posts with ~~~~ after your comment. This leaves a signature and timestamp so conversations are easier to follow.


Completed requests are marked as {{added}}/{{removed}} or {{declined}}, and are generally archived quickly. Additions and removals are logged · current log 2024/05.


Proposed additions


netflix spammer



  • facebook.com/Netflixsteraming100

It may even be worth considering 'steraming' typo varieties. Ping MER-C. --Dirk Beetstra T C (en: U, T) 03:43, 13 March 2016 (UTC)

I don't particularly care about spam on Facebook, even if it appears on Wikimedia sites. Please report it if you have an account (I don't), as links to pirated TV shows are likely against their TOS. MER-C (talk) 08:35, 14 March 2016 (UTC)
@MER-C: What is on Facebook I don't care about either (I do have an account) .. but if links to Facebook get spammed here, as with this link, maybe the whole thing could go on the blacklist. I pinged you at first because of the 'typosquatting': steraming .. worth just blocking that whole word and taking out more than just this? --Dirk Beetstra T C (en: U, T) 08:39, 14 March 2016 (UTC)










Some IPs that XLinkBot caught. --Dirk Beetstra T C (en: U, T) 08:43, 14 March 2016 (UTC)

Hmm .. also permalinks into facebook .. https://en.wikipedia.org/w/index.php?title=Ex_on_the_Beach_(series_4)&diff=prev&oldid=709006958 .. difficult to weed out and block all of those. --Dirk Beetstra T C (en: U, T) 08:47, 14 March 2016 (UTC)

browse-read.com



Stumbled across this in an OTRS complaint (Ticket#2016030310023685), as someone had removed a link to a seemingly valid page. Searching a little to see if the site was legit, I found that only some pages at site:browse-read.com were available, and searching some more I found that some of the pages were the usual "hot pics" and not only "hot vegetables". The hot pics seem to have disappeared since. I removed the link at nowiki, but was reverted at enwiki. I removed one too many links at enwiki, sorry.

Looking at the site I wonder if this is an up-and-coming linkfarm that has a legit front page and only lets users navigate to other legit pages. It could also be an old domain that is reused for a new site, which would explain the strange hits in search engines. It could be interesting to identify what's really going on with this site, but I don't have the time for further investigation. — Jeblad 14:49, 13 March 2016 (UTC)

seosprint.net



An SEO site working by recruiting referrals. The site does not contain any useful information, but there is a risk of spam link additions. --Максим Підліснюк (talk) 01:30, 11 May 2016 (UTC)

@Максим Підліснюк: Hi, as far as I can see this is a single-wiki issue, so please request local blacklisting first. Regards.--Syum90 (talk) 16:23, 10 August 2016 (UTC)

bilder-hamburg.info



  • Spammed on: many different (also smaller) Wikipedias (maybe also WikiVoyage?)
  • Topic: tourist features in Germany (and especially Hamburg)
  • no user name
  • many different dynamic IP addresses. Examples:






There are plenty of pictures of those POIs at Wikimedia Commons. We do not need to link extensively to a single website for such common images. -- 77.6.13.233 21:14, 28 April 2016 (UTC)

While I understand why this has been brought here, I am loath to act when the individual wikis have not. The domain is not on any blacklist, and the cross-wiki linking amounts to some 279 records; top 10 wikis where bilder-hamburg.info has been added: w:en (23), w:zh (9), w:sv (8), w:ru (8), w:ja (7), w:pt (6), w:ko (6), w:nl (5), w:no (5), w:tr (5). I would like to see a larger commentary, or a clear indication of abuse in the form of wikis removing these links.  — billinghurst sDrewth 15:05, 23 May 2016 (UTC)

translate.google.[a-z]{2,5}/translate and translate.googleusercontent.com/translate





Hi!
There have already been several discussions on this topic (see Talk:Spam_blacklist/Archives/2008-02#WebWarper, Talk:Spam_blacklist/Archives/2013-04#Google_Translate_as_a_universal_URL_redirector), but I think it won't do any harm to discuss it again. :-)
As demonstrated at dewiki it is possible to use the Google translator to circumvent the SBL. User:Boshomi stated that in at least 2 cases the Google translator was used with blacklisted URLs. Still, I'm not sure whether the benefit of such translations on talk pages outweighs the risk of SBL circumvention. I guess that at least in the dewiki main namespace the Google translations are unwanted.
Maybe in the future a MediaWiki extension for automatic translations should be developed that does not depend on just one (Google) translator. -- seth (talk) 06:40, 15 June 2016 (UTC)

@Lustiger seth: our general rule on URL shorteners is to blacklist on sight (even before use occurs). IMHO, there is no reason in mainspace to ever link to these services (you link to the data, and then click 'translate' yourself if necessary - I would get a bit cross if I got automatically redirected to an English translation of a document in Dutch, German, Frisian, Italian .... It is not for the editor to decide which translation I should use, and if they used a translation to reference an article then that is a dangerous practice in itself). And actually that is similar outside of mainspace. I think this could be blacklisted similar to the /url? link of Google search results pages. --Dirk Beetstra T C (en: U, T) 11:05, 15 June 2016 (UTC)
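For illustration, a rule of the shape named in this section's heading can be exercised in Python. This is a sketch only; the live SpamBlacklist extension applies its own matching pipeline:

```python
import re

# Blacklist-style rule from the section heading: Google Translate
# "translate" endpoints under any 2-5 letter TLD.
rule = re.compile(r"\btranslate\.google\.[a-z]{2,5}/translate\b")

# A translation URL wrapping another page would be caught...
print(bool(rule.search(
    "https://translate.google.com/translate?sl=de&u=http://example.com")))  # True
# ...while the translator's front page alone is not.
print(bool(rule.search("https://translate.google.com/")))  # False
```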

state.gift



A copy of Wikipedia, but it ignores the licence terms; see de:MediaWiki_Diskussion:Spam-blacklist#state.gift. There is no useful usage in any article of any language. Boshomi (talk) 20:56, 1 August 2016 (UTC)

qoo.by



Url shortener. Track13 0_o 15:33, 8 August 2016 (UTC)
@Track13: Added to Spam blacklist. --Syum90 (talk) 15:52, 8 August 2016 (UTC)

testosteronesboosterweb.com



Spambot (w:Special:Undelete/User:Lanayica/sandbox). MER-C (talk) 10:48, 10 August 2016 (UTC)

@MER-C: Added to Spam blacklist. Syum90 (talk) 10:56, 10 August 2016 (UTC)

url.org



Abused directory-type listing and url shortener  — billinghurst sDrewth 04:36, 11 August 2016 (UTC)

Added  — billinghurst sDrewth 04:40, 11 August 2016 (UTC)

eyeluminoushelps.com



Spambot (w:Special:Undelete/User:Bonwasdfita). MER-C (talk) 07:11, 11 August 2016 (UTC)

@MER-C: Added to Spam blacklist. --Syum90 (talk) 08:23, 11 August 2016 (UTC)

Proposed additions (Bot reported)

This section is for domains which have been added to multiple wikis as observed by a bot.

These are automated reports, please check the records and the link thoroughly, it may report good links! For some more info, see Spam blacklist/Help#COIBot_reports. Reports will automatically be archived by the bot when they get stale (less than 5 links reported, which have not been edited in the last 7 days, and where the last editor is COIBot).

Sysops
  • If the report contains links to less than 5 wikis, then only add it when it is really spam
  • Otherwise just revert the link-additions, and close the report; closed reports will be reopened when spamming continues
  • To close a report, change the LinkStatus template to closed ({{LinkStatus|closed}})
  • Please place any notes in the discussion section below the HTML comment

COIBot

The LinkWatchers report domains meeting the following criteria:

  • When a user mainly adds this link, and the link has not been used too much, and this user adds the link to more than 2 wikis
  • When a user mainly adds links on one server, and links on the server have not been used too much, and this user adds the links to more than 2 wikis
  • If ALL links are added by IPs, and the link is added to more than 1 wiki
  • If a small range of IPs have a preference for this link (but it may also have been added by other users), and the link is added to more than 1 wiki.
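The reporting criteria above can be paraphrased in code. This is a hypothetical sketch: the 0.8 and 50 thresholds for "mainly" and "not used too much" are invented for illustration, and none of these names come from COIBot's actual source.

```python
# Hypothetical paraphrase of the LinkWatcher reporting criteria; the 0.8
# and 50 thresholds are illustrative assumptions, not COIBot's real values.
def should_report(adds_by_user: int, total_adds: int, wikis: int,
                  all_by_ips: bool, small_ip_range_prefers: bool) -> bool:
    mainly_this_user = total_adds > 0 and adds_by_user / total_adds >= 0.8
    not_used_too_much = total_adds < 50
    if mainly_this_user and not_used_too_much and wikis > 2:
        return True                      # criteria 1 and, loosely, 2
    if all_by_ips and wikis > 1:
        return True                      # criterion 3
    if small_ip_range_prefers and wikis > 1:
        return True                      # criterion 4
    return False

print(should_report(9, 10, 3, False, False))   # True: one dominant user, 3 wikis
print(should_report(1, 100, 1, False, False))  # False: widely used, single wiki
```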
COIBot's currently open XWiki reports
List Last update By Site IP R Last user Last link addition User Link User - Link User - Link - Wikis Link - Wikis
vrsystems.ru · 2023-06-27 15:51:16 · COIBot · 195.24.68.17 · 192.36.57.94, 193.46.56.178, 194.71.126.227, 93.99.104.93 · 2070-01-01 05:00:00 · 4 · 4

Proposed removals

This section is for proposing that a website be unlisted; please add new entries at the bottom of the section.

Remember to provide the specific domain blacklisted, links to the articles they are used in or useful to, and arguments in favour of unlisting. Completed requests will be marked as {{removed}} or {{declined}} and archived.

See also /recurring requests for repeatedly proposed (and refused) removals.

Notes:

  • The addition or removal of a domain from the blacklist is not a vote; please do not bold the first words in statements.
  • This page is for the removal of domains from the global blacklist, not for removal of domains from the blacklists of individual wikis. For those requests please take your discussion to the pertinent wiki, where such requests would be made at Mediawiki talk:Spam-blacklist at that wiki. Search spamlists — remember to enter any relevant language code

tollyspice.com



I don't know why I am not able to add this domain to Wikipedia references while I am able to add links from all other websites. Please remove the block. I request!

Not blacklisted on Meta - it would appear to be a local (en wp) issue. You should request removal there. Thanks --Herby talk thyme 14:02, 8 April 2016 (UTC)
Nothing to do, not globally blacklisted. It is blacklisted at en.wp per this request. As commented by Herby, you should request removal there. Syum90 (talk) 09:37, 26 April 2016 (UTC)

mpoc.org.my



The link is the official website of the Malaysian Palm Oil Council. The council is actually under the w:Ministry of Plantation Industries and Commodities (Malaysia). May I know why mpoc.org.my was blacklisted? I kindly hope the link can be removed from the blacklist. Alexander Iskandar (talk) 12:27, 21 March 2016 (UTC)

This is not blacklisted here but on local wikis (presumably they had reason), so it cannot be removed from here. Local wiki requests would be needed. --Herby talk thyme 14:05, 8 April 2016 (UTC)
Nothing to do, not globally blacklisted. It is blacklisted at several wikis as you can see here. As commented by Herby, you should request removal there. Syum90 (talk) 09:46, 26 April 2016 (UTC)

bitcointalk.org



The most popular English-language bitcoin forum; it appears to have been added to the blacklist due to spam, but it contains much information about bitcoin and altcoin history. I would consider it a good source because it is the main publishing venue for many of the official developments of alternative cryptocurrencies. Links have been requested for whitelisting multiple times on the English Wikipedia (e.g. w:Dogecoin, w:Litecoin).— The preceding unsigned comment was added by Liance (talk) 01:19 26 March 2016 (UTC)

I see no reason for this website to be blocked. A partial match was also made for http://web.archive.org/web/20131016000457/https://bitcointalk(dot)org/index.php?topic=822.msg9519. Page concerned is History of Bitcoin on en.wiki, which says it is on the global blacklist.— The preceding unsigned comment was added by Kernosky (talk) 13:58 24 April 2016 (UTC)

I too was confused by this. Apparently all forum links are blacklisted by default, from what I see in previous discussions of bitcointalk(dot)org... I think there were also spam problems from that site. I am trying to get a specific link removed from the blacklist here: Quote by Andreas Antonopoulos on Bitcoin Talk. Will see how that goes. –JonathanCross (talk) 15:18, 25 April 2016 (UTC)
You might be better to seek whitelisting of individual or component urls. Generally forums are not authoritative, and spammed urls of forums are quite problematic. Whitelists at local wikis are provided for exactly this reason, to allow specific uses of blocked domains.  — billinghurst sDrewth 05:21, 23 May 2016 (UTC)

tinapa.com.ng



I don't understand why this website was blocked, but it is the official website of the Tinapa project and the block prevents me from adding the website as an official website to one of the pages that I'm working on: Tinapa Shopping Complex.--Jamie Tubers (talk) 00:53, 20 April 2016 (UTC)

It is blocked as it was being spammed by spambots. If you think that it should be added at English Wikipedia, then please apply for a whitelist, or partial whitelist at w:Mediawiki talk:Spam-whitelist  — billinghurst sDrewth 05:09, 23 May 2016 (UTC)

borgenproject.org



This is a respected nonprofit organization that has been in operation for 14 years. They work at the political level combating global poverty, and I'm guessing they were flagged by someone with differing views. The organization has thousands of supporters, and there has been a lot of positive media coverage of the organization in The Seattle Times, Huffington Post and other media outlets.—The preceding unsigned comment was added by Madisonkoene (talk) 20:02, 19 July 2016‎

@Madisonkoene: wrong, please do not suggest inappropriate behaviour by requesters and/or blacklisting admins if you are not aware of the history. This was plainly spammed using different accounts and IPs on mainly en.wikipedia (but with a small cross-wiki aspect to it):


















From the edits it is quite clear that someone was trying to give this organisation more exposure.
Now, that being said, this has been on the list for 8 1/2 years.... can you indicate what the use of this link to Wikipedia would be, and whether that use would be sufficient to consider de-blacklisting over selective whitelisting of a few links/domains? --Dirk Beetstra T C (en: U, T) 05:33, 20 July 2016 (UTC)

rocketlanguages.com



We are the owners of RocketLanguages.com, which has been globally blacklisted. Please see our Wikipedia page: https://en.wikipedia.org/wiki/Rocket_Languages_(software). We would like to request removal from this list as we are a reliable source of language learning. We seem to have been blacklisted because of the site "url9.de", with which we have no current affiliation.

Troubleshooting and problems

This section is for comments related to problems with the blacklist (such as incorrect syntax or entries not being blocked), or problems saving a page because of a blacklisted link. This is not the section to request that an entry be unlisted (see Proposed removals above).

derefer.unbubble.eu deblock





This service is used 24,923 times in main space on dewiki! It is used to clean up Special:Linksearch from known dead links by redirecting them through this service. It is hard to find a better solution for this task. --Boshomi (talk) 16:38, 24 July 2015 (UTC) Ping: User:Billinghurst Boshomi (talk) 16:49, 24 July 2015 (UTC)

Please notice Phab:T89586; while it is not fixed, it is not possible to find the links with the standard Special:LinkSearch. On dewiki we can use giftbot/Weblinksuche instead.--Boshomi (talk) 18:04, 24 July 2015 (UTC)
AFAICS derefer.unbubble.eu could be used to circumvent the SBL, is that correct? -- seth (talk) 21:30, 24 July 2015 (UTC)
I don't think so; the redirected URL is unchanged, so the SBL works as it does for archive URLs to the Internet Archive. --Boshomi (talk) 07:44, 25 July 2015 (UTC)
It is not a stored/archived page at archive.org, it is a redirect service, as clearly stated at the URL, and in that it obfuscates links. To describe it in any other way misrepresents the case, whether deWP uses it for good or not. We prevent abuseable redirects from other services due to the potential for abuse. You can consider whitelisting the URL in w:de:MediaWiki:spam-whitelist if it is a specific issue for your wiki.  — billinghurst sDrewth 10:09, 25 July 2015 (UTC)
What I wanted to say was that the SBL mechanism works in the same way as for web.archive.org/web: a blocked URL remains blocked even with the unbubble prefix prepended.--Boshomi (talk) 12:54, 25 July 2015 (UTC)
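The point that a blacklisted URL stays blocked even behind the derefer prefix holds because blacklist rules are unanchored substring matches. A sketch with a made-up blacklisted domain:

```python
import re

# Made-up blacklisted domain, for illustration only.
rule = re.compile(r"\bbadsite\.example\b")

direct = "http://badsite.example/page"
via_derefer = "http://derefer.unbubble.eu/?u=http://badsite.example/page"

# The rule is an unanchored substring match, so it fires either way:
print(bool(rule.search(direct)))       # True
print(bool(rule.search(via_derefer)))  # True
```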

Unblocking YouTube's redirection and nocookie domains



\byoutube\.com/.*(?:tqedszqxxzs|XePjp-H3TBI|khM48EQyVdc|A4jgXQQns8A|oVBOnv\-xrEY)\b

Just out of curiosity I checked this list, and the entries seem to be pretty obsolete:

https://www.voutube.com/watch?v=tqedszqxxzs Dieses Video ist nicht verfügbar.

https://www.voutube.com/watch?v=XePjp-H3TBI vitruvian man 1 - Leo's it text.wmv

https://www.voutube.com/watch?v=khM48EQyVdc Dieses Video ist nicht verfügbar. (2011 Hunter Mariner)

https://www.voutube.com/watch?v=A4jgXQQns8A Unknown and 44" Hunter Baker Street ceiling fans

https://www.voutube.com/watch?v=oVBOnv-xrEY 48" Hunter Summer Breeze ceiling fan
(^ replace voutube with youtube.) So could someone with write access please remove the whole line (as well as this entry here). Also, it is pretty strange how these came to be on the list. Are these just example entries? Djamana (talk) 23:52, 2 February 2016 (UTC)
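As a check, the rule quoted at the top of this thread can be exercised in Python (a sketch; the live extension matches the same regex against links as they are added):

```python
import re

rule = re.compile(
    r"\byoutube\.com/.*(?:tqedszqxxzs|XePjp-H3TBI|khM48EQyVdc"
    r"|A4jgXQQns8A|oVBOnv\-xrEY)\b")

# Listed video IDs are blocked via the youtube.com form...
print(bool(rule.search("https://www.youtube.com/watch?v=tqedszqxxzs")))  # True
# ...other videos on youtube.com are untouched...
print(bool(rule.search("https://www.youtube.com/watch?v=aaaaaaaaaaa")))  # False
# ...and the same video via the youtu.be short form slips past this rule,
# which is one reason redirector domains get blacklisted separately.
print(bool(rule.search("https://youtu.be/tqedszqxxzs")))  # False
```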





Apparently youtu(dot)be and youtube-nocookie(dot)com, both of which are official YouTube domains owned by Google, are on this blacklist. For over ten years, the SpamBlacklist MediaWiki extension has loaded this blacklist on third-party wikis, big and small. This is quite an issue for third-party sites such as ShoutWiki, a wiki farm, since SpamBlacklist doesn't currently have the concept of "shared" whitelists — blacklists can be shared (loaded from a remote wiki), whitelists cannot. Given that the main YouTube domain isn't blocked, and also that YouTube itself hands out youtu(dot)be links, I don't think that "but it's a redirecting service" is a valid argument against it, and therefore I'd like to propose removing these two entries from the blacklist. --Jack Phoenix (Contact) 23:17, 29 August 2015 (UTC)

There are several links to YouTube blacklisted here on Meta, as well as many, many on local wikis. YouTube has videos that get spammed, and there are videos that should simply not be linked to. Leaving the redirects open means that not only the youtube.com link needs to be blacklisted, but also all redirects to those links. That either gives extra work to the blacklisting editors or leaves an easy back-door open, and on wikis it leaves more material to check. Add to that that redirect services are simply never needed; there is an alternative. Additionally, Wikipedia has its built-in redirect service which also works (I mean templates, like {{youtube}}).
That there is no meta-analogue of the whitelist is a good argument to push that request of years ago to re-vamp the spam-blacklist system through and have the developers focus on features that the community wants, and certainly not an argument for me to consider not to blacklist something. Moreover, I do not think that the argument that it hampers third-party wikis is an argument either - they choose to use this blacklist, they could alternatively set up their own 'meta blacklist' that they use, copy-pasting this blacklist and removing what they do not want/need.
The problem exists internally as well: certain of our wikifarms do allow for certain spam which is however inappropriate on the rest of the wikifarms, and on the majority by far (in wiki-volume) of the wikis. That also needs a rewriting of the spam-blacklist system, which is crude and too difficult. A light-weight edit-filter variety, specialised on this, would be far more suitable. --Dirk Beetstra T C (en: U, T) 04:05, 30 August 2015 (UTC)
 Declined  — billinghurst sDrewth 06:22, 24 January 2016 (UTC)
youtu.be can only be used for youtube.com, so it's not a generic redirecting service; remove it from the blacklist. If you need to block a certain YT video (which I btw consider a little stupid), just update that entry to include youtube.com as well as youtu.be, that's it.
Djamana (talk) 20:08, 2 February 2016 (UTC)
@Djamana: Why do you consider the blocking of a specific YouTube video a little stupid? --Dirk Beetstra T C (en: U, T) 04:11, 3 February 2016 (UTC)
@Djamana: I did not see the above list earlier - of the 5 on the Meta spam blacklist there are still 3 active videos. Those 5 were abused, and in the one case where I was involved (still active), that was pretty persistent promotion. I doubt that these need to be removed. The two that are specifically not there anymore could indeed be removed (or maybe they need to be corrected ..), still leaving 3. Moreover, these are not the only rules blocking YouTube; many individual wikis also have specific YouTube videos blacklisted (and YouTube can be used to earn money (and those are known to circumvent the blacklist; even regulars do!), and there is information there that simply should NEVER be linked to ..). Again  Declined. --Dirk Beetstra T C (en: U, T) 06:28, 14 April 2016 (UTC)


Partial matches: <change.org> blocks <time-to-change.org.uk>





I tried to add a link to <time-to-change.org.uk>, and was told that I couldn't add the link, as <change.org> was blacklisted. Is this partial-match blacklisting (based, I guess, on an incorrect interpretation of URL specifications) a known bug? Cheers. --YodinT 15:46, 21 October 2015 (UTC)

This is more of a limitation of the regex: we tend to blacklist '\bchange\.org\b', but a '-' is also a 'word end' (the \b). I'll see if I can adapt the rule. --Dirk Beetstra T C (en: U, T) 07:46, 22 October 2015 (UTC)
change.org is not here, it is on en.wikipedia. That needs to be requested locally and then resolved there. --Dirk Beetstra T C (en: U, T) 07:48, 22 October 2015 (UTC)
Thanks for looking into this; is it worth replacing the regexes globally to fit URL specs? I'm sure I'm not the only one who will ever be/have been affected. --YodinT 11:27, 22 October 2015 (UTC)
@Yodin: Sorry, but there are no global regexes to replace; change.org is only blacklisted on en.wikipedia. You'll have to request a change on en:MediaWiki talk:Spam-blacklist (so there is a local request to do the change, then I or another en.wikipedia admin will implement it there). --Dirk Beetstra T C (en: U, T) 11:38, 22 October 2015 (UTC)
Thanks Dirk; just read this (sorry for the repeat on regexes there!). Isn't the main blacklist here also using '\bexample\.com\b'? I can come up with the general-case regex if you like! --YodinT 11:44, 22 October 2015 (UTC)
You mean to exclude the '<prefix>-' case for every rule (i.e. put '(?<!-)' before every rule in the list) - well, some of them are meant to catch all '<blah>-something.com' sites, so that is difficult. And then there are other combinations which sometimes catch as well. It is practically impossible to rule out every false positive. --Dirk Beetstra T C (en: U, T) 12:01, 22 October 2015 (UTC)
I see... much more complicated in practice than I thought. My idea was to apply it to a wider class of false positives, including the '<prefix>-' rule and more, by replacing "\b" with a regex rule which covers all and only the unreserved URI characters (upper & lowercase letters, decimal digits, hyphen, underscore, and tilde; with "dots" used in practice as delimiters). But this wouldn't cover the '<blah>-something.com' examples you gave, and having read some of the maintenance thread below which covers false positives, I won't try to press the issue! Maybe one day? Until then, I hope this goes well! Cheers for your work! --YodinT 12:26, 22 October 2015 (UTC)
@Yodin: If the foundation finally decides that it is time to solve some old Bugzilla requests (over other developments which sometimes find fierce opposition), among them the ones regarding an overhaul of the spam-blacklist system, then these would be nice 'feature requests' for that overhaul. In a way, stripping down the edit filter to pure regex matching 'per rule', with some other options added (having a regex apply to one page or a set of pages; having the regex excluded on one page only; having whitelist requests added to the blacklist rule they affect; whitelisting on one page or a set of pages, etc. etc.) would be a great improvement to this system. --Dirk Beetstra T C (en: U, T) 14:25, 23 October 2015 (UTC)
Closed nothing to do; a block at enWP, nothing global.  — billinghurst sDrewth 09:45, 22 November 2015 (UTC)
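The boundary behaviour discussed in this thread is easy to reproduce; a Python sketch of an en.wp-style rule and the '(?<!-)' fix mentioned above:

```python
import re

# '\b' also fires between '-' and a letter, so this rule has a false positive:
rule = re.compile(r"\bchange\.org\b")
print(bool(rule.search("http://time-to-change.org.uk/about")))  # True

# A negative lookbehind excludes the '<prefix>-' case:
fixed = re.compile(r"(?<!-)\bchange\.org\b")
print(bool(fixed.search("http://www.change.org/petitions")))     # True
print(bool(fixed.search("http://time-to-change.org.uk/about")))  # False
```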

non-ascii are not blocked?



I saw \bказино-форум\.рф\b on the page, so it's supposed to be blocked. However, I can link it: http://казино-форум.рф It seems that all non-ASCII links are able to avoid blocking.

On the Thai Wikipedia (where I am an admin), there are a lot of Thai URLs that we want to put on the local blacklist, but we couldn't for the very same reason. --Nullzero (talk) 17:42, 18 February 2016 (UTC)

This should go to Phab: quickly - that is a real issue. --Dirk Beetstra T C (en: U, T) 05:52, 21 February 2016 (UTC)
@Beetstra: Please see Phab:T28332. It seems that you need to put \xd0\xba\xd0\xb0\xd0\xb7\xd0\xb8\xd0\xbd\xd0\xbe-\xd1\x84\xd0\xbe\xd1\x80\xd1\x83\xd0\xbc\.\xd1\x80\xd1\x84 (without \b) instead of \bказино-форум\.рф\b --Nullzero (talk) 20:00, 21 February 2016 (UTC)
*sigh* somehow the workaround doesn't work with Thai characters, so I don't know whether \xd0\xba\xd0\xb0\xd0\xb7\xd0\xb8\xd0\xbd\xd0\xbe-\xd1\x84\xd0\xbe\xd1\x80\xd1\x83\xd0\xbc\.\xd1\x80\xd1\x84 will actually work or not. Please try it anyway... --Nullzero (talk) 20:24, 21 February 2016 (UTC)
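The failure mode can be reproduced in Python's bytes-mode regex engine, which mirrors the byte-level matching at issue (a sketch, not the extension's actual code path): `\b` needs an ASCII word character on one side, and the UTF-8 bytes of Cyrillic letters are not word characters, so the `\b`-wrapped rule never anchors.

```python
import re

url = "http://казино-форум.рф".encode("utf-8")

# With \b: the byte before the domain is '/' and the first domain byte is
# 0xD0; neither is an ASCII word character, so no boundary exists.
with_b = re.compile(rb"\b\xd0\xba\xd0\xb0\xd0\xb7\xd0\xb8\xd0\xbd\xd0\xbe-"
                    rb"\xd1\x84\xd0\xbe\xd1\x80\xd1\x83\xd0\xbc\."
                    rb"\xd1\x80\xd1\x84\b")
# Without \b (the Phab:T28332 workaround): a plain byte-string match.
without_b = re.compile(rb"\xd0\xba\xd0\xb0\xd0\xb7\xd0\xb8\xd0\xbd\xd0\xbe-"
                       rb"\xd1\x84\xd0\xbe\xd1\x80\xd1\x83\xd0\xbc\."
                       rb"\xd1\x80\xd1\x84")

print(with_b.search(url))           # None: \b cannot anchor next to UTF-8 bytes
print(without_b.search(url) is not None)  # True
```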

Free domain names









.ml, .ga, .cf, and .gq offer free domain names [1]. I'm sick of playing whack-a-mole with the TV show spam; is there anything else we can do? MER-C (talk) 13:38, 8 April 2016 (UTC)

@MER-C: these could easily be blacklisted, provided that there is not too much regular material that needs to be linked on those sites. What countries do these belong to? --Dirk Beetstra T C (en: U, T) 06:38, 20 April 2016 (UTC)
For .ml we have 1581 links on en.wikipedia. It looks like the majority of those are .org.ml and .gov.ml and similar (many used as references). --Dirk Beetstra T C (en: U, T) 06:41, 20 April 2016 (UTC)
.gq looks like low-hanging fruit: 21 links on en.wp, only half of which are in mainspace. MER-C (talk) 12:29, 2 May 2016 (UTC)
Less than half, I would say; however, some of those are 'official' (I see the registrar itself, and a university). Moreover, this blocks more than only en.wikipedia (though a quick check on other wikis does not enlarge the set of genuine links too much). If we write the rules so that the (large majority of the) currently used 'good' subdomains (on, say, the 5 major wikis) are excluded, I'll pull the trigger. --Dirk Beetstra T C (en: U, T) 13:11, 2 May 2016 (UTC)
@MER-C: are we getting much spam outside of enWP? If the spam is centred on enWP, can we try the local blacklist there initially?  — billinghurst sDrewth 05:11, 23 May 2016 (UTC)
The situation is mostly under control on enWP -- we now have a private abuse filter in front of the spam blacklist which works fairly well. They've now turned to spamming via Facebook, which is something I struggle to care about. Blocking these domains isn't necessary at this moment, but one sees parallels with the .tk and .co.nr situation. MER-C (talk) 07:10, 27 May 2016 (UTC)
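The "exclude the good subdomains, block the rest" idea mentioned in this thread can be sketched with a negative lookahead. This is purely hypothetical: "registry.gq" and "uni.gq" are invented placeholder hosts, not entries from any real rule.

```python
import re

# Hypothetical sketch: blacklist a free TLD while carving out known-good
# hosts. "registry.gq" and "uni.gq" are invented placeholders.
rule = re.compile(r"\b(?!(?:registry|uni)\.gq\b)[a-z0-9-]+\.gq\b")

print(bool(rule.search("http://free-spam-site.gq/offer")))  # True
print(bool(rule.search("http://registry.gq/")))             # False
```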

google and springer together



I do not understand regular expressions at all. On the Finnish Wikipedia, the following link was blacklisted:

https://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=2&ved=0ahUKEwj2y_u_p4zMAhWHKCwKHXfCDAgQFggkMAE
&url=http://www.springer.com/cda/content/document/cda_downloaddocument/9781441983893-c1.pdf
?SGWID%3D0-0-45-1149139-p174086675&usg=AFQjCNHKA4W5amgbAZGZXCgD5ZQy5tplQw&cad=rja 

Any ideas why? The IP in question has reported the problem on our local admin noticeboard, but I cannot help them. --Pxos (talk) 20:17, 13 April 2016 (UTC)

Pxos (talk · contribs), they should link directly to the Springer website. Google redirect links include link tracking. John Vandenberg (talk) 10:11, 19 April 2016 (UTC)
@Pxos: use http://www.springer.com/cda/content/document/cda_downloaddocument/9781441983893-c1.pdf - this link is copied from the search-result page of Google; it is not the actual link to the document. --Dirk Beetstra T C (en: U, T) 06:37, 20 April 2016 (UTC)
Closed: direct links usable, redirecting links blocked.  — billinghurst sDrewth 15:08, 23 May 2016 (UTC)
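As an aside, the advice above (use the direct Springer link rather than the Google wrapper) can be checked mechanically: the real destination of a google.com/url redirect is carried in its url= query parameter. A minimal sketch in Python; the short wrapped URL below is a hypothetical example in the same shape as the blocked link, not the actual one:

```python
from urllib.parse import urlparse, parse_qs

def unwrap_google_redirect(link):
    """Return the target carried in a google.com/url redirect's 'url'
    parameter, or the link unchanged if it is not such a redirect."""
    parsed = urlparse(link)
    if parsed.netloc.endswith("google.com") and parsed.path == "/url":
        target = parse_qs(parsed.query).get("url")
        if target:
            return target[0]
    return link

wrapped = "https://www.google.com/url?sa=t&url=http://www.springer.com/example-c1.pdf&usg=XYZ"
print(unwrap_google_redirect(wrapped))  # → http://www.springer.com/example-c1.pdf
```

Editors could paste the unwrapped result instead of the tracking link, which also avoids the blacklist hit.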

Escaping dot in myrtlebeach.com regex

  • [a-z]myrtlebeach.com\b

Should the . at the beginning of .com be escaped? – JonathanCross (talk) 15:01, 25 April 2016 (UTC)

@Billinghurst: looks like you added the regex in this revision. It was based on User:COIBot/XWiki/websitedesigninmyrtlebeach.com which suggests escaping the dot. – JonathanCross (talk) 14:49, 14 May 2016 (UTC)

The regex has been removed. Not sure there is any ongoing issue unless we get spam again.  — billinghurst sDrewth 03:45, 15 May 2016 (UTC)
Ah, great, thanks! – JonathanCross (talk) 15:31, 15 May 2016 (UTC)
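For the record, the concern was justified in principle: in a regex an unescaped `.` matches any character, so the unescaped pattern also matches strings where that position holds something other than a dot. A quick illustration with hypothetical host names:

```python
import re

unescaped = re.compile(r"[a-z]myrtlebeach.com\b")   # '.' matches any character
escaped   = re.compile(r"[a-z]myrtlebeach\.com\b")  # '\.' matches a literal dot only

# Both hit the real spammed domain:
print(bool(unescaped.search("websitedesigninmyrtlebeach.com")))  # True
print(bool(escaped.search("websitedesigninmyrtlebeach.com")))    # True
# Only the unescaped pattern also hits a host with no dot at all:
print(bool(unescaped.search("xmyrtlebeachxcom")))                # True (false positive)
print(bool(escaped.search("xmyrtlebeachxcom")))                  # False
```

In practice the collateral from an unescaped dot in this position is tiny, but escaping is the safer habit.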

Discussion

This section is for discussion of spam blacklist issues among users.

Expert maintenance

One (soon to be) archived and rejected removal suggestion was about jxlalk.com, matched by a filter intended to block xlalk.com. One user suggested that this side effect might be as it should be, another user suggested that regular expressions are unable to distinguish these cases, and nobody has a clue when and why xlalk.com was blocked. I suggest finding an expert maintainer for this list, and removing all blocks older than 2010. The bots identifying abuse will restore still-needed ancient blocks soon enough, hopefully without any oogle-matching-google cases. –Be..anyone (talk) 00:50, 20 January 2015 (UTC)

No; removing some of the old rules, before 2010 or even before 2007, will result in further abuse. Some of the rules are intentionally wide so as to stop a wide range of spamming behaviour, and, as I have argued as well, I have 2 cases on my en.wikipedia list where companies have been spamming for over 7 years, have some of their domains blacklisted, and are still actively spamming related domains. Every single removal should be considered on a case-by-case basis. --Dirk Beetstra T C (en: U, T) 03:42, 20 January 2015 (UTC)
Just to give an example of this: redirect sites have been, and are, actively abused to circumvent the blacklist. Some of those were added before the arbitrary date of 2010. We are not going to remove those under the blanket of 'having been added before 2010'; they will stay blacklisted. Some other domains are of similar gravity, such that they should never be removed. How are you, reasonably, going to filter out the rules that should never be removed? --Dirk Beetstra T C (en: U, T) 03:52, 20 January 2015 (UTC)
By the way, you say ".. intended to block xlalk.com .." .. how do you know? --Dirk Beetstra T C (en: U, T) 03:46, 20 January 2015 (UTC)
I know that nobody would block icrosoft.com if what they mean is microsoft.com, or vice versa. It's no shame to have no clue about regular expressions, a deficit we apparently share. :tongue: –Be..anyone (talk) 06:14, 20 January 2015 (UTC)
I am not sure what you are referring to - I am not a native speaker of regex, but proficient enough. The rule was added to block, at least, xlale.com and xlalu.com (if it were ONLY these two, \bxlal(u|e)\.com\b or \bxlal[ue]\.com\b would have been sufficient, but it is impossible to find this far back what all was spammed; possibly xlali.com, xlalabc.com and abcxlale.com were abused by these proxy-spammers). --Dirk Beetstra T C (en: U, T) 08:50, 20 January 2015 (UTC)
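The trade-off described here can be made concrete. A narrow rule only hits the domains known to have been spammed; a wide rule also catches later variants, at the price of collateral such as jxlalk.com. The wide pattern below is a hypothetical reconstruction (borrowed from the log-search expression used later in this thread), not the actual historical rule:

```python
import re

# Narrow rule: exactly the two domains known to have been spammed.
narrow = re.compile(r"\bxlal[ue]\.com\b")
# Wide rule (hypothetical): a free middle part and no leading \b anchor,
# so it also matches inside longer host names such as jxlalk.com.
wide = re.compile(r"xlal[0-9a-z-]*\.com")

for host in ["xlale.com", "xlalu.com", "xlalabc.com", "jxlalk.com"]:
    print(host, bool(narrow.search(host)), bool(wide.search(host)))
```

The last two hosts show the difference: the wide rule catches the plausible spam variant xlalabc.com, but also the unrelated jxlalk.com that prompted the removal request.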
xlalk.com may have been one of the cases, but one rule that was blacklisted before this blanket was imposed was 'xlale.com' (the xlale.com rule was removed in a cleanout session after the blanket was added). --Dirk Beetstra T C (en: U, T) 04:45, 20 January 2015 (UTC)
The dots in administrative domains and DNS mean something: notably, foo.bar.example is typically related to an administrative bar.example domain (ignoring well-known exceptions like co.uk etc.; Mozilla+SURBL have lists for this), while foobar.example has nothing to do with bar.example. –Be..anyone (talk) 06:23, 20 January 2015 (UTC)
I know, but I am not sure how this relates to this suggested cleanup. --Dirk Beetstra T C (en: U, T) 08:50, 20 January 2015 (UTC)
If your suggested clean-ups at some point don't match jxlalk.com, the request by a Chinese user would be satisfied. As noted, all I found out is a VirusTotal "clean"; it could still be a spam site, if it ever was a spam site.
The regexp could begin with "optionally any string ending with a dot" or similar before xlalk. There are "host name" RFCs (LDH: letter digit hyphen) up to IDNAbis (i18n domains); they might contain recipes. –Be..anyone (talk) 16:56, 20 January 2015 (UTC)
What suggested cleanups? I am not suggesting any cleanup or blanket removal of old rules. --Dirk Beetstra T C (en: U, T) 03:50, 21 January 2015 (UTC)
Of course I'm not sure. There is no issue of bad faith. He had reason to use regex, for two sites, and possibly suspected additional minor changes would be made. But he only cited two sites. One of the pages was deleted, and apparently has IP evidence on it, which might lead to other evidence from other pages, including cross-wiki. But the blacklistings themselves were clearly based on enwiki spam and nothing else was mentioned. This blacklist was the enwiki blacklist at that time. After enwiki got its own blacklist, the admin who blacklisted here attempted to remove all his listings. This is really old and likely obsolete stuff. --Abd (talk) 20:07, 21 January 2015 (UTC)
3 at least. And we do not have to present a full case for blacklisting (we often don't, per en:WP:BEANS and sometimes privacy concerns); we have to show sufficient abuse that needs to be stopped. And if that deleted page was mentioned, then certainly there was reason to believe that there were cross-wiki concerns.
Obsolete? How do you know? Did you go through the cross-wiki logs of what was attempted to be spammed? Do you know how often some of the people active here are still blacklisting spambots using open proxies? Please stop with these sweeping statements until you have fully searched for all evidence. 'After enwiki got its own blacklist, the admin who blacklisted here attempted to remove all his listings' - no, that is not what happened. --Dirk Beetstra T C (en: U, T) 03:16, 22 January 2015 (UTC)
Hi!
I searched all the logs (Special:Log/spamblacklist) of several wikis using the regexp entry /xlal[0-9a-z-]*\.com/.
There were almost no hits:
w:ca: 0
w:ceb: 0
w:de: 0
w:en: 1: 20131030185954, xlalliance.com
w:es: 1: 20140917232510, xlalibre.com
w:fr: 0
w:it: 0
w:ja: 0
w:nl: 0
w:no: 0
w:pl: 0
w:pt: 0
w:ru: 0
w:sv: 0
w:uk: 0
w:vi: 0
w:war: 0
w:zh: 1: 20150107083744, www.jxlalk.com
So there was just one single hit at w:en (not even in the main namespace, but in the user namespace), one at w:es, and one at w:zh (probably a false positive). So I agree with user:Abd that removing this entry from the sbl would be the best solution. -- seth (talk) 18:47, 21 February 2015 (UTC)
Finally, an argument based on evidence (these logs should be public, not admin-only - can we have something like this in a search engine? It may come in handy in some cases!). Consider it removed. --Dirk Beetstra T C (en: U, T) 06:59, 22 February 2015 (UTC)
By the way, Seth, this is actually no hits at all - all three you show here are collateral. Thanks for this evidence; this information would be useful on more occasions to make an informed decision (also, vide infra). --Dirk Beetstra T C (en: U, T) 07:25, 22 February 2015 (UTC)
I am not sure that we want the Special page to be public, though I can see some value in having something at ToolLabs available to run queries, or something available to be run through quarry.  — billinghurst sDrewth 10:57, 22 February 2015 (UTC)
Why not public? There is no reason to hide this; this is not BLP- or COPYVIO-sensitive information in 99.99% of the hits. The chance that this is non-public information is just as big as for certain blocks to be BLP violations (and those are visible) ... --Dirk Beetstra T C (en: U, T) 04:40, 23 February 2015 (UTC)

Now restarting the original debate

As the blacklist is long, and likely contains rules that cast too wide a net and are so old that they are utterly obsolete (or may even be causing collateral damage on a regular basis), can we see whether we can set up some criteria (that can be 'bot-tested'):

  1. Rule added > 5 years ago.
  2. All hits (determined on a significant number of wikis) over the last 2 years (for now: since the beginning of the log = ~1.5 years) are collateral damage - NO real hits.
  3. Site is not a redirect site (should not be removed, even if not abused), is not a known phishing/malware site (to protect others), and is not a true copyright-violating site. (This is hard to bot-test; we may need someone to look over the list and take out the obvious ones.)

We can make some mistakes on old rules if they are not abused (remove some that actually fail #3) - if they become a nuisance/problem again, we will see them again, and they can be speedily re-added .. thoughts? --Dirk Beetstra T C (en: U, T) 07:25, 22 February 2015 (UTC)
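A bot test for these three criteria could look roughly like the sketch below. All inputs (rule ages, per-rule hit counts, the set of protected redirect/phishing/copyvio rules, and the toy rule names) are hypothetical; a real bot would have to assemble them from the blacklist log and manual review:

```python
from datetime import datetime, timedelta

def removal_candidates(rules, added, real_hits, protected, now=None):
    """Apply the three cleanup criteria to a set of blacklist rules.

    added     -- dict: rule -> datetime when the rule was added (criterion 1)
    real_hits -- dict: rule -> count of non-collateral log hits (criterion 2)
    protected -- rules that are redirect/phishing/copyvio sites (criterion 3)
    """
    now = now or datetime.now()
    cutoff = now - timedelta(days=5 * 365)
    return [
        r for r in rules
        if added.get(r, now) < cutoff      # 1: added > 5 years ago
        and real_hits.get(r, 0) == 0       # 2: no real hits, only collateral
        and r not in protected             # 3: not a special-case site
    ]

# Toy data: one old unused rule qualifies; the others each fail one criterion.
now = datetime(2015, 2, 22)
rules = ["\\boldspam\\.com\\b", "\\btinyurl\\.com\\b", "\\bactivespam\\.com\\b"]
added = {r: datetime(2007, 1, 1) for r in rules}
print(removal_candidates(rules, added, {"\\bactivespam\\.com\\b": 12},
                         {"\\btinyurl\\.com\\b"}, now=now))
```

The manual part of criterion 3 stays manual; the sketch only encodes the decisions once they have been made.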

@Hoo man: you have worked on clean-up before; some of your thoughts would be welcomed.  — billinghurst sDrewth 10:53, 22 February 2015 (UTC)
Doing this kind of clean-up is rather hard to automate. What might work better for starters could be removing rules that didn't match anything since we started logging hits. That would presumably cut down the whole blacklist considerably. After that we could re-evaluate the rest of the blacklist, maybe following the steps outlined above. - Hoo man (talk) 13:33, 22 February 2015 (UTC)
Not hitting anything is dangerous .. there are likely some somewhat obscure redirect sites on it which may not have been attempted to be abused (though those too could be re-added). But we could do test runs easily - just save a cleaned-up copy of the blacklist elsewhere, diff it against the current list, and see what would get removed.
Man, I want this showing up in the RC feeds; then LiWa3 could store them in the database (and follow redirects to show what people wanted to link to ..). --Dirk Beetstra T C (en: U, T) 03:30, 23 February 2015 (UTC)
Hi!
I created a table of hits of blocked link additions. Maybe it's of use for the discussion: User:lustiger_seth/sbl_log_stats (1.8 MB wiki table).
I'd appreciate it if we deleted old entries. -- seth (talk) 22:12, 26 February 2015 (UTC)
Hi, thank you for this; it gives a reasonable idea. Do you know if the rule hits were all 'correct' (for those that do show they were hit) or mainly/all false positives? (If they are hitting false positives, we could, based on this, also decide to tighten the rule to avoid them.) Rules with all-0 (can you include a 'total' score?) would certainly be candidates for removal (though still determine first whether they are 'old' and/or are no-no sites before removal). I am also concerned that this does not include other wikifarms - some sites may be problematic on other wikifarms, or hit a large number of smaller wikis (which have less control due to low admin numbers). --Dirk Beetstra T C (en: U, T) 03:36, 8 March 2015 (UTC)
Hi!
We probably can't get information about false positives automatically. I added a 'sum' column.
Small wikis: if you give me a list of the relevant ones, I can create another list. -- seth (talk) 10:57, 8 March 2015 (UTC)
Thanks for the sum column. Regarding the false positives, it would be nice to be able to quickly see what actually got blocked by a certain rule. I agree that that then needs a manual inspection, but the actual number of rules with zero hits on the intended stuff to be blocked is likely way bigger than what we see.
How would you define the relevant small wikis - depending on the link that was spammed? Probably the best approach is to parse all ~750 wikis, make a list of rules with 0 hits and a separate list of rules with <10 hits (including there the links that were blocked), and exclude everything above that. Then these resulting rules should be filtered to those added >5 years ago. That narrows down the list for now, and after a check for obvious no-no links, those could almost be blanket-removed (just excluding the ones with real hits, the obvious redirect sites and others - which needs a manual check). --Dirk Beetstra T C (en: U, T) 06:59, 9 March 2015 (UTC)
Hi!
At User:Lustiger_seth/sbl_log_stats/all_wikis_no_hits there's a list containing ~10k entries that never triggered the sbl between 2013-Sep and 2015-Feb anywhere (if my algorithm is correct).
If you want to get all entries older than 5 years, it should be sufficient to use only the entries in that list up to (and including) \bbudgetgardening\.co\.uk\b.
So we could delete ~5766 entries. What do you think? Shall we give it a try? -- seth (talk) 17:06, 18 April 2015 (UTC)
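Because the blacklist grows chronologically at the bottom, "every entry older than 5 years" reduces to "the leading slice up to a marker entry", as suggested above. A sketch with toy rules standing in for the real ~10k-entry no-hits list (only the marker entry is taken from this thread):

```python
def entries_up_to(entries, marker):
    """Return the leading slice of a chronologically ordered blacklist,
    up to and including the marker entry; empty if the marker is absent."""
    try:
        return entries[: entries.index(marker) + 1]
    except ValueError:
        return []

# Toy data: two hypothetical rules around the real marker entry.
rules = [r"\boldspam\.com\b", r"\bbudgetgardening\.co\.uk\b", r"\bnewerspam\.com\b"]
old_enough = entries_up_to(rules, r"\bbudgetgardening\.co\.uk\b")
print(old_enough)  # the first two entries; the newer rule is excluded
```

This only works while the list stays append-only; entries inserted out of order would silently end up in the wrong slice.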
The question is, how many of those are still-existing redirect sites etc.? Checking 5800 is quite a job. On the other hand, with LiWa3/COIBot detecting, it is quite easy to re-add them. --Dirk Beetstra T C (en: U, T) 19:28, 21 April 2015 (UTC)
According to the last few lines, I've removed 124 kB of non-hitting entries now. I did not remove all of them, because some were URL shorteners and I guess that they are a special case, even if not used yet. -- seth (talk) 22:25, 16 September 2015 (UTC)

Blacklisting spam URLs used in references

Looks like a site is using the "references" section as a spam farm. If a site is added to this list, can the blacklist block the spam site? Raysonho (talk) 17:45, 5 September 2015 (UTC)

Yes, it can. --AldNonymous (Bicara?) 21:56, 5 September 2015 (UTC)
Thanks, Aldnonymous! Raysonho (talk) 00:07, 6 September 2015 (UTC)

url shorteners

Hi!
IMHO the URL shorteners should be grouped in one section, because they are a special group of URLs that need special treatment. A URL shortener should not be removed from the sbl unless the domain is dead, even if it has not been used for spamming, right? -- seth (talk) 22:11, 28 September 2015 (UTC)

It would be beneficial to have them in a section. The problem is, most of them are added by script, and are hence just put at the bottom. --Dirk Beetstra T C (en: U, T) 04:51, 4 October 2015 (UTC)
Maybe it would be preferable to have "spam blacklist" be a compilation file, made of files, one of which would be "spam blacklist.shorteners".  — billinghurst sDrewth 12:15, 24 December 2015 (UTC)
This seems like a nice idea. It would certainly help with cleaning it up (which we don't do nowadays). IIRC, it is technically possible to have different spam blacklist pages, so this is technically possible; it just needs an agreement among us and someone to do it. --Glaisher (talk) 12:17, 24 December 2015 (UTC)

@Beetstra, Lustiger seth, Glaisher, Vituzzu, MarcoAurelio, Hoo man, and Legoktm: and others. What are your thoughts on a concatenation of files as described above? If we have a level of agreement, then we can work out the means to an outcome.  — billinghurst sDrewth 12:39, 25 January 2016 (UTC)

  • I am somewhat in favour of this - split the list into a couple of sublists: one for URL shorteners, one for 'general terms' (mainly at the top of the list currently), and the regular list. It would however need an adaptation of the blacklist script (I've done something similar for en.wikipedia (a choice of blacklisting or revertlisting for each link); I could give that hack a try here, time permitting). AFAIK the extension in the software is capable of handling this. Also, it would be beneficial for the cleanout work if the blacklist itself were 'sectioned' into years. Although being 8 years old is by no means a reason to expect that the spammers are not here anymore (I have two cases on en.wikipedia that are older than that), we do tend to be more lenient with the old stuff. (On the other hand .. why bother .. the benefits are mostly on our side, so we don't accidentally remove stuff that should be solved by other means.) --Dirk Beetstra T C (en: U, T) 13:05, 25 January 2016 (UTC)
Is it really possible to have different spam blacklist pages? What would happen to the sites that use this very list to block unwanted spam? —MarcoAurelio 14:23, 25 January 2016 (UTC)
It is technically possible. But this would mean that if we move all the URL shortener entries to a new page, all sites currently using it would have to update the extension or explicitly add the new blacklist to their config, or these links would be allowed on their sites (and notifying all these wikis about this breaking change is next to impossible). Another issue I see is that a new blacklist file means there would be a separate network request on cache miss, so there might be a little delay in page saves (but I'm not sure whether this delay would be noticeable). --Glaisher (talk) 15:38, 25 January 2016 (UTC)
Hi!
Before we activate such a feature, we should update some scripts that don't know anything about sbl subpages yet.
Apart from that, I don't think that sectioning into years would be of much use. One can use the (manual) log for this. A subject-oriented sectioning could be of more use, but it would also be more difficult for us. -- seth (talk) 20:49, 27 January 2016 (UTC)

Unreadable

Why is the list not alphabetical, so I can look up whether a certain site is listed and then also look up when it was added? --Corriebertus (talk) 08:55, 21 October 2015 (UTC)

hi!
there are advantages and disadvantages to an alphabetical list. for example, it would be very helpful to group all url shorteners in one place (see discussion thread above). sometimes it's better to have a chronological list. in addition, regexps can't really be sorted domain-alphabetically.
if you want to search the blacklist, you can use a tool like https://tools.wmflabs.org/searchsbl/. -- seth (talk) 17:16, 30 October 2015 (UTC)
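The point that regexp entries have no natural alphabetical order is why lookup tools such as the one linked above test a candidate URL against every rule rather than bisecting a sorted list. A minimal version of such a lookup, with a hypothetical three-rule blacklist (only the xlal pattern echoes one discussed earlier on this page):

```python
import re

# Hypothetical blacklist entries, one regexp per line as on the real list:
blacklist = [r"\bexample-pills\.com\b", r"xlal[0-9a-z-]*\.com", r"\btinyurl\.com\b"]

def matching_rules(url, rules):
    """Return every blacklist rule that would block the given URL."""
    return [rule for rule in rules if re.search(rule, url)]

print(matching_rules("http://www.jxlalk.com/page", blacklist))  # → ['xlal[0-9a-z-]*\\.com']
```

A linear scan over every rule is exactly what the SpamBlacklist extension does conceptually, which is also why the list's order is free to stay chronological.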