
Talk:Spam blacklist: Difference between revisions

From Meta, a Wikimedia project coordination wiki
Content deleted Content added
Line 202: Line 202:


:{{Added}}. --[[User:Trijnstel|<font color="#064EA3" face="Verdana" size="2">Trijnstel</font>]]<sub>[[User talk:Trijnstel|<font color="#000000">talk</font>]]</sub> 22:13, 27 March 2012 (UTC)

=== alturl.com related domains ===
<s>{{tlx|LinkSummary|2=alturl.com}}</s> already added

<s>{{tlx|LinkSummary|2=shorturl.com}}</s> already added
{{LinkSummary|2ya.com}}
{{LinkSummary|vze.com}}
{{LinkSummary|24ex.com}}
{{LinkSummary|hitart.com}}
{{LinkSummary|mirrorz.com}}
{{LinkSummary|filetap.com}}
<s>{{tlx|LinkSummary|2=funurl.com}}</s> already added
{{LinkSummary|dealtap.com}}
{{LinkSummary|bigbig.com}}
{{LinkSummary|ebored.com}}
{{LinkSummary|hereweb.com}}
{{LinkSummary|1sta.com}}
{{LinkSummary|echoz.com}}
{{LinkSummary|2truth.com}}
{{LinkSummary|2fortune.com}}
{{LinkSummary|2hell.com}}
{{LinkSummary|2tunes.com}}
{{LinkSummary|2savvy.com}}
{{LinkSummary|2fear.com}}
{{LinkSummary|2freedom.com}}
{{LinkSummary|antiblog.com}}
Series of URL shorteners. Maybe you can simplify with \b2(hell|truth|fortune|tunes|savvy|fear|freedom)\.com instead of listing them all.

[[User:PiRSquared17|<b style="color:#f90;font-family:Arial">πr<sup>2</sup></b>]] ([[User talk:PiRSquared17|<i style="color:#0f3;font-family:Arial">'''t'''</i>]] • [[Special:Contributions/PiRSquared17|<i style="color:#03f;font-family:Arial">'''c'''</i>]]) 00:27, 28 March 2012 (UTC)


== Proposed additions (Bot reported) ==

Revision as of 00:28, 28 March 2012

Shortcut:
WM:SPAM
WM:SBL
The associated page is used by the MediaWiki Spam Blacklist extension, and lists regular expressions which cannot be used in URLs in any page in Wikimedia Foundation projects (as well as many external wikis). Any meta administrator can edit the spam blacklist, either manually or with SBHandler. For more information on what the spam blacklist is for, and the processes used here, please see Spam blacklist/About.
Proposed additions
Please provide evidence of spamming on several wikis. Spam that only affects a single project should go to that project's local blacklist. Exceptions include malicious domains and URL redirector/shortener services. Please follow this format. Please check back after submitting your report, as there could be questions regarding your request.
Proposed removals
Please check our list of requests which repeatedly get declined. Typically, we do not remove domains from the spam blacklist in response to site-owners' requests. Instead, we de-blacklist sites when trusted, high-volume editors request the use of blacklisted links because of their value in support of our projects. Please consider whether requesting whitelisting on a specific wiki for a specific use is more appropriate - that is very often the case.
Other discussion
Troubleshooting and problems - If there is an error in the blacklist (i.e. a regex error) which is causing problems, please raise the issue here.
Discussion - Meta-discussion concerning the operation of the blacklist and related pages, and communication among the spam blacklist team.
#wikimedia-external-links (connect) - Real-time IRC chat for co-ordination of activities related to maintenance of the blacklist.

Please sign your posts with ~~~~ after your comment. This leaves a signature and timestamp so conversations are easier to follow.


Completed requests are marked as {{added}}/{{removed}} or {{declined}}, and are generally archived (search) quickly. Additions and removals are logged · current log 2024/07.

snippet for logging
{{sbl-log|3599089#{{subst:anchorencode:SectionNameHere}}}}


Proposed additions

This section is for proposing that a website be blacklisted; add new entries at the bottom of the section, using the basic URL so that there is no link (example.com, not http://www.example.com). Provide links demonstrating widespread spamming by multiple users on multiple wikis. Completed requests will be marked as {{added}} or {{declined}} and archived.

Google redirect spam

Note : This section won't be automatically archived by the bot



Specifically 'google.com/url?'

See http://en.wikipedia.org/w/index.php?title=Wikipedia_talk:External_links&oldid=456669797#Google_redirection_URLs

Explanation:

The first result reads:

[PDF]Public Law 105-298

www.copyright.gov/legislation/pl105-298.pdf

File Format: PDF/Adobe Acrobat - Quick View

PUBLIC LAW 105–298—OCT. 27, 1998. Public Law 105–298. 105th Congress. An

Act. To amend the provisions of title 17, United States Code, with respect to ...

If you right-click on the bolded name of the first result (on 'Public Law 105-298'), and copy the url, you get:

  • http:// www.google.com/url?sa=t&rct=j&q=public%20law%20105-298&source=web&cd=1&ved=0CB4QFjAA&url=http%3A%2F%2Fwww.copyright.gov%2Flegislation%2Fpl105-298.pdf&ei=vmahTvikEoib-gadiZGuBQ&usg=AFQjCNH95AzJoEKz83KrtpLkLXENeJ3Njw&sig2=I_64kGBITluwmGNvw619Cg

Which is how these URLs end up here, and which can be used to circumvent the blacklist. --Dirk Beetstra T C (en: U, T) 13:02, 21 October 2011 (UTC)

(which are all three meta blacklisted sites). --Dirk Beetstra T C (en: U, T) 13:08, 21 October 2011 (UTC)

(unarchived)
I need some help here, please. This is apparently a problem for all TLDs of Google. See en:WT:EL#Google_redirection_URLs.
'google.*?\/url\?' ??
--Dirk Beetstra T C (en: U, T) 10:04, 24 October 2011 (UTC)
Maybe this is a bug in the extension? "redtube.com" was a part of the url you tested here. -- 86.159.93.124 17:36, 24 October 2011 (UTC) (seth, not logged in) -- seth 20:17, 1 November 2011 (UTC)
That is what also occurred to me .. anyway it needs blocking as redirect sites should not be used - even when the target is not blacklisted (yet). --Dirk Beetstra T C (en: U, T) 08:29, 25 October 2011 (UTC)
Guys, calm down. This is blocking a very small number of links (a couple of hundred), not the whole of Google. Many regular editors are NOT going to include these links. Normal google links do NOT include the /url? part, there is no need to link there, and like with the other google loophole (which was abused), this is waiting to be abused (if it has not yet been abused). This is not 'making pages impossible to edit' - it makes it impossible to ADD a link, this is not 'screw[ing] with lots of pages' (as I said, just a couple of hundred), bots can't solve this (if it is used to circumvent blacklisting, then the bot can't repair the link anyway), etc. etc. Have a look at what I have been suggesting and what the problem actually is before making such sweeping comments. Thanks. --Dirk Beetstra T C (en: U, T) 07:06, 26 October 2011 (UTC)
Yes, I already realized that you were not blocking all of Google. I made my above objections fully knowing the exact scope of this blacklisting, and still stand by the fact that this solution is overkill and causes more problems than it solves. I will concur that this eliminates the problem you note. It is not, however, a proper solution in that it also prevents good uses of the google.com/url linking. There are perfectly valid methods to stop this abuse; as noted above, someone is already working out a bot solution. The issue here, Beetstra, isn't that you have solved a problem, it's that you have refused to consider alternate solutions which could have far less collateral damage. Your attitude of "I have done this, and you all have to just live with the negative consequences because that's that" isn't terribly helpful. People here have suggested, and are working on, a way to fix this problem in softer ways, and it would be beneficial to try these before merely deciding that your solution is final and cannot be reconsidered, merely because you decided to do it. --Jayron32 15:02, 26 October 2011 (UTC)
What, exactly, would be an example of "good uses of the google.com/url linking"? Anomie 20:24, 26 October 2011 (UTC)
As a quick note, there's really no "good uses" - any use of this link can be seamlessly replaced by a link to the target URL. I don't believe it's been used to any significant extent to avoid the blacklist, but that isn't my major concern - having these URLs as external links means that any time a reader follows them, we're handing off some amount of their reading history to Google, which is a definite contravention of the spirit of the privacy policy if not the letter of it. Shimgray 21:34, 26 October 2011 (UTC)
Jayron32 - that is a pretty blunt statement that you make. You blatantly say that I did not consider other methods of stopping this. First, there is not a single reason to link to a google/url? link. They are redirects, you can link to the real link. Your argument is just saying that there are also good reasons to link to bit.ly or any other redirect site - there are NONE.
Regarding other solutions, I considered:
  • The AbuseFilter - which clearly should be cross-wiki one, since this is a cross-wiki issue
    • Flagging only - as if a spammer would care, they just save (but well, at least people may notice)
    • Blocking - which is just the same as the blacklist.
  • XLinkBot - currently only activated for en.wikipedia.
But as I said elsewhere and here again - this simply should never be linked, there is never a reason. And what other solutions did you have in mind? --Dirk Beetstra T C (en: U, T) 09:11, 27 October 2011 (UTC)
  • (EC) Concur with blacklist; my only suggestion, if it's a real problem to users, is to lift the block for a short time to give time for bots to be readied for all projects. I'm not sure but it sounds like some people may be confused. For clarity, Google is not blacklisted. You can still link to google.com itself or Google search results like [1]. What is blacklisted is www.google.com/url? . The reason is that this functions as a redirect. I can't see any reason why they should ever be on Wikipedia (they are simple redirects, they don't allow you to view the cache or something if the page is down); they mostly happen by accident when people copy the links of Google search results. They add another point of failure (Google) and also may lead to confusion (people thinking the site they're going to is Google and so trustworthy, see for example the previously mentioned search results) and also mean people are forced to go through Google to visit the external link (allowing Google to collect their data). However, as made clear here, the primary reason they were blocked is that they can be abused, as anyone can use them to link to spam sites, overriding the blacklist. Nil Einne 07:11, 26 October 2011 (UTC)

Unarchived again. Still needs to be solved. --Dirk Beetstra T C (en: U, T) 09:52, 1 November 2011 (UTC)

I am going to change the rule to 'google\.[^?#]*\/url\?'. --Dirk Beetstra T C (en: U, T) 11:12, 1 November 2011 (UTC)

Needed to use '\bgoogle\..*?\/url\?' - '\bgoogle\.[^?#]*\/url\?' was not accepted by the blacklist. Testing if other Google links still work: http://www.google.com/search?hl=en&q=Google+Arbitrary+URL+Redirect+Vulnerability. --Dirk Beetstra T C (en: U, T) 11:18, 1 November 2011 (UTC)

Try '\bgoogle\.[^?\x23]*\/url\?'; the blacklist is choking on trying to interpret the literal "#" character as the start of a comment. Escaped, it works fine on my local test installation of MediaWiki. Note that '\bgoogle\..*?\/url\?' will block a URL like http://www.google.com/search?q=Google+/url?+Redirect, as unlikely as that is to occur. Anomie 14:25, 1 November 2011 (UTC)
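The difference between the two rules can be checked with a small sketch. It uses Python's `re` module rather than the PHP regex engine behind the extension, so it only illustrates the matching behaviour; the helper name `is_blocked` and the shortened redirect URL are assumptions made for this example.

```python
import re

# The two candidate rules quoted above. \x23 is "#"; in this sketch the
# escape is kept only to mirror the rule as proposed.
LAZY = re.compile(r"\bgoogle\..*?\/url\?")
STRICT = re.compile(r"\bgoogle\.[^?\x23]*\/url\?")

def is_blocked(rule, url):
    """Return True if the rule matches anywhere in the URL."""
    return rule.search(url) is not None

# Shortened form of a redirect URL, and the search URL from Anomie's comment.
redirect = "http://www.google.com/url?sa=t&url=http%3A%2F%2Fwww.copyright.gov%2F"
search = "http://www.google.com/search?q=Google+/url?+Redirect"

for name, rule in (("lazy", LAZY), ("strict", STRICT)):
    print(name, is_blocked(rule, redirect), is_blocked(rule, search))
```

The lazy `.*?` may run past the first "?" into the query string, which is why it also catches the search URL; the character class stops at the first "?" or "#", so only a path ending in /url? can match.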
Hi!
What about '\bgoogle\.[a-z]{2,4}/url\?' ? -- seth 16:01, 1 November 2011 (UTC)
That wouldn't catch domains like google.com.au, or paths like http://www.google.com/m/url?.... Anomie 17:05, 1 November 2011 (UTC)
Hmm, ok. So which urls have to be blocked exactly? What is this google.com/m/-thing? If these were the only exceptions, \bgoogle(?:\.com)?\.[a-z]{2,4}(?:/m)?/url\? would do.
The Abuse Filter could be a helpful compromise, but it still can't be used globally, am I right? Did anybody open a ticket at bugzilla already? -- seth 20:17, 1 November 2011 (UTC)
Basically, what needs to be caught are all google urls (all tlds) where the path ends in /url? - the normal form would hence be 'google.com/url?', but also 'google.com.au/url?' and 'google.at/url?' - and long forms are e.g. 'google.<tld>/archivesearch/url?'. For a full list of links that have been added (it is not necessarily exhaustive, there may be even more possible) see the post of Anomie in en:Wikipedia_talk:EL#Google_redirection_URLs.
A global filter may be an idea as an alternative, but if it is set to blocking it will have the same effect anyway (though it could be more specific, since the message could be made informative for specific redirects and how to avoid them) - if set to notify it is probably futile when people start to abuse it (except that we would then notice). There simply is no need to have it, just follow the link (which I hope one needs to do anyway since I hope that people read the document they want to link to), and then copy it from the address bar of your browser. --Dirk Beetstra T C (en: U, T) 08:56, 2 November 2011 (UTC)
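As a rough check of the candidate rules from this thread against the URL forms just listed, here is a minimal sketch in Python. The `q=x` query strings, the dictionary keys and the `coverage` helper are made up for illustration, and Python's `re` stands in for the extension's own regex engine.

```python
import re

# Candidate rules quoted in the thread (labels are hypothetical).
CANDIDATES = {
    "tld_only": r"\bgoogle\.[a-z]{2,4}/url\?",
    "com_tld_m": r"\bgoogle(?:\.com)?\.[a-z]{2,4}(?:/m)?/url\?",
    "any_path": r"\bgoogle\.[^?\x23]*\/url\?",
}

# URL forms mentioned above, each with a placeholder query string.
URLS = [
    "http://www.google.com/url?q=x",
    "http://www.google.com.au/url?q=x",
    "http://www.google.at/url?q=x",
    "http://www.google.com/m/url?q=x",
    "http://www.google.com/archivesearch/url?q=x",
]

def coverage(rule):
    """Return one boolean per entry in URLS: does the rule match it?"""
    compiled = re.compile(rule)
    return [compiled.search(url) is not None for url in URLS]

for label, rule in CANDIDATES.items():
    print(label, coverage(rule))
```

Under these assumptions only the rule that allows an arbitrary path before /url? covers the long forms such as /archivesearch/url?.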
Hi!
I see a big advantage in blocking urls with adapted messages, so that users can modify their link without being surprised about alleged spamming. However, there is still no global AF, is there?
I opened a ticket now: bugzilla:32159. -- seth 22:45, 2 November 2011 (UTC)
(unarchived) -- seth 08:42, 5 November 2011 (UTC)
The sbl extension searches for /https?:\/\/+[a-z0-9_\-.]*(\bexample\.com\b). That means our sbl entries always start with a domain part of a (full) url. That's ok because those google-links also include full urls. The problem is that those urls are encoded (see w:en:Percent-encoding) and the sbl extension does no decoding. So ...?url=http%3A%2F%2Fwww.example.com is not resolved as ...?url=http://www.example.com. Solutions could be:
1. start the regexp pattern not with /https?:\/\/+[a-z0-9_\-.]*/ but with /https?(?i::|%3a)(?i:\/|%2f){2,}[a-z0-9_\-.]*/ or
2. decode urls before using the regexp matching. -- seth 11:35, 5 November 2011 (UTC)
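Both suggestions can be tried out in a short sketch. This is written in Python, not the PHP of the extension, and it only models the pattern shape quoted above; the blacklist entry `example.com`, the sample URLs and the three helper names are assumptions for illustration.

```python
import re
from urllib.parse import unquote

# Hypothetical blacklist entry, wrapped the way the comment above describes.
ENTRY = r"(\bexample\.com\b)"
PLAIN = re.compile(r"https?:\/\/+[a-z0-9_\-.]*" + ENTRY)
# Suggestion 1: a prefix that also accepts percent-encoded ":" and "/".
ENCODED = re.compile(r"https?(?i::|%3a)(?i:\/|%2f){2,}[a-z0-9_\-.]*" + ENTRY)

def blocked_plain(url):
    """Current behaviour as described: match the raw, undecoded URL."""
    return PLAIN.search(url) is not None

def blocked_encoded_prefix(url):
    """Suggestion 1: keep the raw URL but widen the prefix."""
    return ENCODED.search(url) is not None

def blocked_after_decoding(url):
    """Suggestion 2: percent-decode first, then apply the plain pattern."""
    return PLAIN.search(unquote(url)) is not None

wrapped = "http://www.google.com/url?sa=t&url=http%3A%2F%2Fwww.example.com%2Fpage"
harmless = "http://www.google.com/url?sa=t&url=http%3A%2F%2Fwww.copyright.gov%2F"

for url in (wrapped, harmless):
    print(blocked_plain(url), blocked_encoded_prefix(url), blocked_after_decoding(url))
```

In this sketch the plain pattern misses the wrapped target, while either suggestion catches it; a redirect to a domain that is not on the list stays unmatched in all three cases.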
Don't archive this. -- seth 21:09, 7 November 2011 (UTC)
Sorry for the problems with the archive bot. Now it should be resolved; please just remove the first template of this section when you want this request to be archived. Regards, -- Quentinv57 (talk) 18:00, 10 November 2011 (UTC)
thx! :-) -- seth 21:26, 10 November 2011 (UTC)

Note that even if the blacklist caught the links which redirect to blacklisted domains, this domain should still be blacklisted, as it is still inappropriate and can be used to avoid detection by our bots. Also, it unnecessarily involves Google in your linking, and not everyone may be interested in having their data analysed by Google. --Dirk Beetstra T C (en: U, T) 08:20, 11 November 2011 (UTC)

  • If you say that these links can be restated to avoid blocking, you should EXPLAIN HOW THIS IS DONE, in VERY SIMPLE LANGUAGE in a box at the top here. Most users are not techies. I have no idea how to do it. Otherwise the block should be removed. Johnbod 15:30, 11 November 2011 (UTC)
I wrote a small stupid tool tools:~seth/google_url_converter.cgi which can be used to recover the original urls from the google redirects. -- seth 15:45, 13 November 2011 (UTC)
Johnbod - As goes for practically all redirect sites - follow the link, and copy/paste the url from the address bar of your browser. Don't copy/paste the url that Google is giving you.
To explain it further - the Google search gives you a set of google-redirects which point to the correct websites. You then click one of the redirects from Google, so Google knows that that is the result that is most interesting to you. Next time you search for something similar, it will think that that is the result of interest to you, so it will get a higher ranking; what is more, it may also show up higher in rankings on searches by other people, since you thought it was more interesting. Now, as such, that is not a big issue - but if you use that google-redirect on Wikipedia, the Google rankings of that page get improved through Wikipedia. That is a loophole waiting to be abused. It is the very, very essence of Search Engine Optimisation. It is even more efficient than having your website itself on Wikipedia. --Dirk Beetstra T C (en: U, T) 10:49, 15 November 2011 (UTC)
I agree with Beetstra. But it's not always that easy to get the original url, if you want to link an Excel file for example (see w:de:WP:SBL). That's why I created the small tool. -- seth 22:24, 17 November 2011 (UTC)
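For cases where following the link is not practical, the target can also be read out of the redirect URL itself. The following is a minimal sketch in Python and is not the code of the tool mentioned above; the function name `recover_target` is hypothetical, and it assumes the target sits in the `url` query parameter, as in the example redirect earlier in this section.

```python
from urllib.parse import urlsplit, parse_qs

def recover_target(redirect_url):
    """Return the percent-decoded value of the "url" parameter, or None."""
    query = urlsplit(redirect_url).query
    targets = parse_qs(query).get("url", [])
    return targets[0] if targets else None

# Redirect URL from the explanation above, with some tracking parameters left out.
redirect = (
    "http://www.google.com/url?sa=t&rct=j&q=public%20law%20105-298"
    "&source=web&cd=1&ved=0CB4QFjAA"
    "&url=http%3A%2F%2Fwww.copyright.gov%2Flegislation%2Fpl105-298.pdf"
)

print(recover_target(redirect))
```

`parse_qs` percent-decodes the value, so the result can be pasted into an article as a direct link.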

Also, if you want to avoid this problem and you use Firefox, you can install this extension. MER-C 09:52, 21 November 2011 (UTC)

If... I recall correctly, this kind of loophole can be detected by looking for "usg=" in the url, instead of "url=". es:Magister Mathematicae 15:29, 18 December 2011 (UTC)

I see the point of blacklisting these addresses, but there is some kind of technical problem in this case. Normally you are allowed to keep the url that already exists on pages, but not in this case. I would like to edit the pages: sv:Bengt Nordenskiöld and sv:Who Says, but I cannot without removing the already present link. Why? -- Lavallen (talk) 11:25, 4 March 2012 (UTC)

That is not correct. Blacklisting prevents the page being saved; it is a yes/no test at the time of saving. Maybe you are confusing it with AbuseFilter behaviour. billinghurst sDrewth 11:29, 4 March 2012 (UTC)

mag4you.com/spotlight/Javeria+Abbasi/10532.htm



This site was just used as a source, when I went to check it I got a threat warning. Darkness Shines 06:18, 2 January 2012 (UTC)

Adding it to the blacklist will just mean that it cannot be added; however, it will not remove it.
Comment Avast throws warning http://oltrafficstatserver.com/ad_track...

canliradyo-dinle.com



Cross-wiki spam/page creations. πr2 (tc) 21:27, 25 February 2012 (UTC)

With these .tr websites, I find that they add once, and then disappear. Up to this point I have found a reversion quite successful. billinghurst sDrewth 22:53, 25 February 2012 (UTC)
 Declined. All pages are deleted now and blacklisting doesn't seem necessary anymore. --Trijnsteltalk 12:44, 25 March 2012 (UTC)

imagesfrombulgaria.com



Accounts

Mass spamming of multiple language wikis --Hu12 (talk) 18:31, 7 March 2012 (UTC)

IP-address globally blocked for cross-wiki linkspam. -- Tegel (Talk) 18:38, 7 March 2012 (UTC)
Thanks--Hu12 (talk) 16:58, 8 March 2012 (UTC)
See also w:en:WP:AN/I#Request_for_restore_external_link_and_block_of_a_user (permanent link to current state of discussion). --Dirk Beetstra T C (en: U, T) 14:27, 14 March 2012 (UTC)


Referral spam



Added. --Courcelles 14:50, 26 March 2012 (UTC)

u.to



URL shortener (used on a few wikis). πr2 (tc) 14:49, 27 March 2012 (UTC)

Added. Snowolf How can I help? 15:38, 27 March 2012 (UTC)

urla.ru



URL shortener (was used to evade global sbl on ruwp). πr2 (tc) 22:12, 27 March 2012 (UTC)

Added. --Trijnsteltalk 22:13, 27 March 2012 (UTC)

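alturl.com related domains

{{LinkSummary|alturl.com}} already added
{{LinkSummary|shorturl.com}} already added
2ya.com
vze.com
24ex.com
hitart.com
mirrorz.com
filetap.com
{{LinkSummary|funurl.com}} already added
dealtap.com
bigbig.com
ebored.com
hereweb.com
1sta.com
echoz.com
2truth.com
2fortune.com
2hell.com
2tunes.com
2savvy.com
2fear.com
2freedom.com
antiblog.com
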
Series of URL shorteners. Maybe you can simplify with \b2(hell|truth|fortune|tunes|savvy|fear|freedom)\.com instead of listing them all.

πr2 (tc) 00:27, 28 March 2012 (UTC)
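A quick way to see what the combined pattern would and would not cover is a sketch like the one below. It uses Python's `re` for illustration only; the list names and the `is_covered` helper are assumptions, and the domains are taken from the request above.

```python
import re

# The simplified rule suggested in the request above.
COMBINED = re.compile(r"\b2(hell|truth|fortune|tunes|savvy|fear|freedom)\.com")

COVERED = [
    "2hell.com", "2truth.com", "2fortune.com", "2tunes.com",
    "2savvy.com", "2fear.com", "2freedom.com",
]
# Domains from the same request that would still need their own entries.
NOT_COVERED = ["2ya.com", "24ex.com", "vze.com", "1sta.com"]

def is_covered(domain):
    """Return True if the combined rule matches the domain."""
    return COMBINED.search(domain) is not None

print([d for d in COVERED + NOT_COVERED if not is_covered(d)])
```

The remaining domains in the request do not share the "2…" prefix pattern, so they would be listed separately.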

Proposed additions (Bot reported)

This section is for domains which have been added to multiple wikis as observed by a bot.

These are automated reports, please check the records and the link thoroughly, it may report good links! For some more info, see Spam blacklist/Help#COIBot_reports. Reports will automatically be archived by the bot when they get stale (less than 5 links reported, which have not been edited in the last 7 days, and where the last editor is COIBot).

Sysops
  • If the report contains links to less than 5 wikis, then only add it when it is really spam
  • Otherwise just revert the link-additions, and close the report; closed reports will be reopened when spamming continues
  • To close a report, change the LinkStatus template to closed ({{LinkStatus|closed}})
  • Please place any notes in the discussion section below the HTML comment

The LinkWatchers report domains meeting the following criteria:

  • When a user mainly adds this link, and the link has not been used too much, and this user adds the link to more than 2 wikis
  • When a user mainly adds links on one server, and links on the server have not been used too much, and this user adds the links to more than 2 wikis
  • If ALL links are added by IPs, and the link is added to more than 1 wiki
  • If a small range of IPs have a preference for this link (but it may also have been added by other users), and the link is added to more than 1 wiki.
COIBot's currently open XWiki reports
List Last update By Site IP R Last user Last link addition User Link User - Link User - Link - Wikis Link - Wikis
vrsystems.ru 2023-06-27 15:51:16 COIBot 195.24.68.17 192.36.57.94
193.46.56.178
194.71.126.227
93.99.104.93
2070-01-01 05:00:00 4 4

Proposed removals

This section is for proposing that a website be unlisted; please add new entries at the bottom of the section.

Remember to provide the specific domain blacklisted, links to the articles they are used in or useful to, and arguments in favour of unlisting. Completed requests will be marked as {{removed}} or {{declined}} and archived.

See also /recurring requests for repeatedly proposed (and refused) removals.

The addition or removal of a domain from the blacklist is not a vote; please do not bold the first words in statements.

xuarez.comoj.com

I'm the owner of the site xuarez.comoj.com. It's a recent site and it's part of a research project of Valladolid University. There's no malicious software, and there's no spam or publicity on the site. I'm a professor and the web site is part of our university work. We had the same problem with McAfee's SiteAdvisor. With them the problem was that there was a site called comoj.com with malicious software, but there's no relation between comoj.com and xuarez.comoj.com, our site. McAfee has rectified this. You can see our request here: https://community.mcafee.com/thread/44163. Please remove our site from the blacklist. Thank you for your attention. Xuarez project.


bet-at-home.com

Note : This section won't be automatically archived by the bot



Was added to the blacklist in 2007 because of this edit; today the company has articles on cs, de, en, hu and pt. I think blacklisting could be removed... Greets --AleXXw 11:37, 21 November 2011 (UTC)

Do note that all the articles were created by single-purpose accounts. Seeing the way that that is done on many wikis, I would consider their goal still to be to 'promote their company'. --Dirk Beetstra T C (en: U, T) 11:48, 21 November 2011 (UTC)
I noticed that, but at least at de.wp the entry is relevant (there was a deletion request in 2007, decided as keep) and edited by some other users... I think it's not useful to have an article for an internet company and not be able to link to their homepage ;) greets --AleXXw 12:03, 21 November 2011 (UTC)
To that I agree, but that does not necessarily mean de-listing (there is always the whitelist to list something suitable). For en.wikipedia, I found the article pretty much primary sourced (and the secondary sources were more for statements like 'they sponsored this event'). I found the current entries on other Wikis similar (I'll have a read through the German article as well). --Dirk Beetstra T C (en: U, T) 12:25, 21 November 2011 (UTC)
Note: the current version on en.wikipedia seems a straight translation of the current German version (which was rewritten not too long ago). Both versions have as a first secondary source a reference for 'they sponsored this' - overall that seems quite thin for notability. --Dirk Beetstra T C (en: U, T) 13:28, 21 November 2011 (UTC)
I know it was written shortly, I was "Mentor" (sth like "adopt a user") of the writer. I agree to your point, but I don't think notability should be discussed here. And I still not see why one added Link into a nearly matching article can create an alltime-blacklist-entry, but this shall not be my problem ;) greets --AleXXw 22:35, 21 November 2011 (UTC)
"And I still not see why one added Link into a nearly matching article can create an alltime-blacklist-entry" .. You did not notice the large set of sockpuppets who have a similar modus operandi now? And that one edit was just an example of more, that link, and a set of others, was clearly spammed in the past. I am sorry, I see editors out of that sockfarm (with a large COI appearance) create articles of questionable notability on several wikis, and then we are asked to de-list to facilitate that?
And please note, I did not decline. --Dirk Beetstra T C (en: U, T) 12:47, 22 November 2011 (UTC)
No, I did not notice it until now, I just wanted to add the webpage of a web company to its article... It is notable, at least on de.wp :) What is COI? Sorry for my bad English... Greets --AleXXw 16:37, 23 November 2011 (UTC)
w:Wikipedia:Conflict of Interest, I would think that there is a similar article at a WP site in a language that is familiar to you if you follow the interwiki links from that page.

That a local language article does not have the url of its site may be considered unfortunate, however, the language wiki can manage that through the whitelist to circumvent a global ban. billinghurst sDrewth 21:11, 23 November 2011 (UTC)

Thx, I just didn't know the abbreviation. I'll try a whitelist entry on de.wp. Greets --AleXXw 22:57, 23 November 2011 (UTC)
Over the past weeks I saw several additions of links that redirect to bet-at-home.com. I have a feeling this company is actively spamming Wikipedia with articles. I do feel this company lacks notability, but this is not the place for that discussion. I suggest we ask the Wikipedia community if they see notability. We can then delist if this company is notable. EdBever 14:03, 26 November 2011 (UTC)
Hi!
I whitelisted the domain at w:de temporarily, so that it could be linked in the article about itself. I removed the whitelisting afterwards, so that the meta-block is active again to prevent spamming. -- seth 12:20, 4 December 2011 (UTC)
 Declined at this time as there has been no further support for removal from the blacklist. billinghurst sDrewth 15:54, 18 December 2011 (UTC)
Nonetheless I guess temp unblocking could be useful to let authors use those links in articles about the domain, e.g. w:de:bet-at-home.com. -- seth 19:29, 18 December 2011 (UTC)

I have written the article in de and en. For pt, cs and hu I worked together with a native speaker. This was the reason why we opened a new account in each language's Wikipedia, and not the reason “promoting the company”. My adopter told me that the spamming was in 2006-2007 and maybe from a person from ex-Yugoslavia. I don’t know who this person is. But I am writing the articles from Austria. Because my aim was to write an article which complies with all Wikipedia guidelines, I asked an adopter to help us in every language where an adopter program exists. Therefore I can guarantee that I do not intend to spam with the article. It would make no sense for me, because I only would like to have an up-to-date article for bet-at-home.com. Because the company is international I would like to translate the same article from the German Wikipedia into other languages as well. The languages correspond to the markets in which the company operates. Therefore I would be pleased if the link www.bet-at-home.com could be deleted from the global blacklist so that it would be possible for us to have the url of the site in the articles. --Bah2011 06:18, 19 December 2011 (UTC)

There is nothing currently prohibiting the writing of the articles, just the insertion of the url. billinghurst sDrewth 07:14, 19 December 2011 (UTC)
Yes I know that I cannot use the url in the articles. And this is my problem. Is there some possibility to change this situation? What has to be done to delete the url from the blacklist?--Bah2011 07:43, 19 December 2011 (UTC)
Hi!
User Bah2011 contacted me via e-mail a few days ago. And I'm quite sure that this user is not going to spam.
Of course Bah2011 could go to every local sbl and ask for whitelisting (like at w:de), such that links to bet-at-home.com could be added to articles about bet-at-home.com. But that would be unnecessarily complicated. So a temporary global unblocking is the least we could and should do. -- seth 22:45, 19 December 2011 (UTC)
I have issues <-> concerns about the interest that seems somewhere between vested and conflict, even indicated by the username. While the contributor may not spam, it offers a level of control for individual wikis to watch and manage a previously problematic url, especially as I don't feel that there should be a perception of an imprimatur given where the notability discussion which is being relied upon (mentioned above) at enWP was a "no consensus" decision, not a definite decision for notability. Being involved in the discussion, I am not making any decision. billinghurst sDrewth 15:12, 20 December 2011 (UTC)
Hi!
I agree with EdBever who said "We can then delist if this company is notable."
It's not us who have to decide what is notable and what is not. As we can see, all articles about bet-at-home.com (at cs, de, en, hu and pt) still exist. That means that bet-at-home.com is notable enough.
Now it's our (admins') duty to make it technically possible for the users to place links to the website the wiki articles are about. So at least the temp unblacklisting must be done.
The only thing we have to discuss is whether it could be reasonable to even permanently remove the entry from the blacklist.
The domain has been blacklisted for a couple of years now, so imho we could give it a try. -- seth 21:52, 20 December 2011 (UTC)
Unblocked bet-at-home.com (at least temp). After 7 days (or when Bah2011 reports here that all needed links are placed, whichever comes first) we can decide here whether blacklisting is still necessary. -- seth 18:45, 28 December 2011 (UTC)
Hi! All links are placed now. As mentioned before, the spamming was in 2006-2007 and maybe from a person from ex-Yugoslavia. The aim of these articles is not to spam Wikipedia! Therefore I would be grateful if you could remove bet-at-home.com from the blacklist. Thanks!--Bah2011 08:21, 30 December 2011 (UTC)
The temp unblocking seemed to be a success. Now the remaining question is: what reasons are there to re-activate the blacklisting? -- seth 20:50, 3 January 2012 (UTC)

I firmly disagree with how this is now progressing. For now there is maybe no reason to re-list it, but I do think that there is a promotional thought behind all of this - the (single purpose sock) accounts all too clearly have a conflict of interest, their interest is not solely to improve Wikipedia, they mainly focus on this site and its appearance on Wikipedia. Do note that I think that de-blacklisting - linking - reblacklisting as a method is asking for problems. A specific link should be found that points to a homepage (e.g. an index.html) and for each wiki a whitelist rule should be added that enables solely that link (and still should only be on the page where it is intended) and then that link should be used on the pages (and that is what I did suggest above). Every time now that one of these pages on one of these wikis gets significantly vandalised (in a way that breaks the link) it would be impossible to revert (OK, here we maybe do not re-blacklist). This also is a way around local discussions on all wikis whether a link and/or article is really needed on that wiki. Moreover, I think there was not a clear consensus for removal, and now a temporary removal is turned into a permanent removal. I am afraid that this is setting a bad precedent: next time it will be an SEO asking for de-listing so that they can spam the company, and when we decline they can point to this discussion. Please, get the whitelisting in place on all wikis, that is why we have whitelists, or get a proper consensus for de-listing (something that I would not necessarily be against, though I do have concerns, but do get proper consensus for de-listing). --Dirk Beetstra T C (en: U, T) 20:34, 8 January 2012 (UTC)

I re-read the discussions above, and I see that sDrewth and EdBever have similar concerns as I have, while AleXXw and Lustiger seth seem to have an opposite view (which IMHO is a great reason to whitelist it locally, not to de-blacklist). Seeing also that the editor used a redirect (since the official place trips the blacklist) and has a conflict of interest, I come to the conclusion that this needs a better discussion for de-blacklisting. I have hence undone the removal that Lustiger seth carried out a couple of days ago. --Dirk Beetstra T C (en: U, T) 20:44, 8 January 2012 (UTC)

I can only say again that I’ve worked together with native speakers. This was the reason why we opened new accounts in the individual language Wikipedias, not “promoting the company”. The aim was to update the old article and to translate it into other languages, because the company is international. When I updated the article I noticed that the website is on the blacklist, and therefore I had problems when I prepared the article. This was the reason why I asked for de-blacklisting. --Bah2011 06:41, 9 January 2012 (UTC)
This is the perspective that I am seeing. We have an editor who is taking interest in a single company, across multiple languages, with no evident previous background, nor edit history anywhere; has a name that aligns with the product about which they are writing. The articles don't exist cross-wiki apart from where this editor has started, despite them having a reputed notability. The editor ignores or dismisses commentary about the surrounding aspects of their specific interest, and does not state their reason for focusing on the subject. The focus of the discussion is solely on writing the article and their working with those who have the language skills.

Call me a cynic, but I don't buy it. Part of the role at meta is to be on the lookout for people linking one url cross-wiki and exhibiting a conflict of interest. If it was a humanitarian organisation, I could see why someone could have the passion to do that; for a business in this business sector, I don't buy it. There are not multiple people/communities writing the articles or expressing interest in them. The statement was that the domain url has been spammed, and that is usually a paid-for process, not a whimsical matter, and if that put the organisation on the blacklist at that time, those are the consequences of that action. I believe that I see self-interest, not the interest of the projects. In my opinion, get a whitelist at the wikis if you can, ensure that you link to this discussion when you make the request, as I doubt that when the matter was previously raised that you clearly expressed that you were single article focused crosswiki. If I was investigating motive, I would be suspecting a paid professional writer, or a sock. That sounds like an opinion and that clearly rules me out of assessing the balance of the argument. billinghurst sDrewth 10:50, 9 January 2012 (UTC)

I agree fully with billinghurst so  Declined. No valid reason to remove and local whitelisting is available if the community require it. --Herby talk thyme 11:08, 9 January 2012 (UTC)

@Bah2011. On en.wikipedia I have expressed concerns as to the notability of the subject (I nominated it for deletion), and having seen the article, I believe that it still lacks sufficient references to give it notability (most of the independent references state something like 'it was sponsored by bet-at-home.com' .. that is about as much as there is). So, start a company, sponsor something, people will write that you sponsored it, and you are notable? No, it does not work that way IMHO. Moreover, the domain got originally blacklisted because of promotion, and now these pages are created/edited, IMHO that is still because of promotion. I do not buy anything else. If you get linked and found on the internet, it is because of good SEO, not because of proven notability (where are reviews that compare bet-at-home.com with other online betting companies, etc. etc. - are they there? do they exist?). I am sorry, Bah2011, IMHO you are only here to promote bet-at-home.com. That was the case when it was originally blacklisted, and that is still the case. --Dirk Beetstra T C (en: U, T) 13:48, 9 January 2012 (UTC)
I agree on the point that Bah2011 probably has got self-interest. But I also see that this user's aim is to write articles that totally fulfill our rules. And as we can see, this user doesn't do a bad job. At the RfD at w:en there was no consensus for deletion. Bah2011 wrote the article in five wikipedias, and not a single one of those articles was deleted. So the subject is notable. (Or am I wrong?)
There had been some (not really much) spamming of this domain back in 2007. That's more than 4 years ago. How long shall a link be blacklisted? 100 years? Even if the article about the url exists?
One suggestion to user Bah2011 was to get a whitelist at the wikis if you can. I already set the domain on the whitelist at w:de, temporarily, so that the link could be placed in the article. Of course that user can do that in every single wikipedia where an article shall be created. But it's senseless to have a url blacklisted globally and multi-whitelisted locally. Afair we unblocked a url if it got whitelisted in two big wikipedias. -- seth 17:06, 14 January 2012 (UTC)
Seth, yes, there was a suggestion to whitelist, which IMHO should be a start - and that was done. That that happens on 2 wikis does already suggest that the link may be ripe for de-listing. And I did initially not decline, actually, I did not decline anywhere. Others were also not very positive, and some have declined delisting - at that time certainly there was no consensus in favor of delisting.
Noting the whitelisting, I see you said that you whitelisted it on de.wikipedia, added the link, and then de-whitelisted again. The common practice on en.wikipedia is to whitelist an index.htm, index.html, or even an about.htm specifically for use as 'official homepage' - although that does not prohibit further spamming of the homepage on that wiki, it does prohibit the use of other pages on the same site (pages that IIRC were used in the original spamming). Someone who seriously vandalises the page will still make the original unsaveable, and an admin may have to go again through the same process. That is not the function of the whitelist.
And I agree, in 4 years a lot can change, companies can change to serious, notable companies. Serious requests are indeed often granted, but those were not arguments given at any stage in the delisting request. Do note, that several editors here do think that the notability is thin, very thin (but notable nonetheless).
What I disagreed with, and why I did re-list is that you then go ahead with a temporary delisting, and then after a couple of days unilaterally decide that it is going to be kept off the list. I still think that that is setting a bad precedent, and goes against the non-consensus for delisting. Several editors have given their concerns, which means that we need to get to consensus before a permanent delisting should be performed. To enable for that discussion, I have re-listed awaiting that.
Regarding delisting, seen that the original spamming was 4 years ago, and that the company does seem notable enough for articles, I will again not decline de-listing, but would like to see additional arguments. I do still have concerns that this is clever SEO of a not-too-notable company. --Dirk Beetstra T C (en: U, T) 19:07, 14 January 2012 (UTC)

Comment: at English Wikipedia, the article for deletion process closed as no consensus, which should be considered differently from keep and from having achieved notability. billinghurst sDrewth 23:26, 14 January 2012 (UTC)

pedigreedogsexposed.blogspot.com

Presumably blocked because of the "dogsex" sequence in the URL (which actually stands for "Pedigree Dogs Exposed"), this link is quite useful to illustrate some points in discussions and therefore should be unblocked. --Cú Faoil 10:50, 1 January 2012 (UTC)

Dogsex is not on the blacklist. I am not sure what does trigger the blacklist, but I do not feel like looking for all instances of sex on the list. I suggest you request local whitelisting for this website if you really want to add it to an article. EdBever 19:21, 1 January 2012 (UTC)
It's a site related to a movie that generated quite some reactions (see en:Pedigree Dogs Exposed and interwiki) and it is maintained by the director of that movie, so I think it would actually be quite useful to be able to link to this globally. When I try entering the URL, the output is that "pedigreedogsex" triggered the spam filter. --Cú Faoil 23:52, 1 January 2012 (UTC)
Then it may just be in the blacklist at enWP. If it is in their blacklist, then you will need to ask there; if you want it in their whitelist, you will need to ask there. w:en:Wikipedia:Administrators' noticeboard billinghurst sDrewth 23:57, 1 January 2012 (UTC)
Hi!
You can use the tool http://toolserver.org/~seth/grep_regexp_from_url.cgi to check where (and why) a link is blacklisted. In this case, "dogsex" is on the meta blacklist. I'll modify the regexps in the next few minutes, so that pedigreedogsexposed.blogspot.com will be linkable. -- seth 20:58, 3 January 2012 (UTC)
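For readers following the false positive described in this thread, here is a minimal sketch in Python. It assumes, hypothetically, that the offending entry was a bare `dogsex` fragment and that requiring a leading word boundary is one possible way to narrow it; the actual blacklist line and the change seth made are not quoted here, and Python's `re` is only a stand-in for the regex engine the extension uses.

```python
import re

# Hypothetical entries: the real blacklist line is not quoted in this thread.
broad_rule = re.compile(r"dogsex", re.IGNORECASE)
narrow_rule = re.compile(r"\bdogsex", re.IGNORECASE)  # fragment must start a word


def is_blocked(rule, url):
    """Return True if the rule matches anywhere in the URL."""
    return rule.search(url) is not None


url = "http://pedigreedogsexposed.blogspot.com/"
print(is_blocked(broad_rule, url))   # bare fragment matches inside the host name
print(is_blocked(narrow_rule, url))  # no word boundary between "e" and "d"
```

The narrower rule still matches a host that begins with the fragment, so it trades one false positive for a slightly smaller net rather than removing the entry outright.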

outlandishtr.com



This page is a site which supports the music band called Outlandish in Turkey and broadcasts. I hope you can remove this site from the blacklist. —The preceding unsigned comment was added by 139.179.199.36 (talk) 13:07, 13 January 2012‎

It is a fansite, less authoritative than a reputed news site, at the same time, the User:COIBot/XWiki/outlandishtr.com indicates that it is only on two wikis, which would usually mean that it should be handled locally rather than at meta. I would prefer that this was handled locally by the enWP/trWP communities than the overarching list. All that said, there does seem to be some overlinking, and I would encourage to limit any link addition to the main article page, rather than wider adding of the url through multiple pages. billinghurst sDrewth 11:08, 14 January 2012 (UTC)

www.shanghairanking.com



This is a source corresponding with values found in http://en.wikipedia.org/wiki/Template:Infobox_US_university_ranking. The source corresponding with the values seems to be allowed in many if not most US university articles on en.wikipedia.org, but is apparently blocked in some or a few, including http://en.wikipedia.org/wiki/Carnegie_Mellon_University. --81.100.44.233 18:47, 15 January 2012 (UTC)

It does look to be a somewhat problematic link, and enWP's use of tools to manage some of the linking is further indicative of its misuse. Also 263 links on 21 projects would indicate that it is acceptable, though no Xwiki report makes the analysis a little more difficult. Probably should be removed and watched, and may reappear in the blacklist if it is again being abused. billinghurst sDrewth 00:08, 16 January 2012 (UTC)
Just do it the old way, billinghurst. If you look at the editors mentioned in the LinkReport linked from the tracking template, I see many IPs adding this to many wikis. That looks to me like it is en:WP:REFSPAM (I see occasions where there are two references for a statement, and then a third to 'shanghairanking.com' is added to it - shanghairanking was not used to write the statement, I will assume the other two were - but those are not the links under discussion in this thread at least). I will have a better look later. Thanks. --Dirk Beetstra T C (en: U, T) 03:31, 16 January 2012 (UTC)
Why was this ever put on the blacklist? Academic Ranking of World Universities is probably the most influential world ranking. It is the one referred to regularly by The Economist, one of the most influential newsmagazines in the world. It is an impeccable source for university rankings, and as such will be linked from university templates and university articles regularly. I don't see how that qualifies as spam, it's not like people are trying to push the consultancy services, they are linking to university rankings. Alternatively, can we whitelist the site at en:wiki to override the Meta blacklisting? Franamax (talk) 04:31, 24 February 2012 (UTC)
Looking at the files, it was added in December (diff), though it is there as \bshanghairanking\b. There is no commentary around the issue. billinghurst sDrewth 04:53, 24 February 2012 (UTC)
yeah, good question. Why. I do see this, but that is just a little drop on the whole plate. Only if the editor would go into excessive, uncontrollable socking that may be a bit of a reason to do this. I see it is also on XLinkBot's revertlist, which does suggest that someone had a vested interest to have this stuff linked. But I don't know, we will have to wait for Quentinv57 for explanation.
Do note, it is not shanghairanking.com that is blacklisted, but 'shanghairanking'. Maybe something else was the base for this?
I have removed the rule, it is likely too wide or mistaken. --Dirk Beetstra T C (en: U, T) 04:58, 24 February 2012 (UTC)
Ah. More: Special:Contributions/Shanghairanking2011. Maybe the socks spamming the domain have triggered this. We'll need other methods to convince the socks. --Dirk Beetstra T C (en: U, T) 05:06, 24 February 2012 (UTC)
Remove: just formalising the previous removal undertaken. billinghurst sDrewth 03:04, 5 March 2012 (UTC)
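To make the "too wide" point in this thread concrete, the following Python sketch compares the rule quoted above, `\bshanghairanking\b`, with a domain-specific variant. The domain-specific pattern and the second test URL are assumptions made for illustration only; nobody in the thread proposed that exact replacement, and Python's `re` is used here merely as a stand-in for the extension's regex engine.

```python
import re

# Rule as quoted in the thread: matches the bare word anywhere in a URL.
wide_rule = re.compile(r"\bshanghairanking\b", re.IGNORECASE)
# Assumed alternative that only targets the .com domain.
domain_rule = re.compile(r"\bshanghairanking\.com\b", re.IGNORECASE)

urls = [
    "http://www.shanghairanking.com/",
    "http://example.org/shanghairanking/2011",  # hypothetical unrelated URL
]

# Map each URL to (matched by wide rule, matched by domain rule).
matches = {
    url: (bool(wide_rule.search(url)), bool(domain_rule.search(url)))
    for url in urls
}

for url, (wide, domain) in matches.items():
    print(url, "wide:", wide, "domain:", domain)
```

The wide rule catches any URL that merely contains the word, for example in a path on an unrelated site, which is consistent with the observation that it is not shanghairanking.com itself that was listed.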

Troubleshooting and problems

This section is for comments related to problems with the blacklist (such as incorrect syntax or entries not being blocked), or problems saving a page because of a blacklisted link. This is not the section to request that an entry be unlisted (see Proposed removals above).

x.co



The current filter entry is too strict, as it even blocks urls containing this string which is a frequent one. For example www.san-x.co.jp is blocked, which doesn't make any sense. --Mps 21:07, 22 January 2012 (UTC)

Done fixed as per seth's previous lookbehind regex. Thanks for taking the time to post here and to tell us about this matter. billinghurst sDrewth 00:39, 23 January 2012 (UTC)
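As a rough illustration of why www.san-x.co.jp was caught and how a lookbehind can help, here is a Python sketch. Both patterns are assumptions: the thread does not quote the original x.co entry or the lookbehind regex that was actually applied, and Python's `re` only approximates the engine the extension uses.

```python
import re

# Assumed original entry: "-" is not a word character, so \b matches before "x".
naive_rule = re.compile(r"\bx\.co\b", re.IGNORECASE)

# Assumed stricter entry: "x.co" may not be preceded by a hyphen or word
# character, and may not continue with further host-name characters.
strict_rule = re.compile(r"(?<![-\w])x\.co(?![-.\w])", re.IGNORECASE)

samples = [
    "http://x.co/abc",          # the shortener itself
    "http://www.x.co/abc",      # shortener with a www prefix
    "http://www.san-x.co.jp/",  # unrelated site from the report above
]

results = {
    url: (bool(naive_rule.search(url)), bool(strict_rule.search(url)))
    for url in samples
}

for url, (naive, strict) in results.items():
    print(url, "naive:", naive, "strict:", strict)
```

The lookahead is an extra assumption on top of the lookbehind mentioned in the thread; it is there so that a longer host such as x.co.jp is not treated as the shortener.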

cjb





For some reason every cjb.net website is blocked. Somebody wanted to add the site http://hateplow.cjb.net/ and failed. Maybe it's because of the \bcjb\.net\b entry, but I'm no expert. --Gripweed 09:25, 27 January 2012 (UTC)

It is a url shortener/redirect. Look at http://do73i.cjb.net billinghurst sDrewth 09:38, 27 January 2012 (UTC)
There are a couple of possible solutions:
  • You can just use another (not blacklisted) link to the same page: http://www.arcticmusicgroup.com/hateplow.
  • If hateplow.cjb.net is mentioned much more often than www.arcticmusicgroup.com/hateplow, it's possible to unblacklist this special domain
    • locally at w:de or
    • globally here at meta, by using a special syntax (zero-width negative look-behind assertions)
I suggest that using the www.arcticmusicgroup.com link would be the best solution. -- seth 16:25, 27 January 2012 (UTC)
Thanks... Didn't know the url-shortener-thing --Gripweed 20:28, 27 January 2012 (UTC)
Not done then. Trijnstel 14:06, 28 January 2012 (UTC)
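For reference, the "zero-width negative look-behind" option mentioned in this thread could look roughly like the Python sketch below. The exempting pattern is an assumption written for illustration; it was not added to the blacklist (the request was closed as not done), and Python's `re` stands in for the extension's regex engine.

```python
import re

# Entry as quoted by Gripweed: blocks every cjb.net host.
blanket_rule = re.compile(r"\bcjb\.net\b", re.IGNORECASE)

# Assumed variant: skip the match when "cjb.net" is directly preceded by
# "hateplow.", leaving every other cjb.net redirect blocked.
exempting_rule = re.compile(r"(?<!hateplow\.)\bcjb\.net\b", re.IGNORECASE)


def blocked_by(rule, url):
    """Return True if the rule matches anywhere in the URL."""
    return rule.search(url) is not None


for url in ("http://hateplow.cjb.net/", "http://do73i.cjb.net"):
    print(url, blocked_by(blanket_rule, url), blocked_by(exempting_rule, url))
```

A lookbehind like this exempts one host without touching the rest of the entry, but each exemption makes the rule harder to read, which fits seth's suggestion to simply link to the non-blacklisted address instead.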

pump.pp4l.me



The site was added by Vituzzu (talk · contribs) but it still displays on en.wp. It Is Me Here t / c 13:24, 19 February 2012 (UTC)

The blacklist will not remove a link already in place, though it should prevent the continued addition. Remove it from the article and all should be good. billinghurst sDrewth 13:38, 19 February 2012 (UTC)
 Not done; nothing to do. billinghurst sDrewth 22:47, 25 February 2012 (UTC)

Discussion

This section is for discussion of Spam blacklist issues among other users.

Thinking aloud really

I am well aware of policy regarding this page; however, I do wonder whether making a point of listing all (almost all) spambot-placed links might not be such a bad idea? The worst that is likely to happen is a howl of protest from site owners who presumably are directly or indirectly behind the placement of the links. I'm sure there will be other views but... I'll spam a few folks' pages in case they miss this :) --Herby talk thyme 15:31, 7 March 2012 (UTC)

That's a practical suggestion, however, the current limitations of the spam blacklist make this less of a good idea. Legitimate links to the spammed sites could be blocked, and this extension gives no record of actions it takes. Perhaps using the global AbuseFilter for this if/when it is enabled? Ajraddatz (Talk) 15:43, 7 March 2012 (UTC)
Afaik google already takes care of our spamblacklist, publicizing this side effect of SEO on Wiki would be a good idea, but I have two concerns:
  • Spam is a war: an SEO expert (actually a bit smarter than those who are attacking WMF's wikis) could use us to destroy the rank of a competitor
  • A big part of spamming consists of google bombs, apparently meaningless texts which can influence google search results.
So, to me, a wall of shame for spammers could be a good idea, but the main point is making mediawiki less xrumer-friendly, e. g. enforcing captchas and implementing a global abusefilter or many filters on several wikis. --Vituzzu (talk) 21:37, 7 March 2012 (UTC)
Fair points both and thanks for contributing. I guess we will go on the hard way for now :) --Herby talk thyme 16:33, 9 March 2012 (UTC)
Hi!
Vituzzu said: Afaik google already takes care of our spamblacklist. Please give some evidence for this (iow: citation needed!). -- seth (talk) 15:40, 24 March 2012 (UTC)