Talk:Spam blacklist: Difference between revisions

From Meta, a Wikimedia project coordination wiki
Latest comment: 14 years ago by MER-C in topic Proposed additions
→‎yalesimkin.com: elaboration
Line 128:
{{usersummary|57.66.53.197}}
I removed links from some wikipedia projects. [[User:Mosca|Mosca]] 15:48, 30 January 2010 (UTC)

===polimore.com===
{{spamlink|polimore.com}}

See [http://en.wikipedia.org/w/index.php?title=Wikipedia_talk:WikiProject_Spam&oldid=341226269#polimore.com] [[User:MER-C|MER-C]] 05:26, 1 February 2010 (UTC)


== Proposed additions (Bot reported) ==

Revision as of 05:26, 1 February 2010

Shortcut:
WM:SPAM
WM:SBL
The associated page is used by the MediaWiki Spam Blacklist extension, and lists regular expressions which cannot be used in URLs in any page in Wikimedia Foundation projects (as well as many external wikis). Any meta administrator can edit the spam blacklist. For more information on what the spam blacklist is for, and the processes used here, please see Spam blacklist/About.
Proposed additions
Please provide evidence of spamming on several wikis. Spam that only affects a single project should go to that project's local blacklist. Exceptions include malicious domains and URL redirector/shortener services. Please follow this format. Please check back after submitting your report, as there could be questions regarding your request.
Proposed removals
Please check our list of requests which repeatedly get declined. Typically, we do not remove domains from the spam blacklist in response to site-owners' requests. Instead, we de-blacklist sites when trusted, high-volume editors request the use of blacklisted links because of their value in support of our projects. Please consider whether requesting whitelisting on a specific wiki for a specific use is more appropriate - that is very often the case.
Other discussion
Troubleshooting and problems - If there is an error in the blacklist (i.e. a regex error) which is causing problems, please raise the issue here.
Discussion - Meta-discussion concerning the operation of the blacklist and related pages, and communication among the spam blacklist team.
#wikimedia-external-links - Real-time IRC chat for co-ordination of activities related to maintenance of the blacklist.

Please sign your posts with ~~~~ after your comment. This leaves a signature and timestamp so conversations are easier to follow.


Completed requests are marked as {{added}}/{{removed}} or {{declined}}, and are generally archived (search) quickly. Additions and removals are logged.

snippet for logging
{{sbl-log|1837128#{{subst:anchorencode:SectionNameHere}}}}

Proposed additions

This section is for proposing that a website be blacklisted; add new entries at the bottom of the section, using the basic URL so that there is no link (example.com, not http://www.example.com). Provide links demonstrating widespread spamming by multiple users on multiple wikis. Completed requests will be marked as {{added}} or {{declined}} and archived.

co.cc again



MZMcBride removed this entry. Wikipedia claims that the domain is not a real TLD and is used for URL redirectors. On that basis, I think it should be re-added with a preceding dot: \.co\.cc\b  — Mike.lifeguard | @en.wb 19:43, 11 November 2009 (UTC)

Having had to deal with a lot of the spam links, I strongly endorse restoring this link. No offense, MZM, but this one should have been discussed first before removal. --Ckatz 10:05, 13 November 2009 (UTC)Reply
It's clearly not just being used for URL redirection. As the Wikipedia article notes, it can be used as a real DNS. .com is capable of URL redirection and brings in a lot more spam. I don't think blacklisting an entire TLD or ccTLD (real or not) is a good idea, though I can understand it's the simplest solution. Is there a complementary global whitelist? Do we have any idea how many false positives this addition to the blacklist will cause? --MZMcBride 10:42, 13 November 2009 (UTC)Reply
Do you have any evidence supporting your reason for removal? Discussion when working with others is critical, and your flippant response to my query is worrying.
As to the substantive issue: Yes, there will be candidates for whitelisting, that was acknowledged and addressed from the initial request for blacklisting. I haven't seen that the rate is unacceptable, which you simply take as a premise, and we have helped users to request whitelisting where necessary, and will continue to do so.  — Mike.lifeguard | @en.wb 14:56, 13 November 2009 (UTC)Reply
Flippant? You've globally blacklisted an entire ccTLD, which has broad implications on 700+ projects, plus an unknown number of sites that also use this list. This entry in particular is creating an unknown (and possibly high) number of false positives (I'm only here because there was a local problem at en.wiki regarding what appears to be an entirely valid URL and it was baffling how the URL could be blacklisted). Here's the diff of you broadening the regex—where was the discussion for doing this? I don't see anything in the log, though admittedly the log is nearly impossible to navigate. (If there is no discussion, what was the rationale? Is there supporting data to suggest that the only possible approach here is to block the entire ccTLD, an obviously extreme tactic?) --MZMcBride 16:33, 13 November 2009 (UTC)Reply
I think you missed this.  — Mike.lifeguard | @en.wb 16:36, 13 November 2009 (UTC)Reply
Are discussions on this talk page archived anywhere? I checked the log (silly me, I know). Reading the old discussion, I'm still baffled about the rationale here. It can be used for URL redirection. So can literally any other domain (top-level or otherwise). That's not an argument to ban any and all uses of it. If there's evidence that this domain is unmanageable and won't result in an excessive number of false positives, I don't have an issue with including such a broad regex. But I'd like there to be some specific data to point to, not just "can be used for URL redirection," which I consider a non-argument. --MZMcBride 16:42, 13 November 2009 (UTC)Reply
Not "can" -- "is" (well, "was" until you removed it :D). You can see User:COIBot/XWiki/co.cc for a small taste (too many results to generate the large taste) - or the original request. Anecdotally, yes, we know it was abused cross-wiki; that's why I added it when JzG brought the request here - if not it would have been "add to XLinkBot for enwiki, and we'll attempt to monitor on other wikis with COIBot.  — Mike.lifeguard | @en.wb 16:50, 13 November 2009 (UTC)Reply

(unindent) A question about User:COIBot/LinkReports/co.cc. How is the false positive ratio determined? It looks like the bot finds all instances of the domain (or part of a domain string) being added to a page, but are there any numbers regarding how many of these additions were legitimate? (There are legitimate uses of this ccTLD, right?) --MZMcBride 09:57, 15 November 2009 (UTC)

I'm not sure what you mean by "false positive" in this context -- the bot cannot decide whether a link addition is appropriate or not since it's a bot.  — Mike.lifeguard | @en.wb 05:21, 2 December 2009 (UTC)Reply

After 2 months, do we have evidence of more spamming?  — Mike.lifeguard | @en.wb 21:40, 21 January 2010 (UTC)Reply

 Declined. No more evidence of spamming for two months, and blacklisting this can blacklist potentially good sites. On that basis, this is declined. --Pmlineditor  06:09, 26 January 2010 (UTC)Reply
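
For reference, the effect of the preceding dot in the pattern proposed at the top of this thread can be checked with a minimal sketch. It applies the pattern directly to a few sample URLs using Python's `re` module. The sample URLs and the helper name `is_blocked` are made up for illustration, and the sketch does not reproduce how the extension itself wraps blacklist entries.

```python
import re

# Entry proposed in this thread: a literal dot must precede "co.cc".
ENTRY = re.compile(r"\.co\.cc\b")


def is_blocked(url: str) -> bool:
    """Return True if the entry matches anywhere in the URL (hypothetical helper)."""
    return ENTRY.search(url) is not None


# Illustrative URLs only; none of them are taken from the reports.
for url in [
    "http://spam.co.cc/page",  # subdomain of co.cc
    "http://co.cc/",           # bare domain, no preceding dot
    "http://disco.cc/",        # unrelated domain that merely ends in "co.cc"
    "http://shop.co.ccc/",     # \b keeps "cc" from matching inside "ccc"
]:
    print(url, is_blocked(url))
```

Under these assumptions only the subdomain case matches. The bare domain and look-alike names are left alone, which is the narrowing the preceding dot is meant to provide.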

khamagmongol.com etc.

A whole collection of those is persistently inserted in certain articles in enwiki, dewiki, mnwiki, ruwiki, and others. Most if not all of them are inappropriate links for Wikipedia purposes (forums, unreliable sources, travel business promotion, etc.). They arrive through IPs from the following Russia-based ranges (plus those I didn't notice):

  • 85.26.164.0/24
  • 85.26.165.0/24
  • 85.26.232.0/24
  • 85.26.233.0/24

--Latebird 22:38, 19 January 2010 (UTC)Reply

Added. --Pmlineditor  06:21, 26 January 2010 (UTC)
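
For anyone checking further IPs against the four ranges reported in this thread, a minimal sketch with Python's standard `ipaddress` module follows. The helper name `in_reported_ranges` and the test addresses are illustrative assumptions. Only the four /24 ranges come from the report.

```python
import ipaddress

# The four ranges listed in the report above.
RANGES = [
    ipaddress.ip_network(cidr)
    for cidr in (
        "85.26.164.0/24",
        "85.26.165.0/24",
        "85.26.232.0/24",
        "85.26.233.0/24",
    )
]


def in_reported_ranges(ip: str) -> bool:
    """Return True if the address falls in any reported range (hypothetical helper)."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in RANGES)


# Adjacent /24 blocks can be merged into larger blocks where they align.
merged = list(ipaddress.collapse_addresses(RANGES))
print(merged)
print(in_reported_ranges("85.26.165.7"))  # made-up address inside a range
print(in_reported_ranges("85.26.166.1"))  # made-up address outside all ranges
```

The four ranges collapse into two /23 blocks, which may be a more convenient form if a range block is being considered.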

Massive crosswiki spammer

IP got globally blocked as per request. Adding in a few minutes. — Dferg (talk) 22:29, 25 January 2010 (UTC)

Added — Dferg (talk) 22:35, 25 January 2010 (UTC)

getlang.com



See COIBot report. MER-C 09:34, 27 January 2010 (UTC)Reply

Added — Dferg (talk) 13:45, 27 January 2010 (UTC)

Adsense pub-3132917916465494

See:

Same AdSense ID. MER-C 10:23, 28 January 2010 (UTC)Reply

Added. --Pmlineditor  05:27, 30 January 2010 (UTC)

b2b-cb.com & b2b-club.ru

I removed links from some wikipedia projects. Mosca 15:48, 30 January 2010 (UTC)Reply

polimore.com



See [1] MER-C 05:26, 1 February 2010 (UTC)Reply

Proposed additions (Bot reported)

This section is for domains which have been added to multiple wikis as observed by a bot.

These are automated reports; please check the records and the link thoroughly, as the bot may report good links! For some more info, see Spam blacklist/Help#COIBot_reports. Reports will automatically be archived by the bot when they get stale (less than 5 links reported, which have not been edited in the last 7 days, and where the last editor is COIBot).
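
The staleness rule described above can be read as a three-part condition. The following is a minimal sketch of that reading, not COIBot's actual code. The function name `is_stale` and its parameters are hypothetical, and treating exactly 7 days as stale is an assumption.

```python
from datetime import datetime, timedelta


def is_stale(links_reported: int, last_edit: datetime,
             last_editor: str, now: datetime) -> bool:
    """Sketch of the archiving rule: few links, no recent edits, last touched by the bot."""
    return (
        links_reported < 5                          # less than 5 links reported
        and now - last_edit >= timedelta(days=7)    # not edited in the last 7 days
        and last_editor == "COIBot"                 # last editor is the bot itself
    )


now = datetime(2010, 2, 1, 5, 26)
print(is_stale(3, datetime(2010, 1, 20), "COIBot", now))
print(is_stale(3, datetime(2010, 1, 30), "COIBot", now))
```

On this reading, a report that a human has commented on stays open regardless of age, since the last editor is then no longer COIBot.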

Sysops
  • If the report contains links to less than 5 wikis, then only add it when it is really spam
  • Otherwise just revert the link-additions, and close the report; closed reports will be reopened when spamming continues
  • To close a report, change the LinkStatus template to closed ({{LinkStatus|closed}})
  • Please place any notes in the discussion section below the HTML comment

COIBot

The LinkWatchers report domains meeting the following criteria:

  • When a user mainly adds this link, and the link has not been used too much, and this user adds the link to more than 2 wikis
  • When a user mainly adds links on one server, and links on the server have not been used too much, and this user adds the links to more than 2 wikis
  • If ALL links are added by IPs, and the link is added to more than 1 wiki
  • If a small range of IPs have a preference for this link (but it may also have been added by other users), and the link is added to more than 1 wiki.
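
The four criteria above can be sketched as a single predicate. This is only an illustration of how the listed conditions combine, not the LinkWatchers' implementation. Every field name below is hypothetical, and the judgements the list leaves open ("mainly adds", "not been used too much", "small range") are taken as precomputed booleans rather than guessed at.

```python
from dataclasses import dataclass


@dataclass
class LinkObservation:
    """Hypothetical summary of what has been observed for one link."""
    user_mainly_adds_link: bool        # a user mainly adds this link
    link_rarely_used: bool             # the link has not been used too much
    user_mainly_adds_server: bool      # a user mainly adds links on one server
    server_rarely_used: bool           # links on that server have not been used too much
    only_added_by_ips: bool            # ALL additions come from IPs
    small_ip_range_prefers_link: bool  # a small IP range has a preference for the link
    user_wiki_count: int               # wikis this user added the link(s) to
    link_wiki_count: int               # wikis the link was added to overall


def meets_report_criteria(o: LinkObservation) -> bool:
    """Return True if any of the four listed criteria is met."""
    return (
        (o.user_mainly_adds_link and o.link_rarely_used and o.user_wiki_count > 2)
        or (o.user_mainly_adds_server and o.server_rarely_used and o.user_wiki_count > 2)
        or (o.only_added_by_ips and o.link_wiki_count > 1)
        or (o.small_ip_range_prefers_link and o.link_wiki_count > 1)
    )


example = LinkObservation(False, False, False, False, True, False, 0, 2)
print(meets_report_criteria(example))
```

Note the different thresholds: the user-based criteria need more than 2 wikis, while the IP-based ones need only more than 1.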
COIBot's currently open XWiki reports
List Last update By Site IP R Last user Last link addition User Link User - Link User - Link - Wikis Link - Wikis
vrsystems.ru 2023-06-27 15:51:16 COIBot 195.24.68.17 192.36.57.94
193.46.56.178
194.71.126.227
93.99.104.93
2070-01-01 05:00:00 4 4

Proposed removals

This section is for proposing that a website be unlisted; please add new entries at the bottom of the section.

Remember to provide the specific domain blacklisted, links to the articles they are used in or useful to, and arguments in favour of unlisting. Completed requests will be marked as {{removed}} or {{declined}} and archived.

See also /recurring requests for repeatedly proposed (and refused) removals.

The addition or removal of a domain from the blacklist is not a vote; please do not bold the first words in statements.

cais-soas.com



I am not quite sure why the mentioned website is considered spam. This website has been used as a reliable resource for many historical and archaeological subjects since 1998, as it draws its credibility from its connections with many accountable universities such as the University of London. The page that I needed (cais-soas.com/CAIS/History/hakhamaneshian/Cyrus-the-great/cyrus_cylinder.htm#Shapour Suren-Pahlav) is a reference for the article "Human Rights" (en.wikipedia.org/wiki/Human_rights). I would appreciate it if an admin looked into the website and made the necessary decision about removing this educational resource from the spam list. Thank you. Armaiti (talk) 02:23, 22 January 2010 (UTC)

I would be inclined to say no, but we can discuss it. This was put on the list in 2007, mostly because of this (a huge amount of copyright infringement). For reference, see the old discussions about removing it from 2008-07, 2007-11, 2007-08 and 2007-05. The original one in May 07 references an OTRS report; I do not know what happened to that, and someone with access may want to check on it. I will note that, looking at the website now, it appears they have added a section on the copyright page that says that some content is reprinted with permission from other sources (including the sources and problems that were issues in the past). Of course, I wasn't able to link to the page like I just attempted to :/. Depending on the OTRS ticket and the history we may want to check to verify that? James (T C) 11:49, 22 January 2010 (UTC)
I also had some interactions with the people adding the links, who were all associated with the site. So spamming was a problem as well as copyright. JzG 19:44, 24 January 2010 (UTC)Reply
I'm not sure this should be removed (& note that cais-soas.com has little credibility on copyright issues, after fraudulent GFDL notices). However, the entry seems to predate the wiki-specific blacklist. It could be moved to enwiki's blacklist, and monitored for abuse elsewhere. Once that's done, we'd remove it here.  — Mike.lifeguard | @en.wb 12:23, 26 January 2010 (UTC)Reply
If I recall correctly, this site carries many articles in violation of original copyrights, and is generally a non-reliable source (and it was abused). Knowingly linking to information which is in violation of copyrights is more a foundation problem than a local wiki problem, and should be avoided.
As Hu12 recently put it on en.wikipedia: "This site was blocked at Meta after being identified as carrying images and content in violation of copyrights [2]. This site violates w:WP:Copyrights, Linking to copyrighted works. Linking to copyrighted works, Knowingly and intentionally directing others to a site that violates copyright has been considered a form of w:contributory infringement in the United States (w:Intellectual Reserve v. Utah Lighthouse Ministry [3]).[4]. Additionaly wikipedias servers are located in the United States, it's of no benefit, nor in wikipedias intrest to link this site." (diff).
I do indeed believe that it pre-dates the local blacklists, and that the problem was local, but given the copyright problems, I would be inclined to leave it here, and request local whitelisting for those documents that are needed and do not pose a problem. --Dirk Beetstra T C (en: U, T) 12:48, 26 January 2010 (UTC)
Yes, but the issue is whether there is, in fact, (still) an issue of copyright infringement. Looking at the tickets in OTRS, it isn't a clear issue. How recently has this been assessed? Until it's shown to be the case, I'd prefer to see this on enwiki's local blacklist.  — Mike.lifeguard | @en.wb 13:19, 26 January 2010 (UTC)Reply
True, you might want to ask Hu12 when he assessed that, but I'd err on the safe side for now. I'd rather see proof that it changed before we change the situation. --Dirk Beetstra T C (en: U, T) 10:07, 27 January 2010 (UTC)
  • Note - cais-soas com/CAIS/guideline_contribution.htm (Their inclusion standards) are not for academics or really have a trustworthy peer reviewed process, so there is little ability to say that the works are more reliable than, say, a Wikipedia article (and Wikipedia is not deemed a reliable source, after all). cais-soas com/CAIS/about_cais.htm (This) suggests they are no longer affiliated with a university, though was founded by two academics from a university. The lack of affiliation and the consistent attempt to promote the site through Wikipedia would suggest to me that there is more spam than academic benefit from this site. Ottava Rima 02:33, 1 February 2010 (UTC)Reply

israelnationalnews.com



I believe the site is inherently useful to provide news about what is going on in Israel. Israel is a tiny country but an important one in world affairs. IsraelNationalNews (Arutz Sheva) provides accurate news, though it may sometimes be written from a right-of-center POV. (Like Fox News, perhaps?) It is useful for its Israel news, its information with respect to Judaism, as well as for its opinion pieces where appropriate. It expresses the opinions of a number of Israelis as well as right-of-center Jews world-wide. [5] [6] [7] [8]

"The site reaches over 138K US monthly peoople, attracts a more educated, 50+ , rather male, mostly Caucasian following" In fact, quantcast has the stats here and notes that the audience for this site are also likely to visit ynetnews, National Review, Weekly Standard, Jerusalem Post, Washington Times, spiegel.de and the Wall Street Journal. According to the quantcast report, Israel National News "offers news, live radio broadcasts, political commentary, Arab press coverage and a video gallery. Also available in French, Russian and Hebrew" If it were a spam site, would a "more educated" people be likely to read it? This is not a spam site and I don't really understand why it is on this list. Thanks for consideration. Stellarkid 04:56, 26 January 2010 (UTC)Reply

Note) Poster has confused Israel National News (which is not blacklisted) with Israel News Agency (which is). See here for details. 71.231.168.41 06:12, 26 January 2010 (UTC)Reply
 Declined for now at least; this site is not blacklisted, and we haven't been shown a real reason to remove the other. Feel free to ask again if needed. --James (T C) 06:01, 28 January 2010 (UTC)

EU-Football.info



User Tommo on nl:wikipedia requested this link to be delisted[9]. He deems this a useful link. I had a look and the site contains a database of European football matches since 1872, so it might be valuable as a resource or reference. Since 12 links remain on 3 projects, I think delisting is in order. The reason for blacklisting was that one user kept placing that link on various projects (en:, ru:, uk:). If this continues we should assess if blacklisting is the best option. I'd rather go for a (global) block of this user and if possible his IP address. EdBever 10:46, 26 January 2010 (UTC)

I must say I am inclined to accept this and take this off given that we have had a couple requests for removal. Looking at the original spam report it was definitely put on here for a reason but it was indeed all one person and 1. they may have given up and 2. it may be better to deal with that away from the blacklist. I will of course point out that you can remove it right away by adding it to nl:MediaWiki:Spam-whitelist but that is of course not always ideal. If we have enough people asking to use it (including Tommo who appears to be a fairly active user) it may be time to take it off and see what happens. Because of the previous denials and the fact that this would be my first removal I'm going to put this  On hold for now but if either they agree with me or don't answer my pings here or on IRC within a day or so I'll take it off :). James (T C) 11:04, 26 January 2010 (UTC)Reply
Removed per the above.  — Mike.lifeguard | @en.wb 12:27, 26 January 2010 (UTC)

yalesimkin.com



This domain was owned by the linkspammer and vandal Wayne Smith - http://en.wikipedia.org/wiki/Wikipedia:Long-term_abuse/Universe_Daily

Because I repeatedly reverted his edits, and blocked his vandalism at other sites, he created a domain with my name and pointed it to his foul racist and anti-American hate pages. He has since lost the domain and I own it now. Yale s 16:36, 26 January 2010 (UTC)Reply

The top of the page states: "Typically, we do not remove domains from the spam blacklist in response to site-owners' requests." Is there another reason to delist besides the fact that you now own it? However, you do appear to have many contribs as per the secondary statement. Just curious, but what would the link be used for if unblocked? For personal reasons? I ask because it is bare in content and seems to serve no function. Ottava Rima 02:45, 1 February 2010 (UTC)Reply

Troubleshooting and problems

This section is for comments related to problems with the blacklist (such as incorrect syntax or entries not being blocked), or problems saving a page because of a blacklisted link. This is not the section to request that an entry be unlisted (see Proposed removals above).

None currently

Discussion

This section is for discussion of Spam blacklist issues among other users.

New regex behaviour

Contrary to the warnings about ^ and $ we're probably all familiar with, those will soon match against the start & end of URLs. This change was introduced in rev:60869. I don't anticipate we should change old regexes, and this won't be extraordinarily useful for most use cases, but we should be aware of this for the future.  — Mike.lifeguard | @en.wb 17:02, 10 January 2010 (UTC)Reply

Hi!
Thanks for that information, but ^ will still be useless, because it would result in regexps like /http:\/\/^example.org/. This change affects $ only. -- seth 10:47, 31 January 2010 (UTC)Reply
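
The point about ^ and $ can be illustrated with a minimal sketch in Python. It assumes, in line with the example regexp above, that each blacklist entry is embedded after a fixed URL prefix. The exact prefix used by the extension is not reproduced here, so `PREFIX` and the helper `wrap` are simplifying assumptions.

```python
import re

# Assumed wrapper: scheme plus optional subdomain characters, then the entry.
PREFIX = r"https?://[a-z0-9.\-]*"


def wrap(entry: str) -> re.Pattern:
    """Embed a blacklist entry after the assumed URL prefix (hypothetical helper)."""
    return re.compile(PREFIX + "(" + entry + ")")


# "^" ends up after the prefix, so it can never be at the start of the string.
caret = wrap(r"^example\.org")
print(caret.search("http://example.org"))

# "$" still works: it anchors the entry to the end of the URL.
dollar = wrap(r"example\.org$")
print(bool(dollar.search("http://www.example.org")))
print(bool(dollar.search("http://www.example.org/page")))
```

Under this assumption an entry beginning with ^ matches nothing at all, while an entry ending in $ blocks only URLs that end at the domain and lets deeper paths through.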