Talk:Spam blacklist

From Meta, a Wikimedia project coordination wiki
The associated page is used by the MediaWiki Spam Blacklist extension, and lists regular expressions which cannot be used in URLs in any page in Wikimedia Foundation projects (as well as many external wikis). Any meta administrator can edit the spam blacklist, either manually or with SBHandler. For more information on what the spam blacklist is for, and the processes used here, please see Spam blacklist/About.
Proposed additions
Please provide evidence of spamming on several wikis and prior blacklisting on at least one. Spam that only affects a single project should go to that project's local blacklist. Exceptions include malicious domains and URL redirector/shortener services. Please follow this format. Please check back after submitting your report, as there could be questions regarding your request.
Proposed removals
Please check our list of requests which repeatedly get declined. Typically, we do not remove domains from the spam blacklist in response to site-owners' requests. Instead, we de-blacklist sites when trusted, high-volume editors request the use of blacklisted links because of their value in support of our projects. Please consider whether requesting whitelisting on a specific wiki for a specific use is more appropriate - that is very often the case.
Other discussion
Troubleshooting and problems - If there is an error in the blacklist (e.g. a regex error) which is causing problems, please raise the issue here.
Discussion - Meta-discussion concerning the operation of the blacklist and related pages, and communication among the spam blacklist team.
#wikimedia-external-links - Real-time IRC chat for co-ordination of activities related to maintenance of the blacklist.

Please sign your posts with ~~~~ after your comment. This leaves a signature and timestamp so conversations are easier to follow.


Completed requests are marked as {{added}}/{{removed}} or {{declined}}, and are generally archived quickly. Additions and removals are logged · current log 2014/07.


snippet for logging
{{sbl-log|9290278#{{subst:anchorencode:SectionNameHere}}}}


Proposed additions

This section is for proposing that a website be blacklisted; add new entries at the bottom of the section, using the basic URL so that there is no link (example.com, not http://www.example.com). Provide links demonstrating widespread spamming by multiple users on multiple wikis. Completed requests will be marked as {{added}} or {{declined}} and archived.

French travel site

A couple of variations are already blocked, and all resolve to 82.165.21.151.  — billinghurst sDrewth 01:30, 24 July 2014 (UTC)

Added -- billinghurst sDrewth 01:30, 24 July 2014 (UTC)

Proposed additions (Bot reported)

This section is for domains which have been added to multiple wikis as observed by a bot.

These are automated reports. Please check the records and the link thoroughly, as the bot may report good links! For some more info, see Spam blacklist/Help#COIBot_reports. Reports will automatically be archived by the bot when they get stale (fewer than 5 links reported, which have not been edited in the last 7 days, and where the last editor is COIBot).

Sysops
  • If the report contains links to fewer than 5 wikis, then only add it when it is really spam
  • Otherwise just revert the link-additions, and close the report; closed reports will be reopened when spamming continues
  • To close a report, change the LinkStatus template to closed ({{LinkStatus|closed}})
  • Please place any notes in the discussion section below the HTML comment

COIBot

The LinkWatchers report domains meeting the following criteria:

  • When a user mainly adds this link, and the link has not been used too much, and this user adds the link to more than 2 wikis
  • When a user mainly adds links on one server, and links on the server have not been used too much, and this user adds the links to more than 2 wikis
  • If ALL links are added by IPs, and the link is added to more than 1 wiki
  • If a small range of IPs have a preference for this link (but it may also have been added by other users), and the link is added to more than 1 wiki.
COIBot's currently open XWiki reports
List Last update By Site IP R Last user Last link addition User Link User - Link User - Link - Wikis Link - Wikis
eastcapetours.com 2014-07-24 11:23:57 COIBot 88.208.252.212 105.236.172.169, 105.236.172.185, 105.236.220.110, 105.236.42.109, 105.236.53.49, 105.236.59.155, 105.236.94.204, 197.229.7.147 2014-07-24 10:37:38 11 2
facebook.pl 2014-07-24 06:47:33 COIBot 173.252.110.27 Papaduda, Toureobyd, 193.239.144.218, 89.229.173.181 2014-07-23 17:02:53 6 2
fansdesk.info 2014-07-24 11:36:03 COIBot 192.64.118.76 Fansdesks 2014-07-24 11:05:58 6 6 6 0 3
hk.k11.com 2014-07-24 11:24:58 COIBot 23.97.74.197 183.11.252.130, 58.177.230.246, 61.244.218.58 2014-07-24 10:47:51 6 2
jovenesestrellasdelpoker.es.tl 2014-07-24 12:08:49 COIBot 193.238.27.26 Wetburn 2014-07-24 11:36:26 6 6 6 0 2
judete.info 2014-07-24 08:28:04 COIBot 178.157.88.8 5.13.27.86, 5.13.28.148, 5.13.33.27, 5.13.34.66 2014-07-24 07:54:33 6 2
omvarlden.se 2014-07-24 09:57:28 COIBot 217.114.87.186 195.178.246.9, 85.226.145.132, 94.234.170.180 2014-07-24 09:28:44 6 2
yuldig.yonsei.ac.kr 2014-07-24 03:38:14 COIBot 165.132.14.47 165.132.159.136, 165.132.159.193, 165.132.162.202, 165.132.78.212 2014-07-24 03:21:30 8 2

Proposed removals

This section is for proposing that a website be unlisted; please add new entries at the bottom of the section.

Remember to provide the specific domain blacklisted, links to the articles they are used in or useful to, and arguments in favour of unlisting. Completed requests will be marked as {{removed}} or {{declined}} and archived.

See also /recurring requests for repeatedly proposed (and refused) removals.

Notes:

  • The addition or removal of a domain from the blacklist is not a vote; please do not bold the first words in statements.
  • This page is for the removal of domains from the global blacklist, not for removal of domains from the blacklists of individual wikis. For those requests please take your discussion to the pertinent wiki, where such requests would be made at Mediawiki talk:Spam-blacklist at that wiki. Search spamlists — remember to enter any relevant language code

Ascender Corporation





Ascender Corporation is a typeface foundry that was involved in the design of fonts like Droid, Liberation and several others. Their domains were presumably blacklisted in 2010 due to spamming, but it also prevents linking to them for sourcing. I don't see a reason to keep this global blacklist entry anymore. Don Cuan (talk) 11:03, 5 June 2014 (UTC)

See Talk:Spam_blacklist&oldid=1999210#ascenderfonts.com. I am not averse to delisting, as it has been about four years, though I would expect that we would monitor and put them back pretty quickly if it recurs. You can always ask for a local whitelisting at the wiki where you are trying to reference with the URL.  — billinghurst sDrewth 04:01, 6 June 2014 (UTC)
Removed  — billinghurst sDrewth 12:31, 23 July 2014 (UTC)

uservoice.com



Was added by User:Mike.lifeguard as a URL shortener. However, UserVoice is not a URL shortener; it is used for feedback collection by many large software products. This notably includes Microsoft products like Visual Studio and the .NET Framework, Windows Phone, and Microsoft Office and SharePoint.

The filter should be removed, so that Wikipedia can provide helpful links to the appropriate feedback sites within the articles of the given products.

MovGP0 (talk) 12:27, 12 June 2014 (UTC)

I am not sure that it is not open to abuse if it is removed. I would also think that it is quite possible for us to point back to each manufacturer's site, and they have the ability to point onwards to their relevant pages. Remember that the WPs are not a directory listing but an encyclopaedia. If you wish to take it forward, I would think that trying for a whitelisting at a WP where you wish to add the links is the place to start. If the site has no issues with exploitation of the link then we can look to remove it from the blacklist, or you could look to the next WP.  — billinghurst sDrewth 13:12, 12 June 2014 (UTC)
Declined: inactive request, no follow-up from initial inquiry  — billinghurst sDrewth 12:33, 23 July 2014 (UTC)

cypress.com



I suggest taking cypress.com off the blacklist.

The Wikipedia:WP:ELYES guideline recommends that an article about some subject should link to the subject's official site, if any. And so the Wikipedia:Cypress Semiconductor article should link to the official website of Cypress Semiconductor. (Their official website is cypress.com, right?)

Since the manufacturer of a chip is generally viewed as a reliable source of technical information about that chip, certain pages on that manufacturer's website are good references in articles like Wikipedia:List of common microcontrollers and Wikipedia:PSoC.

The cypress.com regex was apparently added to this blacklist 19 June 2010,[1] in response to a request on this talk page.[2]

--DavidCary (talk) 18:49, 23 June 2014 (UTC)

@DavidCary:, Hi. This is a mess. When the URL was blacklisted, it would have been standard process to first remove all the links, because a blacklisted link blocks editing a page containing a link, unless the user identifies the link and removes it. There is a link in the Wikipedia article, and it's been standing since before the blacklisting. The immediate fix, for Wikipedia, is to ask for local whitelisting, either of the entire domain or of specific pages. So I looked for history of this, and found that there will be, ah, kicking and screaming if one goes for the whole site:
  • removal request June 2011 on en.wiki while it was blacklisted here, so the requests were in the wrong place, see the comments from regular antispammers and links to spam reports.
  • removal request June 2013 on en.wiki ditto.
  • As to whitelist requests on en.wiki,[3], there have been six. The most recent request links to the others; it expired without action, which is common. One request was granted. It looks like nobody requested a whitelisting for the raw URL for the company web site, which probably would not be spammed. Beetstra? You've done these on enwiki before. This is an obvious legitimate usage for the company article.
  • the November 2013 request, denied
  • If you want to whitelist there, David, be aware that it can take a long time. However, if you place a complete request, showing a need for the link and not merely a desire, and nobody responds for, say, a week, you can go to w:WP:AN and ask for any admin to do it.
On one of the denied requests, what the requestor really wanted to do wasn't appropriate for Wikipedia, but would have been perfect for Wikiversity, and we get whitelistings there, on the occasions they are needed, usually in a day. I don't see that many Wikipedians understand what Wikiversity is for. Some real-world classes use Wikiversity for article projects that later get transwiki'd to Wikipedia. --Abd (talk) 02:50, 24 June 2014 (UTC)
You should seek a whitelisting for the relevant "about" page at the Wikipedia of interest, and possibly discuss a broader whitelisting. That a wiki has a rule about a URL is a local rule to which stewards will pay heed, though it will never redeem the issues of the spamming. That said, a successful whitelisting at a wiki with no corresponding spam issues provides an evidence base for the removal from a blacklist.  — billinghurst sDrewth 11:19, 24 June 2014 (UTC)

Note that this was blacklisted in 2009, removed afterwards, and re-blacklisted in 2010 because the spamming continued. Nonetheless, I would not be completely against removing this and trying again.

Regarding the sourcing: it would be a primary source for data, and generally it is better to find secondary sourcing stating the same. I know that that is not always acceptable or possible, but since this will likely only affect a couple of pages per wiki, I would consider whitelisting those where there is nothing else. Obviously, there is nothing against the local whitelisting of, e.g., an index.htm or the about page (we do not generally whitelist the raw URL; the regex becomes complex and/or editors would be able to spam/abuse the base URL again). --Dirk Beetstra T C (en: U, T) 08:31, 25 June 2014 (UTC)

Removed  — billinghurst sDrewth 12:34, 23 July 2014 (UTC)

pro-d.ru



fall.pro-d.ru is a fan site of the game The Fall: Last Days of Gaia; there are no other sites in this 2nd level domain. The site got blacklisted because of the way too broad regex \bpro-(?!(goroda|speleo)).*?\.ru\b (originally \bpro-*?\.ru\b) which blocks all domains in the .ru zone that start with "pro-". Effectively it blocks every Russian site that has "professional" or "about" in its name.

We can look to making changes. That said, not sure that fan sites are welcome in the Wikipedias. I know that enWP specifically excludes them. It would be good to have some feedback from ruWP about the proposal to remove, as, from memory, it was a problematic spam time.  — billinghurst sDrewth 08:38, 11 July 2014 (UTC)
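
As an illustration of the concern raised in this thread, the following Python sketch checks which host names the current pattern catches. It runs the pattern through Python's re module against bare host names, which is an assumption made for the example: the extension wraps each entry in its own URL prefix and uses PCRE, so behaviour on the wiki may differ in detail. The host pro-example.ru and the helper name is_caught are made up for illustration.

```python
import re

# Pattern quoted in the request above.
BROAD = re.compile(r"\bpro-(?!(goroda|speleo)).*?\.ru\b")


def is_caught(host: str) -> bool:
    """Return True if the pattern matches anywhere in the host name."""
    return BROAD.search(host) is not None


if __name__ == "__main__":
    # pro-example.ru is a hypothetical host, not a real report.
    for host in ("fall.pro-d.ru", "pro-example.ru", "pro-goroda.ru", "pro-speleo.ru"):
        print(host, is_caught(host))
```

Under these assumptions, the two names exempted by the negative lookahead pass, while any other .ru host containing a "pro-" label is caught.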

Troubleshooting and problems

This section is for comments related to problems with the blacklist (such as incorrect syntax or entries not being blocked), or problems saving a page because of a blacklisted link. This is not the section to request that an entry be unlisted (see Proposed removals above).

t.co incorrectly blocking



t.co is a URL shortener, so blocking it is necessary. But some domains that use .co as a second-level domain are also caught by this blacklist entry. For example, the Japanese company Kinki Nippon Tourist Individual Tour Sales Co., Ltd. (近畿日本ツーリスト個人旅行販売) has the domain www.knt-t.co.jp, but this cannot be linked now.--Jkr2255 (talk) 03:41, 25 February 2014 (UTC)

Status:    Done
I was able to make the regex more specific for the shortener  — billinghurst sDrewth 09:28, 25 February 2014 (UTC)

User:Billinghurst - this needs to be done differently, as t.co was now linkable: see diff. I have undone this adaptation and returned to \bt\.co\b for now, please adapt it to something that does solve the problem. --Dirk Beetstra T C (en: U, T) 16:19, 27 April 2014 (UTC)

https://www.mediawiki.org/wiki/Extension:SpamBlacklist#Usage is obviously wrong. t.co has been added 70 times since the change of the rule; the ones I checked were typical redirects which should have been blocked. --Dirk Beetstra T C (en: U, T) 16:29, 27 April 2014 (UTC)

bugzilla:64541  — billinghurst sDrewth 11:08, 28 April 2014 (UTC)
The bugzilla suggested (?<!-)\bt\.co\b but this did not prevent the addition of twitter links.  — billinghurst sDrewth 13:10, 23 July 2014 (UTC)
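
For reference, the difference between the two patterns mentioned in this thread can be sketched in Python. This only shows how the expressions behave in Python's re module on plain URL strings, which is an assumption for the example; it does not reproduce how the extension builds its final expression, and so it does not explain why the suggested pattern failed to stop the twitter links on the wiki. The t.co path below is a placeholder.

```python
import re

# Entry restored on 27 April 2014.
PLAIN = re.compile(r"\bt\.co\b")
# Variant suggested in the bugzilla report: refuse a match preceded by a hyphen.
GUARDED = re.compile(r"(?<!-)\bt\.co\b")

SHORTENER = "http://t.co/abc"          # placeholder short link
LEGIT = "http://www.knt-t.co.jp"       # domain from the report above

if __name__ == "__main__":
    for name, pattern in (("plain", PLAIN), ("guarded", GUARDED)):
        print(name, bool(pattern.search(SHORTENER)), bool(pattern.search(LEGIT)))
```

In this sketch the plain pattern matches both strings, because the hyphen in knt-t.co.jp creates a word boundary before "t", while the guarded pattern matches only the shortener.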

kochi.com incorrectly blocking

Just like the discussion above. Sites related to Kochi prefecture or Kochi city (in Japan) sometimes use ***-kochi.com, which cannot be linked.--Jkr2255 (talk) 12:07, 21 March 2014 (UTC)

should be Done, please test  — billinghurst sDrewth 15:40, 21 March 2014 (UTC)

Removed: closing as done  — billinghurst sDrewth 15:41, 8 June 2014 (UTC)

SBHandler broken

SBHandler seems to be broken: both Glaisher and I found that it stops after closing the thread on this page, but before the actual blacklisting. Do we have someone knowledgeable who can look into why this does not work? --Dirk Beetstra T C (en: U, T) 04:08, 30 April 2014 (UTC)

User:Erwin - pinging you as the developer. --Dirk Beetstra T C (en: U, T) 04:16, 30 April 2014 (UTC)

FYI when you created this section with the name "SBHandler", you prevented SBHandler from being loaded at all (see MediaWiki:Gadget-SBHandler.js "Guard against double inclusions"). Of course, changing the heading won't fix the original issue you mentioned. But at least it will load now. PiRSquared17 (talk) 15:30, 18 June 2014 (UTC)

Discussion

This section is for discussion of Spam blacklist issues among other users.

COIBot / LiWa3

I am busy slowly restarting COIBot and LiWa3 again - both will operate from fresh tables (LiWa3 started yesterday, 29/12/2013; COIBot started today, 30/12/2013). As I am revamping some of the tables, and they need to be regenerated (e.g. the user auto-whitelist-tables need to be filled, blacklist-data for all the monitored wikis), expect data to be off, and some functionality may not be operational yet. LiWa3 starts from an empty table, which also means that autodetection based on statistics will be skewed. I am unfortunately not able to resurrect the old data (that will need to be done by hand). Hopefully things will be normal again in a couple of days. --Dirk Beetstra T C (en: U, T) 17:26, 30 December 2013 (UTC)

Change in functionality of spam blacklist

Due to issues with determining the content of parsed pages ahead of time (see bugzilla:15582 for some examples), the way the spam blacklist works should probably be changed. Per bugzilla:16326, I plan to submit a patch for the spam blacklist extension that causes it to either delink or remove blacklisted links upon parsing, or replace them with a link to a special page explaining the blacklisting. This could be done either in addition to or instead of the current functionality. Are there any comments or suggestions on such a new implementation? Jackmcbarn (talk) 20:45, 3 March 2014 (UTC)

Hi!
I suggest not replacing the current functionality, and will give an example for this:
In local wikis like w:de, we sometimes have the situation that we want to prevent people from using a certain domain like "seth-enterprises.example.org" everywhere in article namespace with the exception of just one article (the one about the institution, e.g. "seth enterprises"). So in this case we remove all links to that domain from w:de, but we place a link to the domain in one article. Afterwards we blacklist the domain, such that nobody can add the link somewhere. In that one article the link should still work.
Could we cope with this scenario, if the SBL functionality was changed? -- seth (talk) 15:25, 15 June 2014 (UTC)
@Jackmcbarn: I think that would break legitimate links on a wiki (sometimes a site is used minimally in a good way, e.g. in references, but massively spammed and abused further; it then gets blacklisted).
@Lustiger Seth: such links are better off specifically whitelisted. On en.wikipedia, we would whitelist the landing page ('seth-enterprises.example.org/index.htm') or the about-page (often the index.htm is 'invisible', forcing us to, in principle, whitelist the domain only, and that would open up the abuse possibility again if the problem was the linking of the domain only). In rare cases, we would whitelist the domain only. De-blacklisting, linking, and re-blacklisting is not a real solution - there are edit-scenarios where the only solution for repair is to de-blacklist again, repair, and re-blacklist. For an uninterrupted edit-experience, it is better that for all blacklisted links a whitelisting solution is found. --Dirk Beetstra T C (en: U, T) 03:28, 19 June 2014 (UTC)
Hi!
Whitelisting does not help in many of the mentioned cases, because the URL of the spammers can be the same as the URL that is needed in an article. If there is a better solution, please tell me. The edit filter could of course be used for a combination of a link-block with a specific article exception. But we try not to use the edit filter, for performance reasons (if we did not do this, the edit filter would not work properly). -- seth (talk) 09:54, 19 June 2014 (UTC)
Whitelisting of the type of 'http://seth-enterprises.example.org/index.htm' has on en.wiki never resulted in problems, and 'http://seth-enterprises.example.org/about.htm' neither. In fact, heavily abused websites have their index.htm's and/or about.htm's whitelisted, and are still not abused. --Dirk Beetstra T C (en: U, T) 10:51, 19 June 2014 (UTC)
We would of course not whitelist 'http://seth-enterprises.example.org' - that would open up everything, and having an end-of-string delimiter also does not help, as the main-domain is generally what is abused. --Dirk Beetstra T C (en: U, T) 11:14, 19 June 2014 (UTC)
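
The whitelisting approach described in the comments above can be sketched roughly as follows, in Python. This is a simplified model and not the extension's code: it assumes a URL is allowed if any whitelist pattern matches it, and is otherwise blocked if any blacklist pattern matches. The function name is_blocked and both pattern lists are hypothetical, using the example domain from this thread.

```python
import re

# Hypothetical local lists for the example domain used in this thread.
BLACKLIST = [re.compile(r"\bseth-enterprises\.example\.org\b")]
WHITELIST = [
    re.compile(r"\bseth-enterprises\.example\.org/index\.htm\b"),
    re.compile(r"\bseth-enterprises\.example\.org/about\.htm\b"),
]


def is_blocked(url: str) -> bool:
    """Simplified model: a whitelist match wins over a blacklist match."""
    if any(w.search(url) for w in WHITELIST):
        return False
    return any(b.search(url) for b in BLACKLIST)


if __name__ == "__main__":
    for url in (
        "http://seth-enterprises.example.org/index.htm",
        "http://seth-enterprises.example.org/about.htm",
        "http://seth-enterprises.example.org/",
        "http://seth-enterprises.example.org/products.htm",
    ):
        print(url, is_blocked(url))
```

In this model the landing and about pages stay linkable while the bare domain and every other page remain blocked, which is the trade-off described above: whitelisting the bare domain instead would let every URL on the site through.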