Talk:Spam blacklist
From Meta, a Wikimedia project coordination wiki
The associated page is used by the MediaWiki Spam Blacklist extension, and lists strings of text that may not be used in URLs in any page in Wikimedia Foundation projects (as well as many external wikis). Any meta administrator can edit the spam blacklist. There is also a more aggressive way to block spamming through direct use of $wgSpamRegex. Only system administrators can make changes to $wgSpamRegex, and its use is to be avoided whenever possible. For more information on what the spam blacklist is for, and the processes used here, please see Spam blacklist/About.
Please sign your posts with ~~~~ after your comment. This leaves a signature and timestamp so conversations are easier to follow. Completed requests are marked as {{added}}/{{removed}} or {{declined}}, and are generally archived (search) quickly. Additions and removals are logged.
Snippet for logging: {{sbl-log|1733786#{{subst:anchorencode:SectionNameHere}}}}
Proposed additions
| This section is for proposing that a website be blacklisted; add new entries at the bottom of the section, using the basic URL so that there is no link (example.com, not http://www.example.com). Provide links demonstrating widespread spamming by multiple users on multiple wikis. Completed requests will be marked as {{added}} or {{declined}} and archived. |
co.cc again
co.cc
MZMcBride removed this entry. Wikipedia claims that the domain is not a real TLD and is used for URL redirectors. On that basis, I think it should be re-added with a preceding dot: \.co\.cc\b — Mike.lifeguard | @en.wb 19:43, 11 November 2009 (UTC)
- Having had to deal with a lot of the spam links, I strongly endorse restoring this link. No offense, MZM, but this one should have been discussed first before removal. --Ckatz 10:05, 13 November 2009 (UTC)
- It's clearly not just being used for URL redirection. As the Wikipedia article notes, it can be used as a real DNS. .com is capable of URL redirection and brings in a lot more spam. I don't think blacklisting an entire TLD or ccTLD (real or not) is a good idea, though I can understand it's the simplest solution. Is there a complementary global whitelist? Do we have any idea how many false positives this addition to the blacklist will cause? --MZMcBride 10:42, 13 November 2009 (UTC)
- Do you have any evidence supporting your reason for removal? Discussion when working with others is critical, and your flippant response to my query is worrying.
- As to the substantive issue: Yes, there will be candidates for whitelisting, that was acknowledged and addressed from the initial request for blacklisting. I haven't seen that the rate is unacceptable, which you simply take as a premise, and we have helped users to request whitelisting where necessary, and will continue to do so. — Mike.lifeguard | @en.wb 14:56, 13 November 2009 (UTC)
- Flippant? You've globally blacklisted an entire ccTLD, which has broad implications on 700+ projects, plus an unknown number of sites that also use this list. This entry in particular is creating an unknown (and possibly high) number of false positives (I'm only here because there was a local problem at en.wiki regarding what appears to be an entirely valid URL and it was baffling how the URL could be blacklisted). Here's the diff of you broadening the regex—where was the discussion for doing this? I don't see anything in the log, though admittedly the log is nearly impossible to navigate. (If there is no discussion, what was the rationale? Is there supporting data to suggest that the only possible approach here is to block the entire ccTLD, an obviously extreme tactic?) --MZMcBride 16:33, 13 November 2009 (UTC)
- I think you missed this. — Mike.lifeguard | @en.wb 16:36, 13 November 2009 (UTC)
- Are discussions on this talk page archived anywhere? I checked the log (silly me, I know). Reading the old discussion, I'm still baffled about the rationale here. It can be used for URL redirection. So can literally any other domain (top-level or otherwise). That's not an argument to ban any and all uses of it. If there's evidence that this domain is unmanageable and won't result in an excessive number of false positives, I don't have an issue with including such a broad regex. But I'd like there to be some specific data to point to, not just "can be used for URL redirection," which I consider a non-argument. --MZMcBride 16:42, 13 November 2009 (UTC)
- Not "can" -- "is" (well, "was" until you removed it :D). You can see User:COIBot/XWiki/co.cc for a small taste (too many results to generate the large taste) - or the original request. Anecdotally, yes, we know it was abused cross-wiki; that's why I added it when JzG brought the request here - if not, it would have been "add to XLinkBot for enwiki, and we'll attempt to monitor on other wikis with COIBot". — Mike.lifeguard | @en.wb 16:50, 13 November 2009 (UTC)
(unindent) A question about User:COIBot/LinkReports/co.cc. How is the false positive ratio determined? It looks like the bot finds all instances of the domain (or part of a domain string) being added to a page, but are there any numbers regarding how many of these additions were legitimate? (There are legitimate uses of this ccTLD, right?) --MZMcBride 09:57, 15 November 2009 (UTC)
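To make the scope of the proposed entry concrete, here is a minimal Python sketch that applies the fragment \.co\.cc\b directly to a few sample URLs. The helper name `is_blocked` and the sample URLs are illustrative assumptions, and the extension itself may wrap blacklist entries in a larger pattern, so this shows only what the fragment matches on its own.

```python
import re

# The fragment proposed in the discussion above, applied on its own.
# Assumption: case-insensitive matching; the extension's real wrapping
# pattern is not reproduced here.
PROPOSED = re.compile(r"\.co\.cc\b", re.IGNORECASE)


def is_blocked(url: str) -> bool:
    """Return True if the proposed fragment occurs anywhere in the URL."""
    return PROPOSED.search(url) is not None


# Hypothetical sample URLs, chosen to show the edge cases.
samples = [
    "http://example.co.cc/page",  # any subdomain of co.cc
    "http://www.co.cc/",          # "www" host: there is a preceding dot
    "http://co.cc/",              # bare domain: no preceding dot
    "http://example.co.ccc/",     # longer label: the \b boundary fails
]

for url in samples:
    print(url, is_blocked(url))
```

With the preceding dot, every host under co.cc is caught while the bare domain is not, which is why the entry behaves like a block on the whole namespace rather than on a single site.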
Proposed additions (Bot reported)
| This section is for domains which have been added to multiple wikis as observed by a bot.
These are automated reports; please check the records and the link thoroughly, as the bot may report good links. For more information, see Spam blacklist/Help#COIBot_reports. Reports are automatically archived by the bot when they go stale (fewer than 5 links reported, not edited in the last 7 days, and the last editor is COIBot).
|
COIBot
The LinkWatchers report domains meeting the following criteria:
- When a user mainly adds this link, and the link has not been used too much, and this user adds the link to more than 2 wikis
- When a user mainly adds links on one server, and links on the server have not been used too much, and this user adds the links to more than 2 wikis
- If ALL links are added by IPs, and the link is added to more than 1 wiki
- If a small range of IPs have a preference for this link (but it may also have been added by other users), and the link is added to more than 1 wiki.
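As a rough illustration of how such criteria could be evaluated, the Python sketch below covers only the first and third rules. It is not COIBot's code: the `Addition` record, the `should_report` function, and the thresholds `MAINLY` and `MAX_TOTAL_USES` are all hypothetical, since the criteria above do not define "mainly" or "not used too much" numerically.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Addition:
    user: str     # account name or IP address
    is_ip: bool   # True if the edit was made without an account
    wiki: str     # e.g. "enwiki"
    domain: str   # domain of the added link


# Hypothetical thresholds, chosen only for illustration.
MAINLY = 0.5          # share of a user's additions that point at the domain
MAX_TOTAL_USES = 50   # above this, the link counts as "used too much"


def should_report(domain: str, additions: list[Addition]) -> bool:
    """Apply simplified versions of the first and third criteria."""
    hits = [a for a in additions if a.domain == domain]
    if not hits:
        return False

    # Third criterion: all additions by IPs, on more than 1 wiki.
    if all(a.is_ip for a in hits) and len({a.wiki for a in hits}) > 1:
        return True

    # First criterion: link not used too much, and some user who mainly
    # adds this link has added it to more than 2 wikis.
    if len(hits) > MAX_TOTAL_USES:
        return False
    for user in {a.user for a in hits}:
        by_user = [a for a in additions if a.user == user]
        own = [a for a in by_user if a.domain == domain]
        share = len(own) / len(by_user)
        if share > MAINLY and len({a.wiki for a in own}) > 2:
            return True
    return False
```

The "more than 2 wikis" and "more than 1 wiki" cut-offs come from the list above; everything else is an assumption made to keep the sketch runnable.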
COIBot's currently open XWiki reports
Proposed removals
| This section is for proposing that a website be unlisted; please add new entries at the bottom of the section.
Remember to provide the specific domain blacklisted, links to the articles they are used in or useful to, and arguments in favour of unlisting. Completed requests will be marked as {{removed}} or {{declined}} and archived. See also /recurring requests for repeatedly proposed (and refused) removals. The addition or removal of a domain from the blacklist is not a vote; please do not bold the first words in statements. |
france-voyage.com
france-voyage.com
Following a request to the French admin team (http://fr.wikipedia.org/wiki/Wikip%C3%A9dia:RA), I forward this request to the Meta admin team as advised. This site contains a good number of articles and I wanted to reference one of them for a contribution on Bouliac (http://fr.wikipedia.org/wiki/Bouliac) Marcipo 14:34, 19 November 2009 (UTC)
- This looks like genuine cross-wiki spam to me. Suggest local whitelisting at fr: if the site is considered relevant there, but I would advise against removal from the global blacklist. Finn Rindahl 00:18, 20 November 2009 (UTC)
cigarinspector.com
cigarinspector.com
Requesting removal of cigarinspector.com so it can be used in articles like http://en.wikipedia.org/wiki/Punch_%28cigar_brand%29, as it is the leading English-language Cuban cigar reviews blog —The preceding unsigned comment was added by 81.185.118.211 (talk • contribs) (diff) —Dferg (disputatio) 13:18, 17 November 2009 (UTC).
- The domain is not blacklisted on meta. It is blacklisted on the English-language Wikipedia; you need to ask there for removal.
Deferred to en:MediaWiki talk:Spam-blacklist. Thank you, —Dferg (disputatio) 13:20, 17 November 2009 (UTC)
fanhistory.com
fanhistory.com
The owner of the site is looking to make a proposal to be hosted by Wikimedia. She has also given her personal guarantee that past linkspamming behavior will not recur. bastique demandez! 22:54, 19 November 2009 (UTC)
- This would also allow the Fan History blog to be aggregated in the wiki open planet blog aggregator, which would be good because a lot of the content is wiki-related. ~~ MarkDilley
Troubleshooting and problems
Discussion
| This section is for discussion of Spam blacklist issues among other users. |
wmf4.me / enwn.net
Ran into Mike_lifeguard in #wikimedia-strategy; he asked me to send an email to info-en-l explaining what wmf4.me is and why it shouldn't be blacklisted. They then told me to post it here. Hopefully the rat has found its cheese...
So first, the domain: http://wmf4.me/ ( http://enwn.net/ is an alias and the original name; we are in the process of changing all the titles).
Short and sweet version: it is a URL shortening service (w:URL Shortening http://wmf4.me/42EEb ). Anyone can create a shortened link, but it is somewhat user-unfriendly at the moment. There are only 3 methods: #1 is the automated RSS->Twitter process, #2 is the bookmark ( http://wmf4.me/bookmark.php ), and #3 is a gadget we made for English Wikinews ( mentioned here: http://enwn.net/5e231 ). Method #4, the venerable "web form", is in development.
Most importantly, it is designed for Foundation sites only. It is set up with a whitelist of domains that are allowed to be shortened (e.g. Wikipedia.org, Wikinews.org, Mediawiki.org, etc). The shortener throws an error on anything outside the Foundation URLs. --ShakataGaNai ^_^ 18:36, 20 November 2009 (UTC)
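For readers unfamiliar with this kind of restriction, a domain whitelist check of the sort described might look like the Python sketch below. This is not the wmf4.me code: the function name `is_shortenable` and the exact matching rule (the host equals a whitelisted domain or is a subdomain of one) are assumptions, and the set lists only the three domains named above.

```python
from urllib.parse import urlsplit

# Only the domains named in the post above; the real whitelist is longer.
ALLOWED_DOMAINS = {"wikipedia.org", "wikinews.org", "mediawiki.org"}


def is_shortenable(url: str) -> bool:
    """Return True if the URL's host is a whitelisted domain or a subdomain of one."""
    host = (urlsplit(url).hostname or "").lower()
    return any(
        host == domain or host.endswith("." + domain)
        for domain in ALLOWED_DOMAINS
    )


print(is_shortenable("http://en.wikinews.org/wiki/Main_Page"))
print(is_shortenable("http://example.com/"))
```

Matching on the parsed hostname rather than on the raw string matters here: a plain substring test would accept hosts such as evilwikipedia.org or wikipedia.org.example.net, which would turn the shortener into a general-purpose redirector.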