Talk:Spam blacklist

From Meta, a Wikimedia project coordination wiki
This is an archived version of this page, as edited by A. B. (talk | contribs) at 12:45, 31 March 2008 (→‎www.beginner-sql-tutorial.com/sql.htm: more blacklisting). It may differ significantly from the current version.

Shortcut:
WM:SPAM
The associated page is used by the Mediawiki Spam Blacklist extension, and lists strings of text that may not be used in URLs in any page in Wikimedia Foundation projects (as well as many external wikis). Any meta administrator can edit the spam blacklist. There is also a more aggressive way to block spamming through direct use of $wgSpamRegex. Only developers can make changes to $wgSpamRegex, and its use is to be avoided whenever possible.
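As a rough illustration of how a list of blacklisted strings can be checked against URLs in page text, here is a minimal Python sketch. It assumes each entry is a regular-expression fragment and that all fragments are joined into one pattern; the sample entries, the helper names `compile_blacklist` and `find_blocked_urls`, and the exact pattern shape are hypothetical and are not taken from the extension's source.

```python
import re

# Hypothetical entries, one regex fragment per line; '#' starts a comment.
BLACKLIST_TEXT = r"""
# sample entries for illustration only
\bexample-spam\.com\b
cheap-pills\.example\.net
"""


def compile_blacklist(text):
    """Join all non-empty, non-comment lines into a single compiled pattern."""
    fragments = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if line:
            fragments.append(line)
    if not fragments:
        return None
    # Assumed shape: scheme, optional subdomain characters, then any fragment.
    pattern = r"https?://[a-z0-9_\-.]*(?:" + "|".join(fragments) + ")"
    return re.compile(pattern, re.IGNORECASE)


def find_blocked_urls(wikitext, blacklist):
    """Return the matched URL prefixes found in the text, if any."""
    if blacklist is None:
        return []
    return [m.group(0) for m in blacklist.finditer(wikitext)]
```

With these sample entries, a page containing a link to www.example-spam.com would be flagged, while a link to example.org would pass.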

For more information on what the spam blacklist is for, and the processes used here, please see Spam blacklist/About.

Please post comments to the appropriate section below: Proposed additions, Proposed removals, or Troubleshooting and problems. Read the messageboxes at the top of each section for an explanation. Also, please check back some time after submitting, as there could be questions regarding your request. Per-project whitelists are discussed at MediaWiki talk:Spam-whitelist. Please sign your posts with ~~~~ after your comment. For other discussions related to this list that are not about a problem with a particular link, please see Spam blacklist policy discussion.

Completed requests are archived (list, search); additions and removals are logged.

snippet for logging: {{/request|938895#section_name}}

If you cannot find your remark below, please do a search for the url (link) in question with this Archive Search tool.

Spam that is only affecting a single project should go to that project's local blacklist, if available: ENWP

Proposed additions

This section is for proposing that a website be blacklisted; add new entries at the bottom of the section, using the basic URL so that there is no link (example.com, not http://www.example.com). Provide links demonstrating widespread spamming by multiple users. Completed requests will be marked as done or denied and archived.
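For anyone preparing a request, the "basic URL" form can be derived mechanically. The following is a small sketch using Python's standard library; the function name `bare_domain` is hypothetical, and stripping only a leading "www." is an assumption about what counts as the basic form.

```python
from urllib.parse import urlsplit


def bare_domain(url):
    """Return the host without scheme or a leading 'www.', for use as a request heading."""
    if "://" not in url:
        url = "http://" + url  # add a scheme so urlsplit can locate the host
    host = urlsplit(url).hostname or ""
    if host.startswith("www."):
        host = host[4:]
    return host
```

For example, `bare_domain("http://www.example.com")` gives `example.com`, which can be posted here without creating a link.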

beyond-the-pale.co.uk

  1. (it_wikipedia) 2008-03-21 03:20:15 -- http://www.beyond-the-pale.co.uk/albanian7.htm -- 217.24.248.34 -- diff
  2. (fr_wikipedia) 2008-03-21 04:03:12 -- http://www.beyond-the-pale.co.uk/albanian7.htm -- 217.24.248.34 -- diff
  3. (en_wikipedia) 2008-03-21 05:55:46 -- http://www.beyond-the-pale.co.uk/albanian7.htm -- 217.24.248.34 -- diff

Another case, not so bad, but the IP is only adding links. —— Eagle101 Need help? 11:23, 21 March 2008 (UTC)Reply

It's on a few wikis in small numbers. I'll get lwcoibot to generate a report :) —— Eagle101 Need help? 11:28, 21 March 2008 (UTC)


Full report at User:COIBot/LinkReports/beyond-the-pale.co.uk —— Eagle101 Need help? 14:18, 21 March 2008 (UTC)Reply

Unsure - the IP contributes constructively (contribs) (well at least some stuff that is not spam!) and it hardly seems a real source of disruption at present. The pages spammed are very specific so it is as much POV as anything else? --Herby talk thyme 14:41, 21 March 2008 (UTC)Reply

Sure, however please do realize that as of 6 months ago it was starting to be a common tactic, especially the die hard ones, for spammers to do some minor edits that don't add links. I've seen this tactic used several times on the English wikipedia especially. We can delay blacklisting if you wish. —— Eagle101 Need help? 16:38, 21 March 2008 (UTC)Reply

margencero.com

  1. (es_wikipedia) 2008-03-20 23:01:46 -- http://www.margencero.com/articulos/new/modernidad_liquida.html -- 190.164.252.211 -- diff
  2. (en_wikipedia) 2008-03-20 23:05:51 -- http://www.margencero.com/articulos/new/modernidad_liquida.html -- 190.164.252.211 -- diff
  3. (pt_wikipedia) 2008-03-20 23:11:11 -- http://www.margencero.com/articulos/new/modernidad_liquida.html -- 190.164.252.211 -- diff

Another case. Your thoughts? —— Eagle101 Need help? 11:24, 21 March 2008 (UTC)Reply

Generating an lwciobot report. —— Eagle101 Need help? 11:30, 21 March 2008 (UTC)Reply


Not disputing this but the en wp link placement is still actually there & does not appear to be "unwanted"? --Herby talk thyme 15:15, 21 March 2008 (UTC)Reply
Interesting, those were added a long time ago (older than 6 months), as they are not in the reports. The cases I've looked at using the search tool indicate that on wikis with the same article, the links were added by IPs or new users at the same time. That indicates to me that it is potentially being spammed; however, I do see a few legit uses. As such I'll hold off on this request for a bit and see if the problems continue or not. If it continues to be a problem, we can always blacklist, then whitelist in locations where they want the links, but as I said, let's wait :) —— Eagle101 Need help? 16:22, 21 March 2008 (UTC)

crusades1444.hit.bg

  1. (bg_wikipedia) 2008-03-21 06:21:08 -- http://crusades1444.hit.bg -- 79.100.51.232 -- diff
  2. (pl_wikipedia) 2008-03-21 06:27:02 -- http://crusades1444.hit.bg -- 79.100.51.232 -- diff
  3. (en_wikipedia) 2008-03-21 06:28:27 -- http://crusades1444.hit.bg -- 79.100.51.232 -- diff
  4. (en_wikipedia) 2008-03-21 06:30:46 -- http://crusades1444.hit.bg -- 79.100.51.232 -- diff

Some more. I'd do this one myself but I would like some extra review. This is not on any major wikipedias. —— Eagle101 Need help? 11:32, 21 March 2008 (UTC)Reply



Unless luxo is well borked the above appear to be the sole contributions of the IP ([1])? --Herby talk thyme 14:45, 21 March 2008 (UTC)Reply
Correct, and as we can't block foundation-wide, I used to blacklist if spammers were persistent and it was not a joe job in any form that I could detect. —— Eagle101 Need help? 16:25, 21 March 2008 (UTC)
I have a problem here - four edits are adequate "disruption" for you to even consider this? Googling it gets a mere 21 hits. Where is the evidence that this would even be of interest to a local blacklist, never mind one affecting as many wikis as a listing here would - I would certainly reject listing out of hand on the en wp list. I have warned the IP on en wp with an "im" warning for what it is worth. There are still no further edits showing up on luxo for the IP. --Herby talk thyme 16:50, 21 March 2008 (UTC)
Generally, if it continues after we place it here I would blacklist it. If there are no further edits it is pointless to blacklist; however, we don't have anywhere other than here to track crosswiki spam. But if this were to, say, continue spamming over the next week, we may wish to do something. The problem is that if it continues it is very labour-intensive to stop: there is no global way to block an IP. If there were, I'd recommend doing that should it continue. I will say that for what it has done, blacklisting offhand is a bit heavy. —— Eagle101 Need help? 17:13, 21 March 2008 (UTC)
I should also point out, that googling it won't help much, spammers usually do this to get more hits on search engines. —— Eagle101 Need help? 17:16, 21 March 2008 (UTC)Reply

krahufrengjisht.blogspot.com

  1. (it_wikipedia) 2008-03-21 06:13:49 -- http://krahufrengjisht.blogspot.com -- 217.24.248.34 -- diff
  2. (fr_wikipedia) 2008-03-21 07:15:48 -- http://krahufrengjisht.blogspot.com/2008/03/par-ismail-kadar-un-livre-de-messages.html -- 217.24.248.34 -- diff
  3. (de_wikipedia) 2008-03-21 08:18:46 -- http://krahufrengjisht.blogspot.com -- 217.24.248.34 -- diff

Blogspot spam. I don't know what we do with this, but it's on a bunch of wikis at the moment; the full report will show up at User:COIBot/LinkReports/krahufrengjisht.blogspot.com. —— Eagle101 Need help? 13:03, 21 March 2008 (UTC)



  • I think we should be ready to locally blacklist blogspot spammers without much thought, whether it needs globally blacklisting is not clear to me here. JzG 19:54, 29 March 2008 (UTC)Reply

fracassi.net

  1. (en_wikipedia) 2008-03-21 08:34:48 -- http://www.fracassi.net/iw2evk/ -- 212.177.63.147 -- diff
  2. (it_wikipedia) 2008-03-21 08:36:31 -- http://www.fracassi.net/iw2evk/ -- 212.177.63.147 -- diff
  3. (en_wikipedia) 2008-03-21 08:37:25 -- http://www.fracassi.net/iw2evk/ -- 212.177.63.147 -- diff
  4. (en_wikipedia) 2008-03-21 08:39:32 -- http://www.fracassi.net/iw2evk/ -- 212.177.63.147 -- diff
  5. (sv_wikipedia) 2008-03-21 08:41:18 -- http://www.fracassi.net/iw2evk/ -- 212.177.63.147 -- diff

More crosswiki spam. Full report at User:COIBot/LinkReports/www.fracassi.net —— Eagle101 Need help? 13:54, 21 March 2008 (UTC)Reply



Checking the links here I would not call it significant - 19 links over 7 of the 20 wikis. Not inclined to list. --Herby talk thyme 14:00, 21 March 2008 (UTC)
Alright, I'm waiting on the report by COIBot, to see if there is any more. The tools I'm using only go back about 12 hours in history. (It just started that far back). Depending on if I continue getting spam from the /iw2evk/ portion of the site, I'll blacklist that. —— Eagle101 Need help? 14:05, 21 March 2008 (UTC)Reply
Herby, I also wish to make a point, if someone reverts all the links, doing a linksearch as you are doing won't turn up much. The tool only finds links that are actually on the current version of the pages. (I was one of the authors of that tool :P) Just so ya know. :) —— Eagle101 Need help? 14:13, 21 March 2008 (UTC)Reply
I am well aware who you are! The problem is that without say a luxo link for IP or user contribs (one of the most valuable approaches for me) the information to make a sensible decision is a bit limited --Herby talk thyme 14:35, 21 March 2008 (UTC)Reply
That's why I'm waiting on the COIBot report. Beetstra is awesome for writing that bit of genius. By the way, to clarify what I'm saying: Special:linksearch is a really bad way of detecting past spam. It is good for seeing how much a link is used, though. If a link is being used 800 times, it's a bad idea to blacklist unless we remove those links. I usually check the tool to make sure the usage of the link is low; if it's not, I try to undo as many of them as I have time to do, or take the high usage of the link as an indication that blacklisting the link is a bad idea. An example of this is bbc, or nytimes.com, etc. :). Those links are used crosswiki, and are showing up a lot in my experiments at automating the detection of crosswiki spam, but those are not spam. (Most of the time commonly used sites are not that obvious, so hopefully you get my gist...) Also, the reason I'm putting a lot of these up is that I've been out of the action for a long time... and I'd like to see what the current thoughts are in various cases. —— Eagle101 Need help? 14:38, 21 March 2008 (UTC)
Waiting on User:COIBot/LinkReports/fracassi.net to go blue. (The bot needs to generate the report... which takes time and database queries...) —— Eagle101 Need help? 14:47, 21 March 2008 (UTC)
Ok, looking at that report, I personally would suggest blacklisting. I'll leave it up to you, but my rationale is that the spam has been going on for a long period of time. I doubt just reverting it is going to make it stop. We have stuff from back in 2007. I'll leave the choice up to you however ;) —— Eagle101 Need help? 14:50, 21 March 2008 (UTC)
Yes, I was coming here to say this URL should be blacklisted, but saw Eagle has beaten me to it. Mønobi 21:49, 21 March 2008 (UTC)Reply

xtremebubbles.com biographi.ca keithmichaeljohnson.com tangenttoy.com worldslargestbubble.com rinaldo.de

  1. (de_wikipedia) 2008-03-21 13:35:46 -- http://www.xtremebubbles.com/media_press.html -- Claus Ableiter -- diff
  2. (en_wikipedia) 2008-03-21 13:40:47 -- http://www.xtremebubbles.com/media_press.html -- Claus Ableiter -- diff
  3. (es_wikipedia) 2008-03-21 13:50:00 -- http://www.xtremebubbles.com/media_press.html -- Claus Ableiter -- diff
  4. (fr_wikipedia) 2008-03-21 13:52:25 -- http://www.xtremebubbles.com/media_press.html -- Claus Ableiter -- diff
  1. (en_wikipedia) 2008-03-20 19:47:23 -- http://www.biographi.ca/EN/ShowBio.asp?BioId=34223 -- Corvus cornix -- diff
  2. (pl_wikipedia) 2008-03-21 04:59:48 -- http://www.biographi.ca/EN/ShowBio.asp?BioId=34229 -- Mmt -- diff
  3. (pl_wikipedia) 2008-03-21 04:59:48 -- http://www.biographi.ca/EN/ShowBio.asp?BioId=34229 -- Mmt -- diff
  4. (tr_wikipedia) 2008-03-21 07:24:18 -- http://www.biographi.ca/EN/ShowBio.asp?BioId=42027 -- Dsmurat -- diff
  5. (nl_wikipedia) 2008-03-21 07:50:57 -- http://www.biographi.ca/EN/ShowBio.asp?BioId=34124 -- Agora -- diff
  6. (nl_wikipedia) 2008-03-21 07:50:57 -- http://www.biographi.ca/EN/ShowBio.asp?BioId=34621&query=John%20AND%20Rhoades -- Agora -- diff
  7. (nl_wikipedia) 2008-03-21 08:11:46 -- http://www.biographi.ca/EN/ShowBio.asp?BioId=34124 -- Agora -- diff
  1. (de_wikipedia) 2008-03-21 12:49:23 -- http://www.keithmichaeljohnson.com/ -- Claus Ableiter -- diff
  2. (en_wikipedia) 2008-03-21 13:09:55 -- http://www.keithmichaeljohnson.com/ -- Claus Ableiter -- diff
  3. (es_wikipedia) 2008-03-21 13:49:59 -- http://www.keithmichaeljohnson.com/ -- Claus Ableiter -- diff
  4. (fr_wikipedia) 2008-03-21 13:52:25 -- http://www.keithmichaeljohnson.com/ -- Claus Ableiter -- diff
  1. (de_wikipedia) 2008-03-21 12:54:59 -- http://www.tangenttoy.com/bubbleman/ -- Claus Ableiter -- diff
  2. (en_wikipedia) 2008-03-21 13:09:55 -- http://www.tangenttoy.com/bubbleman/ -- Claus Ableiter -- diff
  3. (es_wikipedia) 2008-03-21 13:50:00 -- http://www.tangenttoy.com/bubbleman/ -- Claus Ableiter -- diff
  4. (fr_wikipedia) 2008-03-21 13:52:25 -- http://www.tangenttoy.com/bubbleman/ -- Claus Ableiter -- diff
  1. (de_wikipedia) 2008-03-21 12:41:29 -- http://www.tomnoddy.com/ -- Claus Ableiter -- diff
  2. (en_wikipedia) 2008-03-21 13:09:55 -- http://www.tomnoddy.com/ -- Claus Ableiter -- diff
  3. (es_wikipedia) 2008-03-21 13:49:59 -- http://www.tomnoddy.com/ -- Claus Ableiter -- diff
  4. (fr_wikipedia) 2008-03-21 13:52:25 -- http://www.tomnoddy.com/ -- Claus Ableiter -- diff
  1. (de_wikipedia) 2008-03-21 13:30:55 -- http://worldslargestbubble.com/gwr.html -- Claus Ableiter -- diff
  2. (en_wikipedia) 2008-03-21 13:40:47 -- http://worldslargestbubble.com/gwr.html -- Claus Ableiter -- diff
  3. (es_wikipedia) 2008-03-21 13:50:00 -- http://worldslargestbubble.com/gwr.html -- Claus Ableiter -- diff
  4. (fr_wikipedia) 2008-03-21 13:52:25 -- http://worldslargestbubble.com/gwr.html -- Claus Ableiter -- diff
  1. (de_wikipedia) 2008-03-21 12:58:17 -- http://www.rinaldo.de/seifenblasen-show.php -- Claus Ableiter -- diff
  2. (en_wikipedia) 2008-03-21 13:09:55 -- http://www.rinaldo.de/seifenblasen-show.php -- Claus Ableiter -- diff
  3. (es_wikipedia) 2008-03-21 13:50:00 -- http://www.rinaldo.de/seifenblasen-show.php -- Claus Ableiter -- diff
  4. (fr_wikipedia) 2008-03-21 13:52:25 -- http://www.rinaldo.de/seifenblasen-show.php -- Claus Ableiter -- diff

Listing all these, I'm reverting them now, but we need to keep an eye on this one. All these links were added in one edit, so there are only 4 reverts to be made here. —— Eagle101 Need help? 19:15, 21 March 2008 (UTC)Reply

This may be legit, I'm going to ask the guy. —— Eagle101 Need help? 19:18, 21 March 2008 (UTC)Reply
I was also wrong with the one edit thing, they have been added in piecemeal. :S —— Eagle101 Need help? 19:24, 21 March 2008 (UTC)Reply
Ok, confirmed link additions to 9 other wikis... User:COIBot/LinkReports/rinaldo.de, User:COIBot/LinkReports/xtremebubbles.com, User:COIBot/LinkReports/worldslargestbubble.com, User:COIBot/LinkReports/tomnoddy.com, User:COIBot/LinkReports/tangenttoy.com, User:COIBot/LinkReports/keithmichaeljohnson.com, User:COIBot/LinkReports/bubbleart.com. Opinions on this, please. Is this legit? —— Eagle101 Need help? 22:40, 21 March 2008 (UTC)
Iffy definitely. If it is the one user only then the first thing to do I would think is to have a word with them. Taking a look on de they seem legit? If that does not work or it starts turning up from IPs then it definitely should be looked at again. --Herby talk thyme 08:31, 22 March 2008 (UTC)Reply
I've already asked the user; if there is no response shortly (he has edited enwiki again already, so he saw the new messages banner), I'm going to revert the links and otherwise attempt to clean them up. The sites are English only, so why they are being added to other locations is beyond me. They seem *somewhat* legit, otherwise I would have removed them. —— Eagle101 Need help? 10:07, 22 March 2008 (UTC)
Definitely worth watching. I see a link added to fr wp (tourisme.alsace-bossue.net & just the one). However seems a legit Commons user with a fair few contributions as far as I can see --Herby talk thyme 10:28, 22 March 2008 (UTC)Reply

automarkhistory.com

This is ongoing. The statistics say that one user (user:92.113.25.55) has added this link to 22 wikis (66 link additions, all 66 in database by this user).



(see the (upcoming) COIBot report). --Beetstra 09:52, 26 March 2008 (UTC)Reply

I support the request - see the Dutch talk page: here MoiraMoira 09:56, 26 March 2008 (UTC)Reply

Added, will finish later; thanks both --Herby talk thyme 10:05, 26 March 2008 (UTC)

Dear community, my site automarkhistory.com is an automobile encyclopedia and aims to unite owners of antique cars worldwide. Please unblock it.

Best Regards Dmitry Myasnikov— The preceding unsigned comment was added by Richi (talk)


Dmitry, typically, we do not remove domains from the spam blacklist in response to site-owners' requests. Instead, we de-blacklist sites when trusted, high-volume editors request the use of blacklisted links because of their encyclopaedic value in support of our encyclopaedia pages. If such an editor asks to use your links, I'm sure the request will be carefully considered and your links may well be removed.
This blacklist is used by more than just our 700+ Wikimedia Foundation wikis (Wikipedias, Wiktionaries, etc.). All 3000+ Wikia wikis plus a substantial percentage of the 25,000+ unrelated wikis that run on our MediaWiki software have chosen to incorporate this blacklist in their own spam filtering. Each wiki has a local "whitelist" which overrides the global blacklist for that project only. Some of these non-Wikimedia sites may be interested in your links; by all means feel free to request local whitelisting on those.
Unlike Wikipedia, DMOZ is a web directory specifically designed to categorize and list all Internet sites; if you've not already gotten your sites listed there, I encourage you to do so -- it's a more appropriate venue for your links than our wikis. Their web address: http://www.dmoz.org/. --A. B. (talk) 01:19, 27 March 2008 (UTC)Reply
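The relationship between the global blacklist and a project's local whitelist described above can be sketched in a few lines of Python. This is only an illustration of the override idea: the function name `is_link_allowed`, the sample patterns in the usage note, and the order of evaluation are assumptions, not the extension's actual code.

```python
import re


def is_link_allowed(url, global_blacklist, local_whitelist):
    """Allow a URL if a local whitelist pattern matches, else apply the global blacklist."""
    # Assumption: a local whitelist match overrides any global blacklist match.
    if any(re.search(p, url, re.IGNORECASE) for p in local_whitelist):
        return True
    return not any(re.search(p, url, re.IGNORECASE) for p in global_blacklist)
```

With a hypothetical global entry `example-spam\.com` and a local whitelist entry `example-spam\.com/docs/`, a link under `/docs/` would be accepted on that one project while every other link to the domain stays blocked.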

baccara-forever.de

And another cross-wiki spammer:



is active at the moment. --Beetstra 14:37, 26 March 2008 (UTC)Reply

Added, thanks --Herby talk thyme 14:45, 26 March 2008 (UTC)

www.seguente.com/

And another cross-wiki spammer for a textile company, working via a dynamic Turkish IP on many Wikipedias yesterday and today - see the file here. Is active at the moment. MoiraMoira 08:35, 27 March 2008 (UTC) IP used (may be related to the one below) was 85.100.170.55

Good catch - won't be as active now :) Added & thanks --Herby talk thyme 08:41, 27 March 2008 (UTC)

bursahalter.com

Cross wiki. Was added by one of the IPs that inserted seguente.com links too (See: de.wikipedia contributions of 85.107.139.114.)




IPs inserting the link:

--Jorunn 09:48, 27 March 2008 (UTC)Reply

tr.wikipedia user inserting both bursahalter.com and seguente.com links: tr.wikipedia contributions Volwerine
--Jorunn 10:06, 27 March 2008 (UTC)Reply

Thanks Jorunn - Done --Herby talk thyme 10:35, 27 March 2008 (UTC)Reply

www.provenmodels.com and www.vizads.com

Cross-wiki spam for a model site. See here for the various IP addresses from India from which the URL was added over and over again on various Wikipedias. Kind regards, MoiraMoira 11:33, 27 March 2008 (UTC)





Luxo gives me just two edits? Anything more available? --Herby talk thyme 11:44, 27 March 2008 (UTC)Reply
And looking at this suggests other domains are involved? Any en wp folk reporting? --Herby talk thyme 11:51, 27 March 2008 (UTC)Reply
I just asked our tech wiz RonaldB on Wiki-nl and he reported back that these are all rather dynamic IP-addies from India (so randomly allocated with each login session from a huge pool), which explains why each time a new 117.xxx.xxx.xxx number is found spamming. So a block for the site is needed; otherwise the same spam pops up everywhere over and over again from various IP addresses and we have to semi-protect articles. Kind regards, MoiraMoira 13:50, 27 March 2008 (UTC) (mod on wiki-nl)

Fair enough & Added. I think I was expecting an en wp req with some more domains based on the activity there. Thanks --Herby talk thyme 13:53, 27 March 2008 (UTC)

Added the next one - vizads.com as well now above - also added by some of the dynamic IP-addies MoiraMoira 13:58, 27 March 2008 (UTC)
Ok, Added.
I do "know" who you are & appreciate the help MoiraMoira (check who "welcomed" you :))! Feel free to add ones but can I ask that you do not change the headers for now. Logging is playing up and I am keeping an off line record of the ones I'm doing at present - I would hate to make more than my usual share of mistakes! If I can help I will - regards --Herby talk thyme 14:10, 27 March 2008 (UTC)Reply


Same adsense as vizads.com. Spammed to French Wikipedia: [2] [3]. See w:WT:WPSPAM#vizads.com myclassifiedads.net provenmodels.com. 58.170.172.215 03:11, 28 March 2008 (UTC)Reply

Sorry - missed it earlier - Done --Herby talk thyme 15:18, 28 March 2008 (UTC)Reply

kizilsungurdefender.jimdo.com/



Caught in the act now - cross-wiki spamming right now via a dynamic Turkish IP address. Spam/spy site pretending to be a computer protection program. MoiraMoira 15:20, 27 March 2008 (UTC)

Added by Herby a while ago [4] VasilievVV 17:12, 27 March 2008 (UTC)
Got distracted - apologies & thanks --Herby talk thyme 17:14, 27 March 2008 (UTC)Reply

Proxy lists

  • www.bind.com (Myspace Proxy Server)
  • www.opencity.us (Anonymous proxy For Schools)
  • free-proxy.org.ua (Free Proxy list. Daily Updated. HTTP, Socks)
  • geexzone.free.fr/ (Free WebProxy)
  • www.trproxy.net (Proxy)

I found these when clearing out a list of bot-reported spam links. I see a lot of lists like this on proxy articles, all of which are SPAMHOLE candidates. Should we consider blocking sites that are just lists of proxies? JzG 19:24, 27 March 2008 (UTC)Reply

Was that crosswiki spam? — VasilievVV 04:57, 28 March 2008 (UTC)Reply
Linked on several wikis, yes, but I deleted them. Thing is, proxy lists have no obvious valid use on Wikipedias. JzG 23:01, 30 March 2008 (UTC)Reply

www.josefov.com/ and pevnostjosefov.wz.cz/

Commercial and sponsored tourist sites placed by one IP address from Jaromer CZ, 84.244.94.234, currently on 6 Wikipedia versions, repeatedly on 25, 27 and 28 March. Now active again.





Kind regards, MoiraMoira 08:12, 28 March 2008 (UTC)Reply

Thanks & Added --Herby talk thyme 08:56, 28 March 2008 (UTC)

www.worldmapfinder.com.

Just wander over to User:SpamReportBot/cw/www.worldmapfinder.com. The list is huge, over 150 links added. Any idea of what is going on? Seems like a bunch of IPs and a guy named "WorldMapFinder" on ko wikipedia have added a bunch of links... Is this legit? My suspicion is it's not, but I could be missing something :S —— Eagle101 Need help? 21:52, 28 March 2008 (UTC)

Looks suss to me at a quick glance but not likely to be on enough today to deal with anything. --Herby talk thyme 08:11, 29 March 2008 (UTC)Reply
It is also in the automatically reported links. Please help with those, as I work on making that report more accurate. Blacklist the good stuff, make a note of the clearly bad stuff. I'm still working on methods to make this accurate. When I get a finalized algorithm I'll post that in the header. This bot replaces COIBot for this function. —— Eagle101 Need help? 21:59, 29 March 2008 (UTC)Reply
Done --A. B. (talk) 03:29, 30 March 2008 (UTC)Reply

pacino.narod.ru/

Fansite cross-wiki linkspam (Russian site, via Russian IP address 217.170.91.125) on many Wikipedias, started Jan 2008, repeated in Feb 2008.



Thanx, Kind regards, MoiraMoira 17:35, 30 March 2008 (UTC)Reply

Agreed - quite a set of "contributions". Added & thanks --Herby talk thyme 17:44, 30 March 2008 (UTC)

www.ip-adress.com

Continued linkspamming from various IP addresses (see here) for years, wiki-wide, for a commercial site.



Kind regards, MoiraMoira 17:48, 30 March 2008 (UTC)Reply

Not done I'm sorry, at this time I'm unable to see any additions to other foundation projects. If this link addition is a local problem (to nl wikipedia), then please add this link to your local blacklist. nl:Mediawiki:Spam-blacklist would be where to add the link. However you will need to find an administrator to add the link. —— nixeagle 23:34, 30 March 2008 (UTC)Reply
This link has been spammed cross-wiki; here is just a sample based on the page of IP addresses cited above:


Here are some related domains; I found no links to any of them on any projects:

Google Adsense ID: 1452210452390883
I'm not sure this one is worth blacklisting given the number of innocently placed links.
--A. B. (talk) 02:15, 31 March 2008 (UTC)Reply

Proposed additions (Bot reported)

This section is for websites which have been added to multiple wikis as observed by a bot. It transcludes User:SpamReportBot/cw.
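To give a feel for what "added to multiple wikis" can mean in practice, here is a minimal Python sketch that parses report lines in the format used in the manual requests above and lists domain/user pairs seen on several wikis. It is not the bot's algorithm, which is not described on this page; the function names and the `min_wikis=3` threshold are hypothetical.

```python
import re
from collections import defaultdict
from urllib.parse import urlsplit

# Matches lines such as:
#   1. (it_wikipedia) 2008-03-21 03:20:15 -- http://host/path -- user -- diff
LINE_RE = re.compile(
    r"\((?P<wiki>\w+)\)\s+(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})"
    r"\s+--\s+(?P<url>\S+)\s+--\s+(?P<user>.+?)\s+--\s+diff"
)


def wikis_per_domain_and_user(lines):
    """Map (domain, user) to the set of wikis where that user added the domain."""
    seen = defaultdict(set)
    for line in lines:
        m = LINE_RE.search(line)
        if not m:
            continue  # skip anything that is not a report line
        host = urlsplit(m.group("url")).hostname or ""
        if host.startswith("www."):
            host = host[4:]
        seen[(host, m.group("user"))].add(m.group("wiki"))
    return seen


def crosswiki_candidates(lines, min_wikis=3):
    """Return (domain, user, wiki_count) for pairs seen on at least min_wikis wikis."""
    return sorted(
        (host, user, len(wikis))
        for (host, user), wikis in wikis_per_domain_and_user(lines).items()
        if len(wikis) >= min_wikis
    )
```

A count like this only flags candidates for review; as the discussions above show, widely used legitimate sites also appear on many wikis.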

Items here will automatically be archived by the bot when they get stale.

Sysops, please note if things are 'done', 'added' or the opposite in the various sections below. The bot will soon base removal of items on those tags. If it is the opposite, please state why so I can tweak the algorithm. There will be an 'ignore' list for items that crop up repeatedly, but hopefully it will not be used, as ideally we should have 0 false hits.

These are automated reports; please check the records and the link thoroughly, as they may be good links!

Please place suggestions on the automated reports in the discussion section.

List
User:SpamReportBot/cw/nakedafrica.net
User:SpamReportBot/cw/hnl-statistika.com
User:SpamReportBot/cw/therasmus-hellofasite.it
User:SpamReportBot/cw/prolococusanese.interfree.it
User:SpamReportBot/cw/rprece.interfree.it

Proposed removals

This section is for proposing that a website be unlisted; please add new entries at the bottom of the section. Remember to provide the specific URL blacklisted, links to the articles they are used in or useful to, and arguments in favour of unlisting. Completed requests will be marked as done or denied and archived. See also /recurring requests for repeatedly proposed (and refused) removals. The addition or removal of a link is not a vote, please do not bold the first words in statements.

islamhouse.com

I'm not familiar with the rest of their site, but they have a translation of the Qu'ran using the N'Ko alphabet that would be a useful addition to the External Links section of the article on Wikipedia:N'Ko. Can it be removed? --Wikipedia:User:SteveFoerster 72.83.183.30 15:48, 24 March 2008 (UTC)Reply

There was quite a bit of link placement from here (see here). Equally I guess going to en wp whitelist (w:MediaWiki talk:Spam-whitelist) is maybe the best thing to do. The question I would ask would be "are there reliable alternatives?" for what it's worth. Thanks --Herby talk thyme 14:06, 25 March 2008 (UTC)Reply

lyrikline.org

I found this site blocked at the German wiki (where, as a project, it has its own entry!). It won the Grimme Online Award for best website in the category culture, and is indeed the best source for poetry I know of on the web. It is blocked at meta level, so I have to place my request for unblocking here. I would be grateful if you could unblock this site (I'm not the only user asking why this site is blocked). All the best 85.178.228.96 09:31, 27 March 2008 (UTC)

Cross wiki link placement in January (here) led to this. You could seek whitelisting on de wiki I guess --Herby talk thyme 09:36, 27 March 2008 (UTC)Reply

idiotikon.ch

This site is blocked at the German wiki's articles "Schweizerdeutsch" and "Wörterbuch der schweizerdeutschen Sprache", at the English wiki's article "Swiss German", and at other wikis. But the site is absolutely legitimate and should not be blacklisted; it is the new site of the Swiss National Dictionary and replaces the old one, namely sagw.ch/dt/Kommissionen/woerterbuch/. So please remove it from the blacklist and unblock it. Thank you. 81.221.153.214 18:15, 29 March 2008 (UTC)

I'll remove it as soon as I get some reliable input from some German-speaking users — VasilievVV 12:28, 31 March 2008 (UTC)Reply

www.beginner-sql-tutorial.com/sql.htm

I found this site was blocked because it earlier had some problems. Now I don't see any problems with this site. Moreover, this site has some good tutorial material on SQL programming, such as SQL integrity constraints, SQL joins, SQL subqueries, etc. So I would request that you test the above link again and remove this site from the blacklist. Thank you. --TuiTapak 202.62.80.3 07:18, 31 March 2008 (UTC)

beginner-sql-tutorial.com is a known site that has previously transmitted exploit code [5].
  • Adsense pub-9756582730476191




Accounts

See WikiProject Spam Item

--Hu12 08:57, 31 March 2008 (UTC)Reply


Related domain:


Neither java-certification-tutorial.com nor plsql-tutorial.com are currently blacklisted; I will blacklist them now.
Response to site owner:
  • This blacklist is used by more than just our 700+ Wikimedia Foundation wikis (Wikipedias, Wiktionaries, etc.). All 3000+ Wikia wikis plus a substantial percentage of the 25,000+ unrelated wikis that run on our MediaWiki software have chosen to incorporate this blacklist in their own spam filtering. Each wiki has a local "whitelist" which overrides the global blacklist for that project only. Some of these non-Wikimedia sites may be interested in your links; by all means feel free to request local whitelisting on those. In any event, I do not foresee any chance of your links being accepted for any Wikimedia project ever, given the persistent bad faith with which they have been spammed and the security threats they have posed to Wikimedia's readers.
  • Should you find yourself penalized in any search engine rankings and you believe that to be a result of blacklisting here, you should deal directly with the search engine's staff. We do not have any arrangements with any of the search engine companies; if they're using our blacklist it's purely on their own initiative.
  • I suggest you just stay away from Wikipedia and Wikimedia. Links to any other of your sites are subject to immediate blacklisting here without prior warning.
Whitelisting:  Declined
Additional blacklisting: Done
--A. B. (talk) 12:45, 31 March 2008 (UTC)Reply

Troubleshooting and problems

This section is for comments related to problems with the blacklist (such as incorrect syntax or entries not being blocked), or problems saving a page because of a blacklisted link. This is not the section to request that an entry be unlisted (see Proposed removals above).


Discussion

Help needed

Dear all. Eagle 101 and I have been working on bots in the spam IRC channels (see #wikipedia-spam-t for talking; people there will be able to steer you to the other channels, #wikipedia-en-spam and #cvn-sw-spam). The bots are now capable of real-time cross-wiki spam detection (and soon that will also be reported). It would be nice if some of you would join us there and help us with cleaning etc., as this appears to go faster than we at first expected (and I do get the feeling that en wiki is not a good starting point for finding them!). --Beetstra 21:35, 22 March 2008 (UTC)

Something interesting for ya all to look at. I'm going to work on making each link go to subpages, and have them updated in a way that we can comment on the subpages as well, and bring the ones that need blacklisting to the meta blacklist. I can't have the bot automatically post here, as we would flood this list out, so we will have to look at them all and then link to them. Hopefully we can get all the reports in one place, the COIBot reports etc. Folks, more or less simple crosswiki spam is easily detectable. :) —— Eagle101 Need help? 22:55, 22 March 2008 (UTC)
Bah, you probably want to see the subpage at User:SpamReportBot/test ;) —— Eagle101 Need help? 23:00, 22 March 2008 (UTC)

Addition to the COIBot reports

The lower list in the COIBot reports now has four numbers in brackets after each link (e.g. "www.example.com (0, 0, 0, 0)"):

  1. first number: how many links this user added (the same after each link)
  2. second number: how many times this link was added to Wikipedia (as far as the linkwatcher database goes back)
  3. third number: how many times this user added this link
  4. fourth number: to how many different Wikipedias this user added this link.

If the third or the fourth number is high with respect to the first or the second, then the user has at least a preference for using that link. Be careful with other statistics from these numbers (e.g. good users do add a lot of links). If there are more statistics that would be useful, please notify me, and I will see whether I can get the info out of the database and report it. The bots are running on a new database; Eagle 101 is working on transferring the old data into this database so it becomes more reliable.

For those with access to IRC, this data is available there in real time. --Beetstra 10:40, 26 March 2008 (UTC)
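To make the reading of the four numbers concrete, here is a minimal sketch in Python. It is not COIBot's code: the names `LinkStats`, `parse_entry` and `preference` are made up for illustration, and the two ratios are just one plausible way to express "the third number is high with respect to the first or the second".

```python
import re
from typing import NamedTuple


class LinkStats(NamedTuple):
    """One report entry; field names are hypothetical, not COIBot's own."""
    domain: str
    user_total: int   # first number: links added by this user
    link_total: int   # second number: times this link was added by anyone
    user_link: int    # third number: times this user added this link
    wikis: int        # fourth number: wikis where this user added this link


# Matches the format quoted above, e.g. "www.example.com (0, 0, 0, 0)".
ENTRY = re.compile(r"^(\S+)\s+\((\d+),\s*(\d+),\s*(\d+),\s*(\d+)\)$")


def parse_entry(text: str) -> LinkStats:
    """Parse a single 'domain (a, b, c, d)' entry."""
    match = ENTRY.match(text.strip())
    if match is None:
        raise ValueError(f"not a report entry: {text!r}")
    domain, *numbers = match.groups()
    return LinkStats(domain, *map(int, numbers))


def preference(stats: LinkStats) -> tuple[float, float]:
    """Return (share of the user's additions that are this link,
    share of this link's additions made by this user)."""
    share_of_user = stats.user_link / stats.user_total if stats.user_total else 0.0
    share_of_link = stats.user_link / stats.link_total if stats.link_total else 0.0
    return share_of_user, share_of_link


stats = parse_entry("www.example.com (12, 9, 9, 4)")
print(stats.wikis, preference(stats))
```

Both ratios close to 1 together with a high fourth number would fit the "preference for that link" pattern described above; a prolific good-faith editor would typically show a low first ratio.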

Log weirdness

I guess it may be a caching issue but for me the log appears to end at July 2007? Editing gave me March 2008 but it ain't there now for me? --Herby talk thyme 12:16, 26 March 2008 (UTC)

I've rv'd myself for now but something is going wrong??? --Herby talk thyme 14:21, 26 March 2008 (UTC)
Looks to me like you put the log entry in the right section, I'm re-adding it for ya. Did you purge? ~Kylu (u|t) 16:28, 26 March 2008 (UTC)
Agreed in a sense but just purged the cache & it cuts off at July 2007 for me (I even tried making it #March 2008 and got de nada). Is it just me - it has been "one of those" days :) --Herby talk thyme 17:02, 26 March 2008 (UTC)
I don't see past July 2007 either :\ Mønobi 17:11, 26 March 2008 (UTC)
https://wikitech.leuksman.com/view/Server_admin_log#March_26 - issues with the rendering cluster again (which would keep &action=purge from working) ~Kylu (u|t) 17:40, 26 March 2008 (UTC)
Did the full ff purge & still have the same as Monobi today. I am recording the entries that I cannot log at present but I guess if this is not resolved soon alternatives of some sort may be needed. If anyone else finds (or does not find) the same it would be good to hear. Thanks --Herby talk thyme 08:46, 27 March 2008 (UTC)
Leave me the log entries you want added on my talk, and I'll add them for you if you'd like. I can get around this problem. :) ~Kylu (u|t) 14:12, 27 March 2008 (UTC)
Ok, sorry for archiving this. It looks like we hit some sort of limit. My suggestion is to make a second log page for the time being and start logging from that while the original bug is reported to bugzilla. —— nixeagle 02:54, 30 March 2008 (UTC)

Hopefully sorted for now via Spam blacklist/LogPre2008. Of course this is a wiki so if anyone disagrees....:) Cheers --Herby talk thyme 11:38, 30 March 2008 (UTC)

Crosswiki spam detection

Ok folks, we can more or less detect any crosswiki spam addition. Wander over to User:SpamReportBot/cw. This is a report of all links added by only a few people across more than 3 wikis. Each section here is its own subpage, which means you can transclude them on this page, link to the specific section, etc. You can also comment on the subpages if you have further notes etc., such as "this is not spam because of X". Depending on what we all think of it, I'll transclude User:SpamReportBot/cw on this page. —— Eagle101 Need help? 00:31, 29 March 2008 (UTC)

I'll also note that it automatically removes old items. Items should stay up for 2-3 days before being removed by the bot. (that is if no more links are added). If good links consistently come up, I'll come up with a whitelist mechanism that we can add links to if we deem the additions ok and we don't want to see the additions there. Please suggest improvements on how the bot reports. —— Eagle101 Need help? 01:30, 29 March 2008 (UTC)
I started to blacklist a number of these and then stopped when I noticed the blacklist log is acting seriously weird. --A. B. (talk) 02:21, 30 March 2008 (UTC)
Alright, thanks for your work. I'm going to continue to work on the bot and the algorithm being used, so noting false hits is important. The major one seems to be knowing accounts that edit a lot. I'll work on a fix to that tomorrow, I'm hitting the sack tonight. —— nixeagle 03:06, 30 March 2008 (UTC)
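For readers who want a feel for the rule described in this thread ("links added by only a few people across more than 3 wikis"), here is a rough Python sketch. It is not SpamReportBot's actual algorithm, which is not shown on this page: the function name, the input format and the `max_users=2` threshold for "a few people" are all assumptions.

```python
from collections import defaultdict


def crosswiki_candidates(additions, min_wikis=4, max_users=2):
    """Return domains added on at least `min_wikis` wikis by at most
    `max_users` distinct users.

    `additions` is an iterable of (domain, wiki, user) tuples.
    min_wikis=4 encodes "more than 3 wikis"; max_users=2 is an assumed
    reading of "only a few people".
    """
    wikis = defaultdict(set)
    users = defaultdict(set)
    for domain, wiki, user in additions:
        wikis[domain].add(wiki)
        users[domain].add(user)
    return sorted(
        domain
        for domain in wikis
        if len(wikis[domain]) >= min_wikis and len(users[domain]) <= max_users
    )


# Made-up sample data: one narrow cross-wiki pattern, one broadly used link.
sample = [
    ("spam.example", "enwiki", "U1"),
    ("spam.example", "dewiki", "U1"),
    ("spam.example", "frwiki", "U1"),
    ("spam.example", "eswiki", "U2"),
    ("ok.example", "enwiki", "A"),
    ("ok.example", "dewiki", "B"),
    ("ok.example", "frwiki", "C"),
    ("ok.example", "eswiki", "D"),
]
print(crosswiki_candidates(sample))
```

A rule this simple would also flag established accounts that add the same legitimate link on many wikis, which matches the false hits from "accounts that edit a lot" mentioned above.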

XRumer spam

Well, anyone who is involved in crosswiki spam has at some point seen XRumer (is the best!) spam. Now he hotlinks a thumbnail for his program, as seen on [6]. Code he's using:

X-Rumer is the BEST! 
 
<img>http://upload.wikimedia.org/wikipedia/en/thumb/6/6b/XRumer_screenshot.gif/200px-XRumer_screenshot.gif</img> 

So I added the following line: \bupload\.wikimedia\.org\/.*XRumer_screenshot\.gif\b to blacklist all links to possible thumbnail sizes, although I don't know if I did it properly (and the logging system used here confuses me). So, could anyone here review whether I did it properly? es:Drini 19:07, 28 March 2008 (UTC)

That works. I just tried it out. (adding the link that is). —— Eagle101 Need help? 01:25, 29 March 2008 (UTC)
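A quick way to review an entry like this offline is to run the fragment against a few URLs. The sketch below uses Python's `re` module and tests only the fragment on its own; the extension combines entries into a larger pattern, so this is an approximation, not a reproduction of how the blacklist is applied. The full-size and unrelated URLs are made up for illustration.

```python
import re

# The blacklist entry exactly as quoted in the comment above.
ENTRY = r"\bupload\.wikimedia\.org\/.*XRumer_screenshot\.gif\b"
pattern = re.compile(ENTRY)


def is_blocked(url: str) -> bool:
    """True if the fragment matches anywhere in the URL."""
    return pattern.search(url) is not None


# The hotlinked thumbnail from the spam, plus two illustrative URLs.
thumb = ("http://upload.wikimedia.org/wikipedia/en/thumb/6/6b/"
         "XRumer_screenshot.gif/200px-XRumer_screenshot.gif")
full_size = "http://upload.wikimedia.org/wikipedia/en/6/6b/XRumer_screenshot.gif"
unrelated = "http://upload.wikimedia.org/wikipedia/commons/a/a9/Example.jpg"

for url in (thumb, full_size, unrelated):
    print(is_blocked(url), url)
```

Because of the `.*` in the middle, any thumbnail width (100px, 200px, and so on) is covered, and other files on upload.wikimedia.org are left alone.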


SpamReportBot/cw feedback

First item: after a lot of checking, I went through and made comments in each section as to which bot-reported domains needed blacklisting and which looked legit. When I was all done, I saw that none of my edits "stuck" -- it was as if I'd never made them. This must have something to do with the fact that these reports are transcluded. Then I went and blacklisted 13 domains; afterward I saw others had also blacklisted some of the same links, so there was some wasted effort. Conclusion: we very much need a way to mark up these reports so we don't duplicate each other's efforts.

In lieu of marking each report, here's my feedback on some of the domains reported so far:

  • I blacklisted these:
    • tremulous.net.ru
    • logosphera.com
    • vidiac.com
    • yarakweb.com
    • img352.imageshack.us
    • ayvalikda.com
    • sarimsaklida.com
    • worldmapfinder.com
    • cundadan.com
    • bikerosario.com.ar
    • alfpoker.com
    • karvinsko.eu
    • yarak.co.uk
  • Links added to these sites looked legitimate:
    • wikilivres.info
    • unwto.org
    • en.pwa.co.th
    • villatuelda.es
  • Some others still need evaluation

All in all, SpamReportBot/cw looks like a very powerful, useful tool. --A. B. (talk) 03:28, 30 March 2008 (UTC)

OK, I just figured out that if I post my comments in the bot report sections above the line that says "<!-- ENDBOT POST BELOW HERE -->", then they'll show up. I don't know if it's a good idea to do this, however -- will it screw up the bot or the transclusion? --A. B. (talk) 03:36, 30 March 2008 (UTC)
The work of the bot is awesome & deserves both thanks & discussion. There seem to be a few issues that need addressing, such as what to look at, logging etc., & it would be good to see discussion here. I feel that there may be a case for listing all bot generated sites because the behaviour is "spammy". However I also think that, because it is bot generated and there will likely have been no warnings, entries can & should be removed after some sensible interaction has taken place. I am well aware that others here would not share my views so I will substantially reduce my activity on this page (& Meta).
The bot - while excellent - has generated far more work than I have time for and so I will just look at dealing with the requests from the people who make requests here & who I've got to know & trust if I am around. Given the vast number of admins on Meta this should not cause any problems - however Meta seems to attract many people who want to be admins but are not inclined to do any of the work. If I am around I'll help but my time is short & there is much to do on Commons. Thanks --Herby talk thyme 12:25, 30 March 2008 (UTC)
You do raise a valid point, as far as no warnings. Thankfully we just turned a major corner. We now have the ability to detect most spammy behavior. However now that detection and reversion is easier (SUL), we may want to evaluate what we do in response to those that add links many times.
When I first started helping in this effort, we were shooting in the dark. There were no COIBot reports, IRC feeds, crosswiki linksearch tool, or any sort of monitoring of more than one wiki at a time... thus detecting spam across multiple wikis was... pardon my language, damned hard! As such we blacklisted all we could find. This type of spam was and is sneaky, as it bypasses most communities' detection mechanisms. It's only one link to folks on the various wikis, but added together it's across 5 or more!
Now that we have a detection mechanism, one that we can adapt should the behavior of spammers change significantly, we need to ask ourselves: should we blacklist with the same vigor? Should we assume good faith for additions that appear to us to be accidental or well-intentioned? How do we go about warning someone who may never see the warning, or be unable to read the language in which the warning is written? In addition, we must remain ever wary of en:Joe jobs.
These are questions that need to be answered, and Herbythyme is right on the ball hinting at these here and elsewhere. It's perfectly valid to keep our response the same as it always was, but this may not be the best course of action. I don't know for sure what is. Please discuss your thoughts on this below my comment, or in its own section. :) —— nixeagle 18:20, 30 March 2008 (UTC)
Someone will have to remove the blacklisted links from the wikis. Can that be done by a bot, and/or can a bot be set up to give information on the affected wikis about where the blacklisted links are, so the local community can remove the links themselves? Removing spam is a tedious task, and sometimes one feels one is infringing on the local communities as much as any spammer. If possible the local communities should evaluate the blacklisted links themselves, and remove the ones they don't want, and either strip or whitelist the others. I realize that might not be very realistic. For Commons there is the CommonsTicker and CommonsDelinker. Is it possible to handle the blacklisted links in a similar way? --Jorunn 13:49, 30 March 2008 (UTC)
Possibly it could be done by bot... I can work on writing this if it's wanted. SUL will make things much easier. I usually just click the diff links and click undo on each of the ones I blacklist. In other words, I don't blacklist things I'm not willing to undo the link additions to. —— nixeagle 17:54, 30 March 2008 (UTC)
A.B. - As far as your edits not sticking on the transcluded pages... can you show me an example? I can't fix it unless I can see an example of the problem. :S It will be useful down the road to have the blacklisted or not portion in the page itself, so this should work without any problems... —— nixeagle 18:03, 30 March 2008 (UTC)
Replying to myself again: AB - "<!-- ENDBOT POST BELOW HERE -->", posting above that means the bot will overwrite your comments should there be future link additions from that domain.
Also, A.B. and everyone else interested, I just modified the algorithm to remove 2 out of 4 identified false hits. I'll look at the other two, but I'd like to see this run for a day or so and see what crops up. Please do attempt to comment on the actual sub pages. —— nixeagle 19:31, 30 March 2008 (UTC)
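The marker behaviour described above (comments above the "ENDBOT" line get overwritten, comments below it survive) can be pictured with a small Python sketch. This is only an illustration of that description, not the bot's code; `update_report` and its arguments are hypothetical names.

```python
MARKER = "<!-- ENDBOT POST BELOW HERE -->"


def update_report(page_text: str, new_bot_section: str) -> str:
    """Replace everything above MARKER with fresh bot output and keep
    whatever humans wrote below it.

    If the page has no marker yet, the old text is treated as
    bot-owned and the marker is appended after the new section.
    """
    _above, found, below = page_text.partition(MARKER)
    human_part = below if found else ""
    return new_bot_section.rstrip("\n") + "\n" + MARKER + human_part


page = "old bot report\n" + MARKER + "\nLooks legit to me. --Example user\n"
print(update_report(page, "new bot report\n"))
```

Under this model, a comment placed above the marker disappears the next time the domain gets a new link addition, which is why commenting below the marker is the safer choice.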