Talk:Spam blacklist

This is an archived version of this page, as edited by A. B. (talk | contribs) at 19:11, 24 April 2009 (→‎fashionistas.me: heading). It may differ significantly from the current version.

Shortcut:
WM:SPAM
WM:SBL
The associated page is used by the MediaWiki Spam Blacklist extension, and lists strings of text that may not be used in URLs in any page in Wikimedia Foundation projects (as well as many external wikis). Any meta administrator can edit the spam blacklist. There is also a more aggressive way to block spamming through direct use of $wgSpamRegex. Only system administrators can make changes to $wgSpamRegex, and its use is to be avoided whenever possible. For more information on what the spam blacklist is for, and the processes used here, please see Spam blacklist/About.
Proposed additions
Please provide evidence of spamming on several wikis. Spam that only affects a single project should go to that project's local blacklist. Exceptions include malicious domains and URL redirector/shortener services. Please follow this format. Please check back after submitting your report, as there could be questions regarding your request.
Proposed removals
Please check our list of requests which repeatedly get declined. Typically, we do not remove domains from the spam blacklist in response to site-owners' requests. Instead, we de-blacklist sites when trusted, high-volume editors request the use of blacklisted links because of their value in support of our projects. Please consider whether requesting whitelisting on a specific wiki for a specific use is more appropriate - that is very often the case.

Please sign your posts with ~~~~ after your comment. This leaves a signature and timestamp so conversations are easier to follow.

Completed requests are marked as {{added}}/{{removed}} or {{declined}}, and are generally archived (search) quickly. Additions and removals are logged.

Other discussion
Troubleshooting and problems - If there is an error in the blacklist (i.e. a regex error) which is causing problems, please raise the issue here.
Discussion - Meta-discussion concerning the operation of the blacklist and related pages, and communication among the spam blacklist team.
#wikimedia-external-links - Real-time IRC chat for co-ordination of activities related to maintenance of the blacklist.

snippet for logging: {{sbl-log|1470807#{{subst:anchorencode:SectionNameHere}}}}

Proposed additions

This section is for proposing that a website be blacklisted; add new entries at the bottom of the section, using the basic URL so that there is no link (example.com, not http://www.example.com). Provide links demonstrating widespread spamming by multiple users on multiple wikis. Completed requests will be marked as {{added}} or {{declined}} and archived.
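
The "basic URL" convention above (example.com, not http://www.example.com) can be sketched as a small normalisation step. This is only an illustration: the helper name `bare_domain` is hypothetical and is not part of any tool used on this page.

```python
from urllib.parse import urlsplit


def bare_domain(url: str) -> str:
    """Reduce a URL to the bare host name used when reporting a domain.

    Hypothetical helper for illustration only.
    """
    # urlsplit only recognises the host when a scheme or "//" is present,
    # so prefix "//" for inputs such as "www.example.com/page".
    if "//" not in url:
        url = "//" + url
    host = urlsplit(url).hostname or ""
    # Drop a leading "www." so the report names the domain itself.
    if host.startswith("www."):
        host = host[len("www."):]
    return host


print(bare_domain("http://www.example.com"))
print(bare_domain("example.com"))
```

Writing the domain without a scheme also means the wiki software does not turn it into a clickable link.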

catholic-hierarchy.org



A post on my enWP talk page, en:User talk:JzG, raises questions about this domain, which appears to be all over some language projects. I checked it out a bit; it does not look to me to be an appropriate source, as much of what is written has the look of personal essays. I don't know what action we can take here, but the request on my talk page does seem to be made in good faith and to have merit as a call to action of some sort. JzG 14:09, 14 April 2009 (UTC)

I would like to add that the situation is also dramatic on it.wikipedia, where more than 3,000 articles use this site as a source, and most of them use it as their ONLY source. I find it definitely not reliable, for several reasons, not least because it is a privately owned site which doesn't represent any official position or statement (unlike, e.g., the Vatican City's official website). Blackcat 09:24, 15 April 2009 (UTC)
There's also santiebeati.org, another privately owned site (biographies and hagiographies of saints). It's not an official website, but it is nonetheless heavily used as a source. Blackcat 08:53, 16 April 2009 (UTC)
I agree that this is not a reliable source. It does have a disclaimer saying "This web site is not officially sanctioned or approved by any Catholic Church authority. The contents are purely the responsibility of David M. Cheney.", so the site doesn't really pretend to be more than it is. Judging which sources are reliable and which are not should be done by the editors of the article(s) in question, and I don't think we should (or could, given our current guidelines) blacklist such a site here. So I would decline adding this to the blacklist, while still appreciating that the request was made in good faith and that the concerns raised are valid. Finn Rindahl 17:14, 17 April 2009 (UTC)
I agree with Finn: meta can't start making binding decisions on all projects as to what is a reliable source through the blacklist. That is a decision that must be taken on a local level. I don't think it appropriate to add it to the blacklist unless it can be shown that it has been spammed massively across projects and that some projects have just been slow in removing all instances of it. I presume that isn't the case here? WJBscribe (talk) 17:48, 17 April 2009 (UTC)
Looking at a sample article, it:Diocesi di Abaetetuba, it seems that the link to the site was present in the first revision of the page, which was automatically generated by a bot.[1] I suspect that many other articles have the same characteristic. I see no evidence that there is any current spamming occurring, just mass insertion of material from a site which has amassed a lot of good information.--Salix alba 20:08, 17 April 2009 (UTC)
Yes. The problem is that editors who do their best to build reliable and accurate articles also make big efforts to find accurate and reliable sources. Articles based on this site are not reliable themselves, not least because the site is very often the only source on which they are based. In my honest opinion, blacklisting that source would push editors towards looking for a better source (ideally a definitely reliable one, e.g. the Holy See, the Catholic press or similar) or, conversely, towards purging the articles (excuse my English, I am writing fast and have no time to review). Blackcat 22:10, 17 April 2009 (UTC)

 Declined - the blacklist is not for making this sort of content decision. Maybe it's a good link, maybe it's not - but the issue we consider here is spamming.  — Mike.lifeguard | @en.wb 21:55, 23 April 2009 (UTC)Reply

URL redirector



Not quite the usual redirector blacklist request. The domain is a redirector, but is used by a few legitimate sites as a "clean" version of an otherwise messy URL and is also a blog service. That does not mean we should not blacklist it, but there are a lot of links. I've asked the original reporter on enWP to collect some for whitelisting, but this one is already being used to evade blacklists so we probably need to add it sooner rather than later. JzG 20:04, 14 April 2009 (UTC)Reply

Added - blacklisting doesn't affect existing links, so cleanup can happen afterwards.  — Mike.lifeguard | @en.wb 02:53, 15 April 2009 (UTC)

MW.org spam



















 — Mike.lifeguard | @en.wb 02:52, 15 April 2009 (UTC)Reply

Added  — Mike.lifeguard | @en.wb 16:07, 21 April 2009 (UTC)

santiebeati.org



Like catholic-hierarchy.org, this is a domain (written in Italian) that has no sources, is composed mainly of personal essays and is not an official site of any Catholic Church institution. Nonetheless, it is used as a source. Blackcat 15:36, 17 April 2009 (UTC)

Not done - This is a content issue, not a spamming issue.  — Mike.lifeguard | @en.wb 16:15, 21 April 2009 (UTC)Reply

slki.ru



URL shortener. Track13 0_o 14:38, 18 April 2009 (UTC)

Added  — Mike.lifeguard | @en.wb 18:20, 19 April 2009 (UTC)


pamukkale.org.tr









--A. B. (talk) 01:52, 21 April 2009 (UTC)Reply

Added  — Mike.lifeguard | @en.wb 16:18, 21 April 2009 (UTC)


fashionistas.me

English Wiktionary and Wikipedia









--A. B. (talk) 16:53, 24 April 2009 (UTC)Reply

Proposed additions (Bot reported)

This section is for domains which have been added to multiple wikis as observed by a bot.

These are automated reports; please check the records and the link thoroughly, as a report may flag good links! For some more info, see Spam blacklist/Help#COIBot_reports. Reports will automatically be archived by the bot when they get stale (fewer than 5 links reported, which have not been edited in the last 7 days, and where the last editor is COIBot).
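
As a rough illustration of the staleness rule just described, the three conditions can be combined into a single predicate. This is a sketch under stated assumptions, not COIBot's actual code: the `Report` record and its field names are hypothetical, and "not edited in the last 7 days" is read here as "last edit is more than 7 days old".

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Report:
    """Hypothetical summary of one cross-wiki report page."""
    links_reported: int    # number of link additions listed in the report
    last_edited: datetime  # timestamp of the report page's latest edit
    last_editor: str       # user name of whoever made that edit


def is_stale(report: Report, now: datetime) -> bool:
    """Return True when all three archiving conditions hold."""
    return (
        report.links_reported < 5
        and now - report.last_edited > timedelta(days=7)
        and report.last_editor == "COIBot"
    )


now = datetime(2009, 4, 24, 19, 11)
old_report = Report(links_reported=3, last_edited=datetime(2009, 4, 1), last_editor="COIBot")
print(is_stale(old_report, now))
```

Under this reading, a human comment on the report changes the last editor and therefore keeps the report from being archived automatically.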

Sysops
  • If the report contains links to fewer than 5 wikis, then only add it when it is really spam
  • Otherwise just revert the link-additions, and close the report; closed reports will be reopened when spamming continues
  • To close a report, change the LinkStatus template to closed ({{LinkStatus|closed}})
  • Please place any notes in the discussion section below the HTML comment

COIBot

The LinkWatchers report domains meeting the following criteria:

  • When a user mainly adds this link, and the link has not been used too much, and this user adds the link to more than 2 wikis
  • When a user mainly adds links on one server, and links on the server have not been used too much, and this user adds the links to more than 2 wikis
  • If ALL links are added by IPs, and the link is added to more than 1 wiki
  • If a small range of IPs have a preference for this link (but it may also have been added by other users), and the link is added to more than 1 wiki.
COIBot's currently open XWiki reports
List Last update By Site IP R Last user Last link addition User Link User - Link User - Link - Wikis Link - Wikis
vrsystems.ru 2023-06-27 15:51:16 COIBot 195.24.68.17 192.36.57.94
193.46.56.178
194.71.126.227
93.99.104.93
2070-01-01 05:00:00 4 4

Proposed removals

This section is for proposing that a website be unlisted; please add new entries at the bottom of the section.

Remember to provide the specific domain blacklisted, links to the articles they are used in or useful to, and arguments in favour of unlisting. Completed requests will be marked as {{removed}} or {{declined}} and archived.

See also /recurring requests for repeatedly proposed (and refused) removals.

The addition or removal of a domain from the blacklist is not a vote; please do not bold the first words in statements.

aceshowbiz.com



See discussion at this wikipedia page. - Peregrine Fisher 20:12, 8 April 2009 (UTC) Also, could you post a note on my wikipedia talk page when you comment? I don't check meta very often. - Peregrine Fisher 20:15, 8 April 2009 (UTC)

Due to past problems with excessive linking to this domain, I do not believe this request should be fulfilled. However, if you believe that links to this domain will enhance the content, you can request whitelisting of specific URLs for specific uses on w:en:MediaWiki talk:Spam-whitelist.  — Mike.lifeguard | @en.wb 18:35, 13 April 2009 (UTC)Reply
Agreed. We should add this one to /recurring requests as well, I would think. JzG 19:56, 14 April 2009 (UTC)Reply

wikipedia.un.mythe.over-blog.com



This is a critical blog, often excessive if not worse, against Wikipedia, mainly Wikipedia in French. There was a discussion yesterday at our "Village Pump", fr:Wikipédia:Le_Bistro/10_avril_2009#Blocage_automatique_des_pourriels_de_la_liste_noire, where an obvious majority approved the blocking of this site, with a few dissident voices, some from experienced Wikipedians.

This site was blacklisted in 2007 by David Monniaux with the comment "site spams tons of articles, policy and discussion pages with links to this blog" [2]; as far as I remember, this blacklisting was not discussed here, though I discovered there was a discussion in 2007 on our sysop noticeboard: [3] (see section "Pages protégées pour pourriel"), again with a majority strongly in favour of blacklisting and with significant discordant

Personally, I don't think there are really serious spamming problems; rather, the spamming problems in 2007 were limited in time, linked to disruptive editors, and only marginally touched the "Main" namespace. As can be seen, for instance, from the explanation given by Kropotkine_113 on our Village Pump, some other motives seem to have been implicit in this blacklisting ("parce qu'il diffame, insulte et traîne dans la boue des contributeurs, détourne des propos, manipule l'historique de ses commentaires pour faire régner la confusion", i.e. "because it defames, insults and drags contributors through the mud, twists their words, and manipulates the history of its comments to sow confusion").

Note that this site is obviously useless on "Main" namespace but could obviously be used on Talk pages, and is quite often indeed quoted on Talk pages by various Wikipedians who are forced by this blacklisting to give a non-clickable link (a few random examples here or there).

I should add that, again very personally, I think this blacklisting is unproductive: it does not prevent good-faith editors from giving a (non-clickable) link to this site on a Talk page when they find it useful, but it gives a bad image of "censorship". Since this censorship is ineffective, the cool thing to do would be to bring it to an end.

For all these reasons, I would appreciate an independent review of David Monniaux's decision by some Meta admin, ideally somebody reading French but not active on :fr Wikipedia. Touriste 20:45, 11 April 2009 (UTC)

A few additional facts which can help to make your decision :

  • I have no relation to the blog whose delisting I query (except as a reader), and am (hopefully) a "trusted, high-volume editor" on :fr (sysop with several thousand edits) as suggested by the delisting rules ;
  • I don't really assert that removing the blacklisting would be "in support of our projects" but rather that keeping it is counter-productive ;
  • I took more time today to try to find out how often there had been spamming sprees towards wikipedia.un.mythe. I found only one, four months before blacklisting, indeed limited both in size and in time : [4] ;
  • I checked throughout David.Monniaux contributions here and on fr and the state of his Talk pages (here and on :fr) before the date of blacklisting (May 30th 2007) to see if the question had been subject to debate. It does not seem to have been ; I simply noticed that roughly at the same time when he added wikipedia.un.mythe to the blacklist, DM erased a substantial list of external links in the Wikipedia namespace of :fr towards critical sites, including one to wikipedia.un.mythe : [5] ;
My first take (though looking at this only briefly) is that the entry should be moved to frwiki's blacklist as there seems to be no concern about spamming elsewhere. It'd be up to the frwiki community to discuss whether to leave it blacklisted or not.  — Mike.lifeguard | @en.wb 00:35, 13 April 2009 (UTC)Reply
It seems OK to me; a debate in French among editors who know exactly what this blog is would be more reasonable. Local blacklists did not exist when DM blacklisted this blog here, and no difficulty has been noticed there. If I get a green light here, I'll add it to the local :fr blacklist, inform you that it is done, and you'll remove it here; the debate will continue on :fr, so you will not have to bother reading hundreds of pages in French. Touriste 08:34, 13 April 2009 (UTC)
Sounds good - just let us know when it's been added to frwiki & it'll be removed here.  — Mike.lifeguard | @en.wb 18:19, 13 April 2009 (UTC)Reply
That's done [6], you can remove it from this meta-page now. Thanks ! Touriste 18:28, 13 April 2009 (UTC)Reply
Removed  — Mike.lifeguard | @en.wb 18:28, 13 April 2009 (UTC)
Can't say I agree with this. Quoting this in talk pages is about as useful as quoting Wikipedia Watch or trash-sites like that.
Also, from what I saw in the French discussion, almost everyone is against a de-blacklisting of any kind. It's far from "a few dissident voices".
And to finish, Wikipedia isn't the only French project: at the least, the French wikibooks/wikiversity/wiktionary have to be protected too.
DarkoNeko 07:17, 14 April 2009 (UTC)Reply
Questions:
  1. Does this site attack individual editors?
  2. Is this site a forum with postings from a number of different contributors or is it the project of one or two editors?
  3. What is the history of this site regarding breaching Wikipedia editors' privacy through "outing" (revealing editors' off-line, real world identities)? If this is a forum with multiple contributors and a Wikipedia editor is "outed", do the sysops quickly delete the post?
  4. I suggest checking other French projects (wiktionary, etc.) for links to this site. Also, any French-related languages (Occitan, etc.)
--A. B. (talk) 13:04, 14 April 2009 (UTC)Reply
  1. Yes. Not mainly in the posts themselves but in the comments (posted by banned trolls and by the maintainer faking another identity).
  2. It's a blog, but the "comments" system could fit your description of a forum (because of point 1).
  3. I'll have to ask better-informed people about that one (they'll complete below). Also, apart from the maintainer, the only recourse is to write mails to over-blog's abuse address, not too successfully.
Darkoneko 17:02, 14 April 2009 (UTC)Reply
My own answers:
  1. Yes, many times.
  2. It's a blog hosting only attacks against Wikipedia, in the posts and, many times, in the comments, stinking like garbage.
  3. See above.
  4. Asking the other French projects makes no sense. Alithia, "webmaster" of this blog, never speaks about anything other than Wikipedia (always to talk nonsense), and the other French Wikimedia projects never talk about her blog. Also, Occitan (spoken by a very small minority in France) is not a "French-related language" (a huge majority of French speakers do not understand Occitan), no more than, for example, Catalan or Asturian (Romance languages, relatively close to Occitan), spoken in Spain.
I request the quick return of this address \bwikipedia\.un\.mythe\.over-blog\.com to the blacklist. Touriste's request cannot be considered as reflecting the French users' wish. And almost no one, at the moment, has agreed with his proposition, on wp-FR, to remove any prohibition on this site. See: fr:Discussion MediaWiki:Spam-blacklist#Rapatriement en local du blog d'Alithia. Hégésippe | ±Θ± 17:55, 14 April 2009 (UTC)
Contributors who did not (each one with his own commentary) really agree, on wp-FR, with Touriste's proposition: Dionysostom (not sysop), Hégésippe, Ice Scream (Like tears in rain), Ludo29, Chandres (not sysop), Kropotkine 113, Ceedjee (not sysop), Dereckson, Xic667, Gdgourou, Diti, Sardur and Lgd.

Note I've re-added this pending further discussion in light of Darkoneko's statements above. Thanks  — Mike.lifeguard | @en.wb 17:46, 14 April 2009 (UTC)Reply

Though my opinion on the right thing to do has not changed during this debate, for the sake of sanity I withdraw my request. No objection any longer to the permanent listing of this blog here (or more exactly, I keep them for myself). Touriste 18:05, 14 April 2009 (UTC)Reply

  • It sounds as if the reason David added it still applies now - i.e. that without blacklisting there will be abuse which is hard to control. I don't see any good reason for removing it, we have no need of attack blogs as sources. JzG 19:59, 14 April 2009 (UTC)Reply

neosmut.com



This is getting caught unintentionally by smut\.com\b and seems to be a reasonable site. Can we get a \b at the front of smut.com or something? (Originally requested at enwiki.) Stifle 15:33, 17 April 2009 (UTC)Reply

Done — Mike.lifeguard | @en.wb 15:38, 17 April 2009 (UTC)Reply
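
To illustrate the problem reported in this request: an entry without a leading \b matches anywhere inside a longer host name, while a leading \b requires the match to start at a word boundary. The sketch below uses Python's `re` module purely for illustration. The extension itself runs on PHP's regex engine and combines many entries into larger patterns, and the exact change that was made to the entry is not recorded here, so treat the helper `is_blocked` and the "anchored" pattern as assumptions.

```python
import re


def is_blocked(entry: str, url: str) -> bool:
    """Hypothetical check: does a single blacklist entry match this URL?"""
    return re.search(entry, url, re.IGNORECASE) is not None


unanchored = r"smut\.com\b"   # the entry as quoted in the request
anchored = r"\bsmut\.com\b"   # the entry with the suggested leading \b

# "neosmut.com" contains "smut.com", so the unanchored entry catches it.
print(is_blocked(unanchored, "http://neosmut.com/page"))   # True
# With a leading \b, "o" followed by "s" is not a word boundary, so no match.
print(is_blocked(anchored, "http://neosmut.com/page"))     # False
# The intended target still matches, with or without "www.".
print(is_blocked(anchored, "http://www.smut.com/page"))    # True
print(is_blocked(anchored, "http://smut.com/"))            # True
```

The leading \b works because "/" and "." are non-word characters, so a boundary exists right before "smut" in the intended domain but not in the middle of "neosmut".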

www.yoyita.com/gueguense.htm



It's a description of the Nicaraguan play "el Güegüense", which dates from pre-Columbian times, and has a translation of the play from the original Náhuatl to Spanish. It's used in the Spanish Wikipedia in Literatura nicaragüense and el Güegüense. Comu nacho 19:30, 21 April 2009 (UTC)

Given the domain was added after spamming, what is the problem with whitelisting?  — Mike.lifeguard | @en.wb 21:56, 23 April 2009 (UTC)Reply

Troubleshooting and problems

This section is for comments related to problems with the blacklist (such as incorrect syntax or entries not being blocked), or problems saving a page because of a blacklisted link. This is not the section to request that an entry be unlisted (see Proposed removals above).
none currently.

User: namespace abuse

This section is for reporting abuse of userpages for promotional purposes; add new entries at the bottom of the section, using the basic URL so that there is no link (example.com, not http://www.example.com). Abuse across several wikis should be reported here; please provide links to example behaviour. Completed requests will be marked as {{added}} or {{declined}} and archived.

Discussion

Noticeboard/Helpdesk

I remember having a look at the SBL work a year ago or so, but my curiosity was quelled since it appeared to be way too complicated. This may have been discussed before, but I feel we lack some kind of noticeboard/helpdesk for spam-related work. What I'm thinking of is a page where users

  1. involved with removing crosswikispam can request help from each other (Could someone please check User:COIBot/Xwiki/example.net, a local user has complained about removal of links? or How come Erwin's tool doesn't show the last additions here [linky]? etc). When I need help with questions like this I ping someone at #wikimedia-external-links, but not everybody wants to use IRC. In order to recruit (and keep) non-irc spam-fighters this could be of help. I don't think I would actively have joined here had I not found my way (guided by Mike...) to the irc-channel.
  2. not familiar with crosswikispam could ask questions without having to worry too much about templates/format/where actually to post it (example.net has been spammed at nb.wiki and nn.wiki, maybe something you should look into? or Why can't I add example.com/de at fr.wiki when I can add it at de.wiki? etc). Again remembering a year back or so, I came across something that looked like spam, but instead of using half an hour figuring out how/where to report it I sent Herby an e-mail ;) Finn Rindahl 15:41, 2 April 2009 (UTC)
Seems a good idea for getting first aid when necessary, and I support Finnrind's idea. SBL is complicated and it takes more than 2 minutes to understand how it works, and not everyone is always available via IRC. Best regards. —Dferg (talk) 16:22, 2 April 2009 (UTC)
I like the idea, but I don't think we need a new page? How about Talk:Spam blacklist/Help? --Erwin 16:36, 2 April 2009 (UTC)Reply
Sure, that page would do just fine. If no one objects I'll make some minor changes to this and that page to more clearly invite users to ask spam-related questions there. Finn Rindahl 18:34, 5 April 2009 (UTC)
Why are we relegating such queries to a different page than this one? Surely if someone has a domain for consideration or a more general question they can find a somewhat-suitable location here (and if not, we will refactor & move it as needed). I would much rather see such things dealt with here than hidden away on some other page where they won't be noticed.  — Mike.lifeguard | @en.wb 04:16, 12 April 2009 (UTC)Reply
Well, after having had a second look around and noticed that various pages have been suggested for this purpose previously, I didn't follow up my own suggestion here. So here's a new suggestion: is it possible to make the list of COIBot reports hidden by default? That list looks like gibberish (and a lot of it) to any new user, and could scare off any user taking a look at this page. Finn Rindahl 13:48, 14 April 2009 (UTC)
Sure, just put a collapse span around it. Makes sense, actually; most visitors are interested in the 'normal' additions and removals. On the other hand, if they come to report a site they noticed and see that COIBot also caught it, it could take work away from them (just add a comment; practically all data is in the report). --Dirk Beetstra T C (en: U, T) 13:53, 14 April 2009 (UTC)
Done & I agree that's an obviously good idea for usability's sake regardless.  — Mike.lifeguard | @en.wb 18:07, 14 April 2009 (UTC)Reply