User talk:Beetstra/Archives 2019


monitoring of problematic server /24?

Hi. Season's greetings. I know that I can monitor one IP address at a time; however, is there a means to monitor an IP range? I can see some spambots have a nice target range in 209.124.86.0/24 and I am uncertain of the best way to keep an eye on these, beyond monitoring each single IP address. Ideally I would love to kill these early and without intervention, though I have no idea how we can do that beyond blacklisting on sight. Thanks for all you do.  — billinghurst sDrewth 01:50, 25 December 2018 (UTC)

Another problematic one is the host range 85.13.141.0/24; see some at User:COIBot/XWiki/femmestyle.or.at  — billinghurst sDrewth 11:18, 13 January 2019 (UTC)
Hmm, the last is just spotty, though it is probably a server farm that has a range of spam servers among others, and they have a range of spam domains hosted among others. Is this something that counts as a negative, though not necessarily a complete black mark? :-(  — billinghurst sDrewth 11:34, 13 January 2019 (UTC)
@Billinghurst: didn't we have a way to check usernames? You can probably just check for editors who have a conflict-of-interest with '209.124.86' and '85.13.141'... --Dirk Beetstra T C (en: U, T) 12:25, 13 January 2019 (UTC)
spambots, so disposable one-use usernames. :-(  — billinghurst sDrewth 13:34, 13 January 2019 (UTC)

Command alias

Hi and happy new year. Would it be possible to create quickreport as an alias for quickcreate? I often end up typing quickreport and noticing that it is not a command understood by the bot (my fault, I know). Thanks for your help. Best regards, —MarcoAurelio (talk) 16:27, 19 January 2019 (UTC)

@MarcoAurelio: I will do that when I have time. There are a couple of things that I need to work on when I have time. —Dirk Beetstra T C (en: U, T) 17:34, 19 January 2019 (UTC)
There's no urgency at all. Thank you. Best regards, —MarcoAurelio (talk) 17:37, 19 January 2019 (UTC)

What can we do to prevent problematic IP webserver targets?





Looking at

just shows spam homes where our spambots point. Do you have a better idea of how we can do some prevention rather than just maybe having a better target to check?  — billinghurst sDrewth 05:14, 9 February 2019 (UTC)

@Billinghurst: for starters, poke the IP of the server in a LinkSummary template; if it is not too much, COIBot will list all added domains on that server IP. Second, you may want to look for a reverse IP lookup service and collect all domains on that server, and preemptively blacklist them (and maybe look at the whole /24 of IPs). Pinging User:MER-C, I guess he has more techniques. —Dirk Beetstra T C (en: U, T) 04:08, 10 February 2019 (UTC)
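As an aside, a minimal sketch of that preemptive idea: feed in a list of candidate domains (e.g. from a reverse IP lookup), keep those that resolve into the suspect /24, and print blacklist-style entries. An illustration only, not COIBot code; the 209.124.86 prefix, the input, and the output format are assumptions.

#!/usr/bin/perl
# Sketch: read candidate domains on STDIN, keep those resolving into a suspect
# /24, and print spam-blacklist-style regex lines for them.
use strict;
use warnings;
use Socket qw(inet_aton inet_ntoa);

my $suspect_prefix = '209.124.86.';    # assumption: the /24 under scrutiny

while ( my $domain = <STDIN> ) {
    chomp $domain;
    next unless length $domain;
    my $packed = inet_aton($domain) or next;    # skip domains that do not resolve
    my $ip = inet_ntoa($packed);
    next unless index( $ip, $suspect_prefix ) == 0;
    ( my $escaped = $domain ) =~ s/\./\\./g;    # escape dots for the blacklist regex
    print "\\b$escaped\\b    # hosted on $ip\n";
}

Run it as, say, 'perl check24.pl < candidates.txt' (the file name is hypothetical) and review the output before pasting anything into the blacklist.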
Thanks Beetstra. The whole range isn't an issue; it looks like it is just these two from my review, though the lag of COIBot makes it a little trickier for the time being. I had pushed xwiki reports, though from IRC it seems that the push failed for IP addresses; I will try again. I will consider the pre-emptive blacklisting.  — billinghurst sDrewth 10:01, 10 February 2019 (UTC)
@Billinghurst: I have just poked them. Bot seems to be busy with something else .. --Dirk Beetstra T C (en: U, T) 10:18, 10 February 2019 (UTC)
Yep, it seems to spit out reports in batches; I was presuming that it has quiet times between imports and spits out the reports then. It is fine, it makes me do other things. FWIW I have asked Maggie Dennis if the WMF still has some domain lookup subscriptions available; they did years ago. Noting that a reverse IP lookup on 66.96.147.111 shows 9,346 websites using this address.  — billinghurst sDrewth 10:29, 10 February 2019 (UTC)

Linkwatcher reporting issues

Hi. LiWa3s are reporting "Started linkanalysers - None seemed to be alive".

Status is reported as …
<LiWa3_2> Syslogs: diffreader: 8940 secs. linkanalyser: 69 secs. linkparser: 225 secs. linkreporter: 5327470 secs. linkwatcher: 261 secs. output: -1 secs. script: -1 secs.
<LiWa3_3> LW: 165 days, 07:54:53 hours active; RC: last 34 sec. ago; Reading ~ 830 wikis; Queues: P1=0; P2=1; P3=0 (738 / 16759); A1=83; A2=106 (0 / 0); M=0 - In LinkWatcher edit-backlog: 0 files (0 lines) and in analyser backlog: 330 files (82536 lines).

It is collecting a few files. Thanks.  — billinghurst sDrewth 02:54, 16 February 2019 (UTC)

and d'oh, no COIBot in IRC  — billinghurst sDrewth 10:11, 16 February 2019 (UTC)

I hope I remember tomorrow morning to solve it. —Dirk Beetstra T C (en: U, T) 17:34, 16 February 2019 (UTC)

I see reports that the labs servers appear resolved; XLinkBot and UnBlockBot reappeared a few days ago. Can we speculate on the time for recovery of COIBot and the linkwatchers?  — billinghurst sDrewth 23:28, 22 February 2019 (UTC)
Something is back and the wheels are starting to rumble, though there does appear to be an issue with a monitor list regex ... see User:COIBot/XWiki/b9rbcforum.info. I am not near IRC to be able to investigate.  — billinghurst sDrewth 03:31, 25 February 2019 (UTC)
@Billinghurst and MarcoAurelio: I have restarted coibot and linkwatcher on the new servers this morning (with less memory requested than I did on the old server - they might crash regularly now). The new servers have newer software, and some of the hardcoded regexes (the ones filtering the links) needed to be updated as the new perl seems more strict on regexes. As far as I could see, all necessary modules are however running fine.
The regexes that you mention, SDrewth, seem to be in the db. Those regexes are however obviously broken (missing a 'b'?). We would need to try and track those down and replace them. --Dirk Beetstra T C (en: U, T) 05:17, 25 February 2019 (UTC)
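For illustration of the kind of strictness change meant above (an assumption, not necessarily the one that actually bit the bot): newer perls warn about, and in later releases refuse, an unescaped literal '{' in a pattern that older perls quietly accepted, so hardcoded link-filtering regexes with literal braces need the braces escaped.

#!/usr/bin/perl
# Sketch only, not the bot's actual pattern: matching a literal '{{' in a link.
use strict;
use warnings;

my $link = 'http://example.com/index.php?title={{subst:foo}}';

# Old style, accepted by older perls but warned about or rejected by newer ones:
#   if ( $link =~ /title={{subst:/ ) { ... }
# Escaping the braces keeps the pattern valid everywhere:
if ( $link =~ /title=\{\{subst:/ ) {
    print "template-style link: $link\n";
}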
To restate my IRC comment, I searched for these in the monitor list (per SWMT IRC bot guidance) and couldn't find them, though I couldn't find anything in any search of the monitor list, so I wonder what else I should/can be doing to resolve this.  — billinghurst sDrewth 21:57, 25 February 2019 (UTC)
I’ll try to access from console one of these days. No big deal. —Dirk Beetstra T C (en: U, T) 03:19, 26 February 2019 (UTC)

Not sure what COIBot is currently doing. It will be fed, it will churn through its food, then it doesn't submit its homework. It is still in wiki and irc, and will do a quickcreate, though that seems all.  — billinghurst sDrewth 11:46, 1 March 2019 (UTC)

Movement. A few reports have been generated. Will continue to watch.  — billinghurst sDrewth 07:02, 3 March 2019 (UTC)
@Billinghurst: I guess it has been busy backparsing some blacklistlog .. --Dirk Beetstra T C (en: U, T) 07:57, 3 March 2019 (UTC)
(the thing is .. I did not touch it .. unlikely that it is doing the same now, and it is again MIA for 2 hours). --Dirk Beetstra T C (en: U, T) 08:13, 3 March 2019 (UTC)
It didn't seem to be backparsing, as then it would accumulate reports as waiting (!backlog shows this), whereas on this occasion they were being processed through, though not written to the wiki. Who knows!  — billinghurst sDrewth 12:38, 3 March 2019 (UTC)
Back to quirkiness. Reports are not being written to meta; quickcreate starts a report though there is no follow-up. IRC shows things queued and apparently ticking down, though nothing goes to meta.  — billinghurst sDrewth 10:42, 7 March 2019 (UTC)
I am curious why this happens; it seems to work in bursts. It has a lot of quirks now on the new server. Will have to dedicate some time to it soon .. --Dirk Beetstra T C (en: U, T) 11:08, 7 March 2019 (UTC)
FWIW this is what is queued
<COIBot> 9 records waiting: 8 XWiki, 1 Local, 0 Redirect, 0 Poked, 0 Meta, 0 IP, 0 requested
<COIBot> Waiting: wallichresidence.corecentralcondo.com.sg, mobigarage.com, cashify.in, urbanclap.com, doubtfreesupplements.com, taxicopii.ro, progettocomposta.eu, regard-travel.com, koprubas
 — billinghurst sDrewth 11:36, 7 March 2019 (UTC)
and now quickcreate fails. COIBot is mute when the page does not exist and does not queue the page, though it does respond to the command when the target page exists. The report command queues successfully though nothing gets written.  — billinghurst sDrewth 10:03, 9 March 2019 (UTC)

Regarding IRC, is COIBot's return far away? It disappeared from IRC a few days ago.  — billinghurst sDrewth 09:57, 18 March 2019 (UTC)

Guaranteed! When you ask IT support it suddenly works perfectly. IOW COIBot is back.  — billinghurst sDrewth 10:17, 19 March 2019 (UTC)
@Billinghurst: From tomorrow I have a bit more time to look and to follow the logs. The bot quite often seems to think it is logged out, and for some reason it does not want to update the linkreportlist. Then it seems to crash out of nowhere. I suspect some memory issue on the new server. (And as an aside, I hope to have some time to work on some old requests/bugs and on the spamblacklistlog-parsing.) --Dirk Beetstra T C (en: U, T) 10:23, 19 March 2019 (UTC)

filter

You created your filter at enWP and tested here. Get your own tricks ;-)  — billinghurst sDrewth 12:22, 20 March 2019 (UTC)

@Billinghurst: No, the filter is for mainspace-en.wikipedia only. You could consider making it a global filter if we can't get the blacklist to work. Your last edits to the blacklist did not seem to work (or did you tweak them further?). --Dirk Beetstra T C (en: U, T) 12:32, 20 March 2019 (UTC)
The edits worked fine, though if you test a plain text addition, rather than a URL, one's expectation that it will work is exaggerated! special:log/spamblacklist/billinghurst shows success.  — billinghurst sDrewth 12:35, 20 March 2019 (UTC)
@Billinghurst: I tested it as well, it is dead now. I noticed that COIBot is decoding the url, and then does not search appropriately in the db. Maybe I need to have a look at that as well. But first let's see if I can fix the saving issues that COIBot has had lately. Will be online in a bit. --Dirk Beetstra T C (en: U, T) 12:39, 20 March 2019 (UTC)

user:Jon Kolbert‎ to have some COIBot rights

Hi. With the settings.css file still in a differed state, it is untouchable. At some point, would you please give new steward "Jon Kolbert‎" the ability to create reports with COIBot and similarly to add to User:COIBot/Poke. Thanks. I will catch them in IRC and run through the commands in SWMT/IRC.  — billinghurst sDrewth 22:22, 24 March 2019 (UTC)

My cloak is "@wikimedia/JonKolbert" btw. Thanks :-) Jon Kolbert (talk) 22:32, 24 March 2019 (UTC)
@Billinghurst and Jon Kolbert: sDrewth, can you try? You should be able to edit it all now (I can for sure, because I am an iAdmin, but I have moved things now and want to check if non-iAdmins can :-) ). --Dirk Beetstra T C (en: U, T) 17:23, 25 March 2019 (UTC)
I will be able to do so now, as it is no longer a css file, and I won't get the page content model warnings. I have rights to edit .js/.css files so that wasn't my issue.  — billinghurst sDrewth 20:48, 25 March 2019 (UTC)
Edited fine. Noting that I also did a +1 and a -1 of the wikis listed in 'not normal'. I have not restarted commander as I am not near IRC, and it is probably best left with you at this moment.  — billinghurst sDrewth 20:58, 25 March 2019 (UTC)

No linkwatchers

There are no visible linkwatchers in IRC. Restarts had no visible effect.  — billinghurst sDrewth 06:38, 7 April 2019 (UTC)

@Billinghurst: crap, the instance seems dead? --Dirk Beetstra T C (en: U, T) 07:08, 7 April 2019 (UTC)
@Billinghurst: it seems that tools-sgeexec-0928 is totally dead. It also disallows me to 'qdel' the linkwatcher task (job 857156). I will see if I can ping someone on irc later (but please feel free to beat me to that task). --Dirk Beetstra T C (en: U, T) 14:07, 7 April 2019 (UTC)

In IRC, by the minute for the last half an hour, the LiWa3s have been starting and crying out for linkanalysers, and this is one skill that COIBot will not let me use. My "add" requests are specifically rejected with "you can't do that". I went the hard route of restarting the LAs, which is a task it does let me do, though the issue continues unresolved. If syslog is meant to report for the linkanalysers, it is a no-info one.
LW: 33 minutes 08 seconds active; RC: last 0 sec. ago; Reading ~ 831 wikis; Queues: P1=0; P2=0; P3=0 (942 / -1); A1=1713; A2=202 (0 / 0); M=0 - In LinkWatcher edit-backlog: 0 files (0 lines) and in analyser backlog: 8 files (2001 lines).  — billinghurst sDrewth 11:06, 10 April 2019 (UTC)

freebasics.com?

Hi. Do you know anything about 0.freebasics.com (User:COIBot/XWiki/0.freebasics.com)? For me, all these added links seem to just flow back to one central page, https://connectivity.fb.com/. I am not sure whether there are different connections for different places. Before I start putting something on Talk:Spam blacklist about this being a useless set of links, I thought I would see what you knew.  — billinghurst sDrewth 22:03, 29 April 2019 (UTC)

@Billinghurst: .. https-www-bbc-com.0.freebasics.com ... ?? what is this, www.bbc.com proxied through freebasics.com? I've poked:


I am curious what COIBot is analyzing there.
Note, we are edit-filter hard-blocking university proxies on en.wikipedia. That concerns people who post links on en.wikipedia from inside their university proxy, where links become something like 'www-sciencedirect-com.proxy.<university>.edu'. Such a link is utterly useless outside of that proxy (you just get a login page, with no info as to where it is supposed to go), and while some links can be 'decoded' back to an original, some are basically 'library.proxy.<university>.edu/<garbagecode>' and you really have no clue what it is. Is this similar? --Dirk Beetstra T C (en: U, T) 08:37, 30 April 2019 (UTC)
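For what it is worth, a sketch of that 'decoding' idea; the hostnames and the proxy.<university>.edu shape are illustrative assumptions, not the actual edit filter or real report data.

#!/usr/bin/perl
# Sketch: turn an EZproxy-style hostname back into the original host where the
# leading label uses '-' in place of '.'; give up on the opaque ones.
use strict;
use warnings;

sub deproxy {
    my ($host) = @_;
    if ( $host =~ /^([a-z0-9]+(?:-[a-z0-9]+)+)\.proxy\.[a-z0-9.-]+\.edu$/i ) {
        ( my $original = $1 ) =~ tr/-/./;
        return $original;    # ambiguous if the original host itself contained '-'
    }
    return undef;            # 'library.proxy.<university>.edu/<garbagecode>' style
}

for my $host ( 'www-sciencedirect-com.proxy.university.edu',
               'library.proxy.university.edu' ) {
    my $decoded = deproxy($host);
    printf "%s -> %s\n", $host, defined $decoded ? $decoded : '(cannot be decoded)';
}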
As I got the crap endpoint, and know little about the service, that was pretty much why I was asking what you knew. (And I have no issue with the use of proxies; I have generally just left notes for people.) The enWP article w:Internet.org has a little, and I think it relates to having a mobile app: clicking the link basically acts as a proxy, or the URL acts as a faux protocol for an app. Though a best guess is what I am working with here. While it may be useful for someone on the service, it seems useless for anyone outside of it. I suggest that we stick it up on blacklist talk, and leave it as an open discussion for a while to see who can bring what to the argument.  — billinghurst sDrewth 21:31, 30 April 2019 (UTC)
@Billinghurst: I am likely to hit ‘add’ regardless ... —Dirk Beetstra T C (en: U, T) 03:18, 1 May 2019 (UTC)
Sure, though giving the opportunity for discussion outside of my one-eyed observations is useful. :-)  — billinghurst sDrewth 05:36, 1 May 2019 (UTC)
btw, I hit "add" a week or so ago. When someone started using Polish Wikipedia via freebasics as references, I thought we had reached that point.  — billinghurst sDrewth 08:29, 7 October 2019 (UTC)

Automating tracking backlinks

Hi. It would be really useful if we could find a means to automate some backlink tracking. The goose down spammer seems to whack away, and they create new domains as shown at Special:WhatLinksHere/User:COIBot/LinkReports/200.1.25.44. If there were a means to better investigate or report on User:COIBot/LinkReports/200.1.25.44, then things would be somewhat easier. Just thoughts.  — billinghurst sDrewth 08:04, 7 October 2019 (UTC)

@Billinghurst: on IRC, ask for an 'ipreport 200.1.25.44' (I think that was the right command). --Dirk Beetstra T C (en: U, T) 08:15, 7 October 2019 (UTC)
"report ip x.x.x.x" ? Pushed and looking to be sitting in the queue, will see how they go. Thanks.

Would it be possible to set "monitor ip x.x.x.x reason" so that where the IP is found in a newly generated report, that report sits as an open report for review?  — billinghurst sDrewth 08:24, 7 October 2019 (UTC)

I have a couple of days at the end of the month to do some coding; let's start user:COIBot/Wishlist. —Dirk Beetstra T C (en: U, T) 09:00, 7 October 2019 (UTC)

Link Report

Can you do a report on additions of triviatribute.com? Thanks Nunabas (talk) 15:15, 25 October 2019 (UTC)

Not Beetstra but check here in a few minutes. Praxidicae (talk) 16:16, 25 October 2019 (UTC)

COIBot log files on server are big

Hi. Not sure whether it matters, however, I was just looking at COIBot on the server, and the syslog files are quite a size. So just mentioning.  — billinghurst sDrewth 10:55, 20 November 2019 (UTC)

@Billinghurst: in the directory for each of the three bots there is a 'clear.sh' (you can run it by typing './clear.sh'), which cuts the files down to the last ~500 lines or so. If there have been problems lately, they may contain useful information; otherwise (like now) they can just be cleared. --Dirk Beetstra T C (en: U, T) 11:07, 20 November 2019 (UTC)
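For the record, the effect of that cleanup in a small perl sketch (an illustration of what 'keep the last ~500 lines' amounts to, not the actual clear.sh; file names come from the command line):

#!/usr/bin/perl
# Sketch: trim each named log file down to its most recent 500 lines.
use strict;
use warnings;

my $keep = 500;

for my $log (@ARGV) {
    my @tail;

    open my $in, '<', $log or next;        # skip files we cannot read
    while (<$in>) {
        push @tail, $_;
        shift @tail if @tail > $keep;      # keep only the most recent lines
    }
    close $in;

    open my $out, '>', $log or next;       # skip files we cannot rewrite
    print {$out} @tail;
    close $out;
    print "trimmed $log to ", scalar(@tail), " lines\n";
}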
thumbsup Done  — billinghurst sDrewth 11:45, 20 November 2019 (UTC)

LiWa3, to note I acted

LiWa3 was missing from IRC channels, and I got it back from the console, though maybe not according to Hoyle. If it is meant to be in #wikimedia-external-links, then that didn't happen, and I am unaware of that process. It is back in its expected channel.  — billinghurst sDrewth 00:53, 13 December 2019 (UTC)