Talk:IP Editing: Privacy Enhancement and Abuse Mitigation/Archives/2019-07

From Meta, a Wikimedia project coordination wiki

Brief thoughts

Making anonymous users actually anonymous is a good idea, but will definitely have an impact on our anti-abuse workflow regardless of how it is rolled out. Don't be afraid to develop and test models, so we can see what actual impact proposed solutions will have rather than endlessly opining about hypotheticals. – Ajraddatz (talk) 23:15, 31 July 2019 (UTC)

While I'm nearby, some other thoughts: with how dynamic most IPs are, particularly those associated with mobile device connections, CU and anti-abuse tools are far more unreliable than they have been in the past. Cookie blocks were introduced some time ago to help address this problem, but it would be very, very useful to modernize our anti-abuse capacity as part of this project. Stewards and CUs need more data points to be able to work effectively, and giving us access to things like email address or cookies (or, if not the actual data itself, then a representation that would allow us to compare between users / anons) would be incredibly useful if anti-abuse tools were also built around that. – Ajraddatz (talk) 23:19, 31 July 2019 (UTC)
Butting into your thoughts: I disagree on the cookie block bit. Cookie blocks are pretty useless, and anyone who is over 10 or under 90 knows how to clear cookies or just use incognito mode on their phone. It's why the Washington Post has shifted away from cookies when determining your article views: everyone was just reading WaPo articles in incognito mode so they could get unlimited articles a month for free. Giving us email access among other things would be helpful as well, but I'm skeptical of cookies being that useful. TonyBallioni (talk) 23:26, 31 July 2019 (UTC)
Yes, they aren't super useful, but it's one more data point that we don't have. I'm also not very technically-minded, but I know that other companies are using more intrusive methods to monitor who is accessing their site, and we should consider whatever they are doing so long as we handle that data responsibly. – Ajraddatz (talk) 23:34, 31 July 2019 (UTC)
Yeah, I agree with you. The current CU tool isn't exactly great and building pretty much any enhancement there would be useful. Any additional intrusion would likely be balanced by the fact that it'd be significantly more difficult for us to release any private information. TonyBallioni (talk) 23:45, 31 July 2019 (UTC)

Need of tools for abuse mitigation

Besides the increase in harassment and vandalism that such a change would yield (as written above), there is a need to mitigate attacks, and that need is basic to the good health of the projects and their participants. The two things must go together; otherwise it seems that WMF wants to open the doors to vandalism and harassment, which will bring strong reactions from the different projects.

To fight vandalism and harassment from LTAs (malevolent users who keep coming back), retaining IP information for longer than the canonical 3 months is a valuable resource, even if access is limited to a few trusted users, so that we can trace serious abusers and identify serious abuse (think of child pornography or defamation). It would also be useful to have User Agent and IP-oriented filters. This seems to go in the opposite direction from enhancing privacy for IPs, but not necessarily. CUs and sysops should still be able to check the IP address behind an edit, like every other website manager, or to block IP ranges. The address can be masked, but a unique string must be permanently attached to each IP. Tools should also allow actions like: "block all the IPs from the /16 range of the IP behind that edit".
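
To make the "masked but unique string" and "/16 range" ideas above concrete, here is a minimal sketch in Python. It is not how any existing MediaWiki tool works: the keyed-hash scheme, the `SECRET_KEY` value, and the function names `masked_id`, `enclosing_range` and `is_blocked` are all assumptions made for illustration.

```python
import hashlib
import hmac
import ipaddress

# Hypothetical server-side secret (an assumption for this sketch). Anyone who
# holds it can recompute the pseudonym for a given IP, so it must stay private.
SECRET_KEY = b"example-secret-not-for-production"


def masked_id(ip):
    """Return a stable pseudonym for an IP: same IP in, same string out."""
    canonical = ipaddress.ip_address(ip).compressed.encode("ascii")
    digest = hmac.new(SECRET_KEY, canonical, hashlib.sha256).hexdigest()
    return "anon-" + digest[:12]


def enclosing_range(ip, prefix_len):
    """Return the network of the given prefix length that contains the IP."""
    return ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)


def is_blocked(ip, blocked_ranges):
    """Check server-side whether an IP falls inside any blocked range."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in blocked_ranges)


# "Block all the IPs from the /16 range of the IP behind that edit":
# the tool resolves the range itself, so the person placing the block
# would only ever need to see the pseudonym.
blocked = [enclosing_range("198.51.100.77", 16)]
print(masked_id("198.51.100.77"), is_blocked("198.51.3.4", blocked))
```

In this sketch the pseudonym reveals nothing about neighbouring addresses, so any range operation has to be carried out by the software on behalf of the user rather than by reading the identifier.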

Instead of having sysops and CUs access IPs, the alternative would be to delegate that access to tools, so that they can filter based on the devices, the editing style, and the IP range of a malevolent user. If this is not possible, I fear that we must continue using good old human intelligence. Ruthven (msg) 07:26, 1 August 2019 (UTC)

  • Yeah, if the current tools stay the same, but we get pseudo usernames instead of IPs, it's going to be unfortunate for projects that are too small for their own checkusers. You're essentially just taking away the ability to range block in its entirety. GMGtalk 17:53, 1 August 2019 (UTC)

Another option (and the only sensible one) is making those new automatically generated names MAC-address based UUIDs. Some imgboards (like 4chan) use them so you can see if some person with the same IP is trying to pose as multiple users (the first part of the user ID will be similar with the same IP). --Pudeo (talk) 18:17, 1 August 2019 (UTC)

@Ruthven: I wholeheartedly agree that it is important to make absolutely sure that any work done here does not negatively impact the health of our projects. Trust me, we do not want to open doors to vandalism or harassment - quite the opposite. We are also going to be taking this project as an opportunity to explore improvements we can make to our existing anti-vandalism tools (CheckUser is high on the list). There are some great ideas on this talk page for that - exactly the kind of feedback we are looking for. Thank you for your comments. I appreciate it. -- NKohli (WMF) (talk) 19:39, 1 August 2019 (UTC)

Musings

Generally a good idea, especially from a CU perspective as this would make accidental or even intentional breaches of the privacy policy much more difficult.

The difficulty will be from non-CUs who of course make up the overwhelming majority of users who provide anti-abuse volunteer efforts. This would make range blocks effectively only available to CUs (if at all) and there needs to be a lot of thought given on how we can mitigate the difficulties this will bring. I'd be happy to talk to anyone offline if that's easier (or here). Just shoot me an email or ping me. TonyBallioni (talk) 23:22, 31 July 2019 (UTC)

  • Adding to this, I'd recommend that you extend the scope of your research (WMF) to focus on admins and non-admin vandalfighters rather than CUs and stewards. Those are the people that will be hit hardest by this, as they do the bulk of the anti-abuse work and won't have access to the tools needed to access data once the public-facing info is more anonymized. – Ajraddatz (talk) 23:36, 31 July 2019 (UTC)
Agreed. I could not imagine how hard it would be to manage ranges and anti-vandalism/LTA work without being able to see the IPs. Vermont (talk) 23:39, 31 July 2019 (UTC)
I think that's an overreaction (Vermont, not ajr). I agree with Ajraddatz's point that it likely won't impact CUs and stewards too much (other than making the CU stats go through the roof) but there are likely technical means to address your concerns, and if you express them here, the WMF can help take them into account. TonyBallioni (talk) 23:45, 31 July 2019 (UTC)
I'd have trouble identifying LTAs without being able to WHOIS IPs, although range blocks would still be possible if it were still possible to enter IP ranges in Special:Contribs without showing the actual IPs. (Ex: Anonymous user XXXXX/24) Vermont (talk) 02:10, 1 August 2019 (UTC)
IMO blocking "Anonymous user XXXXX/24" would be something of a shotgun blast in the dark, since you have no idea who else that might affect or whether there is any reason to think all 255 other names are actually related. Anomie (talk) 21:19, 1 August 2019 (UTC)
My biggest concern is dealing with spam. There is a lot of xwiki IP spam that flies under the radar, under the guise of references (see the SW wiki debacle) and not being able to search ranges for these types of edits will really hamper the ability for folks like me to combat this. Praxidicae (talk) 23:42, 31 July 2019 (UTC)
Yep. What's deeply concerning to me is that even knowing that something is going on with a range often starts with a non-CU or non-admin noticing peculiar behavior from similar IPs. Unless we expect CUs or some new CU-lite user group to just routinely range-check every remotely suspicious edit, a lot more of this is going to slip by than already does. I would rather the Foundation focus on improving our ability to deal with dynamic IPs, which is deficient enough already. Specifically, we need a much easier way to communicate with IP ranges, like being able to create a talk page for X:X:X:X::0/64. We need to be able to search deleted contributions of an IP range. And it would be helpful if we could look at a list of revisions, whether from a history page or somewhere else, and see the range of each editor instead of individual IPs (I'm aware that this is hard because the smallest meaningful subnet is different for different ISPs). This would be helpful not only for catching vandalism but also for preventing overreactions. I have recently noted on enwiki that several individuals on different subnets of the same network can appear to be one person able to hop across the entire network. It's actually a huge pain sometimes to sort out how many subnets are involved and how to block without collateral, and admins sometimes get it wrong. Someguy1221 (talk) 02:27, 1 August 2019 (UTC)
Someguy's concern is more or less mine. In quite a few cases, realizing that something odd was going on started with noticing a pattern in the vandals I was seeing pop up. That can help identify everything from proxy abuse to dynamic IP hoppers. Additionally, seeing the IP and being able to WHOIS it allows placement of specific block notices, such as the "school block" templates that are specially designed for when one is blocking school IPs and large numbers of students may be affected. Without being able to see IPs, admins will have no idea that anonymous abuse is coming from a range for which a range block should even be considered, and checkusers cannot look at every instance of run-of-the-mill vandalism or spamming to find out. This would severely impact our ability to address abuse from those who know how to IP hop. Seraphimblade (talk) 04:50, 1 August 2019 (UTC)
  • As a "non-admin user", I think this is a very bad idea. You will not be able to see which acts of vandalism were committed by the same IP. Identifying sock-puppets will be a lot more difficult. Right now, geolocation of suspicious IPs helps a lot. You will not be able to see which edits are coming from a proxy server. And remember that the vast majority of users are not admins. Given the current scope of vandalism and sockpuppetry, I think this "enhancement" will be a disaster. My very best wishes (talk) 18:25, 1 August 2019 (UTC)
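
On the range questions raised in this section (a /24 covering 256 addresses, and showing the range of each editor rather than individual IPs), the grouping step can be sketched as below. This is only an illustration: the function names `range_key` and `group_by_range` are hypothetical, and the default prefix lengths (/24 for IPv4, /64 for IPv6) are assumptions, since, as noted above, the smallest meaningful subnet differs between ISPs.

```python
import ipaddress
from collections import defaultdict


def range_key(ip, v4_prefix=24, v6_prefix=64):
    """Map an IP to the range it would be displayed under (assumed prefixes)."""
    addr = ipaddress.ip_address(ip)
    prefix = v4_prefix if addr.version == 4 else v6_prefix
    return str(ipaddress.ip_network(f"{addr}/{prefix}", strict=False))


def group_by_range(edits):
    """Group (ip, page_title) pairs by the range each IP falls into."""
    groups = defaultdict(list)
    for ip, title in edits:
        groups[range_key(ip)].append(title)
    return dict(groups)


# Example edits using documentation-only addresses.
edits = [
    ("192.0.2.10", "Page A"),
    ("192.0.2.200", "Page B"),
    ("2001:db8:1:2:aaaa::1", "Page C"),
    ("2001:db8:1:2:bbbb::9", "Page D"),
    ("203.0.113.5", "Page E"),
]
for net, titles in group_by_range(edits).items():
    print(net, titles)

# A /24 block covers this many addresses, i.e. the target plus 255 others.
print(ipaddress.ip_network("192.0.2.0/24").num_addresses)
```

A fixed prefix length is the weakest part of this sketch: a grouping that is too wide merges unrelated editors, and one that is too narrow splits a single hopping editor across several ranges.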