IP Editing: Privacy Enhancement and Abuse Mitigation/CheckUser Improvements


Overview

When the Anti-Harassment Tools team put together a proposal to enhance privacy for unregistered contributors on our projects, the biggest concern was how this will impact anti-vandal fighters’ ability to protect our wikis. In anticipation of that, our team has been planning its work on improving anti-vandalism tools available on our projects and also adding new ones.

As a first step towards these goals, the team has decided to undertake this project to make improvements to the CheckUser extension. In the numerous conversation threads on the project talk page and in our conversations at Wikimania, we repeatedly heard from stewards and checkusers about CheckUser being one of the biggest pain points that obstruct them from doing their duties effectively. Given how important this extension is to the anti-vandalism workflows of our projects, we have decided to prioritise making improvements to CheckUser as our team’s next project.

Background

CheckUser is a tool available to users who have the checkuser permission. Among other things, CheckUser can be used to:

  • Determine which IPs and user-agents are associated with a given user account
  • Determine the edits, logged actions and password resets of a specific IP

CheckUser was first created over 12 years ago and was preceded by another tool called Espionage (created in 2005). In the many years since, while the number and scale of our wikis have gone up significantly, the CheckUser extension hasn’t changed very much. It’s no surprise that the biggest code contributor to CheckUser in the past 10+ years has been the translatewiki bot, which adds translations periodically.

Through our research into checkuser workflows, we have determined that significant challenges exist in a few key stages of checkuser tool use. We have broken down use of the tool into five key stages:

  • Triaging - assessing cases to see whether or not CheckUser will be useful in this situation,
  • Profiling - looking at existing public data to compile patterns of rule-breaking or telltale patterns of behaviour,
  • Checking - getting information out of CheckUser and comparing it against the public data,
  • Judging - deciding which accounts are likely to be socks, and if so, how likely,
  • Closing - documenting the case and publishing the public recommendations for action.

The CheckUser tool is mainly employed in the profiling, checking, and judging stages, and other software or tools are used to assist.

Problems with the existing tool

Here are some of the core issues that were brought up by the users we talked to in a series of user interviews:

  • The tool makes the user go through a series of repeated steps, showing small pieces of information in each search instead of showing all the necessary data in one interface.
  • For each CheckUser use, a new tab needs to be opened in order to keep a comparison between different users and IP addresses.
  • All the data from the various tabs needs to be kept in a person's memory while performing complex comparisons. This is especially difficult when many user accounts are involved in an investigation.
  • The tool does not let one export any data, so copying and pasting strings for further use is a tricky process.
  • The CheckUser tool has a steep technical learning curve. For instance, one needs to have good working knowledge of how IP addresses and ranges work in order to determine if two user accounts are socks.
  • Using CheckUser is an extremely time-consuming process.
  • The tool interface is extremely outdated, making it harder to accomplish relatively simple tasks.
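
As a small illustration of the IP range knowledge mentioned in the list above, the following Python sketch uses the standard-library `ipaddress` module to test whether two addresses fall inside the same range. The helper name `same_range`, the sample addresses and the /24 prefix length are all made up for this example and are not part of CheckUser. Sharing a range is at most a hint, not proof that two accounts are socks.

```python
import ipaddress


def same_range(ip_a: str, ip_b: str, prefix_len: int) -> bool:
    """Return True if both addresses lie in the same network of the given prefix length.

    Hypothetical helper for illustration only.
    """
    addr_a = ipaddress.ip_address(ip_a)
    addr_b = ipaddress.ip_address(ip_b)
    if addr_a.version != addr_b.version:
        return False  # an IPv4 and an IPv6 address never share a range
    # strict=False builds the enclosing network from a host address.
    network = ipaddress.ip_network(f"{ip_a}/{prefix_len}", strict=False)
    return addr_b in network


# Sample addresses taken from documentation-only ranges.
print(same_range("192.0.2.10", "192.0.2.200", 24))   # True
print(same_range("192.0.2.10", "198.51.100.7", 24))  # False
```

A shorter prefix length describes a larger range, so the same two addresses can match under /16 and fail under /24.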

Status

8 October, 2020

The new CheckUser tool is now available on all projects. It is accessible at Special:Investigate. We'd love your feedback!

26 September, 2020

Since we enabled the feature on French Wikipedia, we have not heard much feedback from the community. This may be because a large part of the French Wikipedia editor community takes vacation time during the summer months. To gather more feedback, we have enabled the new special page on the Spanish, Italian and Swedish wikis. Our next step is to collect as much feedback as we can before we roll out the feature on all projects. We are doing our due diligence to ensure no bugs slip through.

12 August, 2020

Special:Investigate has been launched on the French Wikipedia as a pilot. We are collecting feedback there. The team has been working at reduced capacity for the past few months, which contributed to a delay in the expected deployment schedule. Hopefully we will be able to roll out the feature to all projects soon.

Video: Introduction to Special Investigate

We have an introduction video which walks you through the new tool.

7 May, 2020

Our technical work on CheckUser improvements is close to being wrapped up. The team is excited to start sharing our work on the new version of the tool with the community. Thanks to everyone's work, the new special page has been deployed on testwiki. Given the sensitive nature of the tool, we will be rolling it out gradually. We will ensure that the existing CheckUser tool continues to work as expected as we deploy the new tool. Our next steps on this project look like this:

  • Conduct a couple of rounds of user testing with a limited set of users on testwiki to ensure it works as intended
  • Incorporate feedback from the user testing to make improvements to the tool
  • Open up the tool to potentially all checkusers
  • Pick a set of small-to-medium wikis on which to deploy the new version of CheckUser
  • Gather feedback
  • Make improvements based on feedback and fix bugs
  • Deploy more widely

31 March 2020

Over the past few months the Anti-Harassment Tools team has been hard at work to provide a redesigned, easier, more efficient experience for CheckUser. We've been in constant contact with checkusers and stewards who offered their time to help us test the work-in-progress features and ensure they fall in line with their expectations. Since CheckUser is a very sensitive tool, we've been asked to perform our testing internally as much as possible to avoid the risk of major bugs when the new features are made available to the community. I am happy to report that the new special page we have been building has been enabled on testwiki for internal testing and will soon be made available to checkusers and stewards on the wikis. I appreciate your patience through this process. I will be providing more regular updates here as we start the rollout of the new special page.

5 November 2019

Hi all! I’m here to share some of the early mocks we have been building for a better CheckUser experience. Based on all of our conversations on this project page and elsewhere, there were a few key improvements over the current tool we had in mind when we were thinking about the redesign:

  • Include a way to query for multiple users and IP addresses in one go
  • Include a way to get an account overview for a set of users in order to determine if a case warrants a full CheckUser check
  • Include a way to get an activity overview for a group of users to detect patterns of similar behavior
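
As a rough illustration of the first improvement in the list above, querying several users and IP addresses in one go means sorting a mixed list of targets into usernames, single IP addresses and IP ranges. The Python sketch below does this with the standard-library `ipaddress` module. The function name `classify_targets` and the rule that anything unparseable counts as a username are assumptions made for this example, not a description of how the tool is implemented.

```python
import ipaddress
from typing import Dict, List


def classify_targets(raw_targets: List[str]) -> Dict[str, List[str]]:
    """Sort a mixed list of inputs into usernames, single IPs and IP ranges.

    Hypothetical helper for illustration only.
    """
    result: Dict[str, List[str]] = {"users": [], "ips": [], "ranges": []}
    for raw in raw_targets:
        target = raw.strip()
        if not target:
            continue  # ignore blank entries
        try:
            ipaddress.ip_address(target)
            result["ips"].append(target)
            continue
        except ValueError:
            pass  # not a single IP address
        if "/" in target:
            try:
                # strict=False accepts ranges written with host bits set.
                ipaddress.ip_network(target, strict=False)
                result["ranges"].append(target)
                continue
            except ValueError:
                pass  # not a CIDR range either
        # Assumption: anything that is neither an IP nor a range is a username.
        result["users"].append(target)
    return result


print(classify_targets(["User A", "1.1.1.1", "1.2.3.0/24", "  "]))
# {'users': ['User A'], 'ips': ['1.1.1.1'], 'ranges': ['1.2.3.0/24']}
```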

Based on the improvements listed above, we have come up with a set of mockups that include them and also fulfill the existing use cases of the tool. I’m going to detail them below. I will note that there are a few things we are still working on integrating: blocking, CLDR, and export of data from the CheckUser interface. They are not part of these mocks but will be included in the final design.

Step 1: Input form

Lookup users - proposed CU redesign.png

This is a basic input form that accepts a series of usernames, IP addresses or IP ranges. There is a checkbox for including in the display all other users behind the same IPs as these users. The accounts identified as using the same IPs would be added to the table of results, with an indication that they were identified as using the same IPs as the users being looked up. This will help identify sleeper accounts, which are currently difficult to identify without multiple lookups in CheckUser. If an IP address is entered in the input box, all users behind that IP address will be listed in the Compare tab of this design (similar to what Get users displays in CheckUser currently). Note that there is no Duration drop-down anymore, as we feel it is not very helpful. If this is not true, please tell us on the talk page. Once a check has been run, the user will be presented with three tabs:

Step 2: Preliminary check tab

Preliminary check - proposed CU redesign.png

This tab presents an overview of the user accounts being looked up. This information is similar to what is presented in Special:CentralAuth.

Step 3: Compare tab

Compare tab - proposed CU redesign.png

This is the CheckUser step and is similar to the Get IP addresses tab in CheckUser currently, except that it runs for multiple users/IPs. We have included some advanced filtering features so that you can narrow down the results when there are too many in the table. Note that this mockup does not show the WHOIS/rDNS links for the IPs, but they will be a part of the final design.

Step 4: Timeline tab

Timeline view - proposed CU redesign.png

This tab is meant to provide a timeline of edits and log actions made by the set of users/IPs being looked up. This is similar to the Get edits tab in CheckUser currently. We also have filters and highlighting built into this view to enable you to quickly sort/filter through the results.
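
To make the idea of a combined timeline concrete, here is a minimal Python sketch that merges edits and log actions from several users into one chronologically sorted list and applies optional filters. The `Event` record, the `build_timeline` function, the year 2019 and the sample entries are all invented for illustration. The actual tool's data model is not described on this page.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Event:
    """Hypothetical record for one edit or log action."""
    timestamp: datetime
    actor: str    # username or IP address
    kind: str     # "edit" or "log"
    summary: str


def build_timeline(
    events: Iterable[Event],
    kind: Optional[str] = None,
    actor: Optional[str] = None,
) -> List[Event]:
    """Return matching events in chronological order; a filter of None matches everything."""
    selected = [
        e for e in events
        if (kind is None or e.kind == kind) and (actor is None or e.actor == actor)
    ]
    return sorted(selected, key=lambda e: e.timestamp)


# Made-up sample data.
events = [
    Event(datetime(2019, 8, 12, 13, 0), "User A", "edit", "edited a page"),
    Event(datetime(2019, 8, 12, 11, 2), "User B", "log", "account created"),
    Event(datetime(2019, 8, 12, 11, 0), "User A", "log", "account created"),
]

for event in build_timeline(events):
    print(event.timestamp.isoformat(), event.actor, event.kind, event.summary)
```

Passing `kind="edit"` or `actor="User A"` narrows the list in the same spirit as the filters described above.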

We are interested in getting your feedback on the above on the talk page. Most importantly, does this fulfill your use cases from the tool?
We also plan to do a round of early prototype testing, to get in-depth usability feedback. We hope to perform these tests regularly as we iterate on this design. If you're interested in participating, please email our Design Researcher CLo (WMF) (clo@wikimedia.org).

Thanks. -- NKohli (WMF) (talk) 21:07, 5 November 2019 (UTC)

4 October 2019

While there is a wide range of changes we could make to improve the tool, we want to start with some key features that can make a lot of impact. Below is a list of suggested improvements that we think we can deliver quickly and that would reduce some of the biggest pain points of using the tool:

1. Add a way to accept one or more usernames in the tool to perform sockpuppet checks.
2. Add an option to do a “preliminary check” for the users. This will show basic information for all of the submitted usernames in one view, such as account creation timestamp, wikis the account is active on, number of edits on each of the wikis, a link to global contributions, the block log, etc. All of this will be existing public data, displayed in an organized way to help checkusers decide whether they want to perform a checkuser lookup on the accounts in question.

As an example, if a CheckUser wants to run a check against accounts User A, User B and User C:

Preliminary check

| Username | Account creation | Activity | Global contribs | Anything else? |
| User A | August 12, 11:00 | Enwiki - 4 edits (contribs) 🏡 | Link to guc | |
| | | Frwiki - 3 edits - blocked (contribs) | | |
| | | Commons - 9 edits (contribs) | | |
| User B | August 12, 11:02 | Enwiki - 1 edit (contribs) | Link to guc | |
| | | Commons - 2 edits (contribs) 🏡 | | |
| User C | August 12, 11:25 | Eswiki - 14 edits (contribs) 🏡 | Link to guc | |

3. After the preliminary check, if the checkuser wants to run the checks against the accounts, they would be able to do so. In the next step, we could show all the private information associated with these accounts in one tabular view. One caveat is that this would only work on one wiki at a time. Global CheckUser is a potential future improvement we can work on, but we want to focus first on core improvements that we can deliver quickly. This isn’t meant to be all we can do on the anti-vandalism workflows, but where we start. To carry on with the above example, this could look like:

CheckUser

| Username | Activity | IP address | User agent | Anything else? |
| User A | August 12, 11:00 | 1.1.1.1 (ipcheck) - 4 edits (10 from all users) | Chrome 65, Windows 10 | |
| | August 12, 13:00 | 1.2.3.4 (ipcheck) - 2 edits | Android HTC | |
| | August 13, 9:00 | 1.1.1.1 (ipcheck) - 3 edits (10 from all users) | Firefox 9.1, Windows 10 | |
| User B | August 12, 11:02 | 1.1.1.2 (ipcheck) - 12 edits (13 from all users) | Chrome 65, Windows 10 | |
| User C | August 12, 11:25 | 1.5.3.4 (ipcheck) - 4 edits | Safari 8.5, iOS 13 | |
| | August 12, 12:08 | 1.6.3.4 (ipcheck) - 5 edits | Chrome 66, Windows 10 | |
| Linked accounts (below accounts were found associated with the IP addresses found above) |
| User D | August 13, 16:00 | 1.1.1.2 (ipcheck) - 1 edit (13 from all users) | Chrome 65, Windows 10 | |
| User E | August 12, 08:00 | 1.1.1.1 (ipcheck) - 3 edits (10 from all users) | Firefox 9.1, Windows 10 | |
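
The "from all users" totals and the linked accounts in the example table above can be derived from plain (username, IP address, edit count) rows. The Python sketch below shows one possible way to do that, using the figures from the table. The function names `edits_per_ip` and `linked_accounts` and the row layout are assumptions for illustration and say nothing about how CheckUser stores or queries its data.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Row = Tuple[str, str, int]  # (username, IP address, number of edits)

# Rows copied from the example table above.
ROWS: List[Row] = [
    ("User A", "1.1.1.1", 4),
    ("User A", "1.2.3.4", 2),
    ("User A", "1.1.1.1", 3),
    ("User B", "1.1.1.2", 12),
    ("User C", "1.5.3.4", 4),
    ("User C", "1.6.3.4", 5),
    ("User D", "1.1.1.2", 1),
    ("User E", "1.1.1.1", 3),
]


def edits_per_ip(rows: List[Row]) -> Dict[str, int]:
    """Total edits from each IP address across all users."""
    totals: Dict[str, int] = defaultdict(int)
    for _user, ip, edits in rows:
        totals[ip] += edits
    return dict(totals)


def linked_accounts(rows: List[Row], checked: Set[str]) -> Dict[str, Set[str]]:
    """Map each account outside `checked` to the IPs it shares with checked accounts."""
    checked_ips = {ip for user, ip, _edits in rows if user in checked}
    linked: Dict[str, Set[str]] = defaultdict(set)
    for user, ip, _edits in rows:
        if user not in checked and ip in checked_ips:
            linked[user].add(ip)
    return dict(linked)


totals = edits_per_ip(ROWS)
print(totals["1.1.1.1"], totals["1.1.1.2"])  # 10 13
print(linked_accounts(ROWS, {"User A", "User B", "User C"}))
# {'User D': {'1.1.1.2'}, 'User E': {'1.1.1.1'}}
```

The totals match the table: 4 + 3 + 3 = 10 edits from 1.1.1.1 and 12 + 1 = 13 from 1.1.1.2, with User D and User E surfacing as linked accounts.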

The information listed in the above tables is not a definitive list of what will be included. A lot of it depends on what is technically feasible. The tables above are only a suggestion of how the data can be presented, and we are open to suggestions for how best to present this information. As we work on this feature, we will keep the mobile usability of CheckUser in mind.

We also really want to hear from you about what makes sense to show and what we are missing.

Here are a few questions we have for people who use or have used CheckUser in the past:

  • What do you think of the above proposed improvements?
    • What do you think of the preliminary check idea?
    • How else could the CheckUser information be presented?
    • What do you think of the information being displayed in the actual CheckUser step?
  • How can we improve the CheckUser logs to be more helpful with the above proposed improvements?
  • What are we missing?

We’d love to hear from you about this list of proposed improvements on the talk page or you could email me at nkohli@wikimedia.org if you would rather not discuss something publicly.

See also