User:Ladsgroup/On Privacy in Wikipedia in the age of AI
This is an essay. It expresses the opinions and ideas of some Wikimedians but may not have wide support. This is not policy on Meta, but it may be a policy or guideline on other Wikimedia projects. Feel free to update this page as needed, or use the discussion page to propose major changes.
When you edit Wikipedia, your edits are public. We all know that. But do you know what that actually entails?
How can you find my sockpuppets?
Some trusted users, called Checkusers, are able to see your IP address and user agent (UA), meaning they can infer where you live, and perhaps where you study or work. They do not disclose such information, and their access is subject to a very strict policy. But:
The way you use language is unique to you; it is like a fingerprint. There is a substantial body of research on this. With simple natural language processing tools, anyone can extract discussions from Wikipedia and link accounts that have similar linguistic fingerprints.
What does this mean? It means people will be able to find, guess, or confirm their suspicions about sockpuppets you operate. They will be able to link multiple accounts without needing access to private data that could reveal where you live or work.
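To make the idea concrete, here is a minimal sketch of how such linguistic linking could work. It is not any specific tool used on Wikipedia; it simply compares character n-gram frequencies (which capture punctuation and spacing habits, not just vocabulary) between the pooled comments of two hypothetical accounts:

```python
from collections import Counter
import math

def char_ngrams(text, n=3):
    # Break text into overlapping character n-grams. Character n-grams
    # pick up punctuation and spacing habits, not just word choice.
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

def cosine_similarity(a, b):
    # Cosine similarity between two sparse frequency vectors (Counters).
    common = set(a) & set(b)
    dot = sum(a[g] * b[g] for g in common)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def linguistic_similarity(comments_a, comments_b):
    # Pool each account's talk-page comments into one profile and compare.
    profile_a = char_ngrams(" ".join(comments_a))
    profile_b = char_ngrams(" ".join(comments_b))
    return cosine_similarity(profile_a, profile_b)
```

A real stylometric system would use far richer features and careful statistics, but even this toy version returns a score near 1.0 for accounts with very similar writing habits, which is exactly why public discussion text is more revealing than it looks.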
Who can analyze my edits?
All of your edits are public, which means anyone with the resources or knowledge can analyze trends in your edit history: when you edit, what words you use, which articles you have edited. As technology has advanced, so have the tools for analyzing user data, ranging from basic edit counters to complex anti-abuse machine learning systems such as ORES and some anti-vandal bots. Academics have begun using this public data to develop machine learning and artificial intelligence models for combating abuse on Wikipedia, and volunteer developers have created systems that use natural language processing to help identify malicious actors.
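Even the "when you edit" part alone is revealing. As a hedged illustration (not any existing tool), the sketch below builds an hour-of-day histogram from a hypothetical list of edit timestamps and measures how much two accounts' daily editing rhythms overlap, which can hint at a shared time zone or daily routine:

```python
from collections import Counter
from datetime import datetime

def hour_histogram(timestamps):
    # Normalised distribution of edits over the 24 hours of the day.
    counts = Counter(ts.hour for ts in timestamps)
    total = sum(counts.values())
    return [counts.get(h, 0) / total for h in range(24)]

def overlap(hist_a, hist_b):
    # Histogram intersection: 1.0 means identical editing rhythms,
    # 0.0 means the two accounts never edit in the same hours.
    return sum(min(a, b) for a, b in zip(hist_a, hist_b))
```

For example, an account that edits only in the evening in one time zone will show a low overlap with an account that edits only during office hours, and a suspiciously high overlap with its own alternate account.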
As with anything, these technologies can be abused. That is one of the risks of an open project: an oppressive government or a big company can invest in this and download the Wikimedia Dumps. They can go even further and cross-check the data with social media posts. While this is unlikely in most cases, in areas of the world where free speech is limited, you should be conscious of what information you share on Wikipedia and other Wikimedia projects.
Besides external entities, volunteers have been building such tools to help Checkusers do their job better, with the potential to limit access to private data. One such tool is already in use on several wikis, but the developer makes it available only to the Checkusers of each wiki. The tool does not just give you a number; it builds plots to make decision-making easier.
Can we ban using AI tools?