Grants:IdeaLab/A Tor Onion Service for Wikipedia
What is the problem you're trying to solve?
Users who want to read Wikipedia while protecting their privacy can already do so with the Tor Browser Bundle. However, an onion service would increase their privacy, because their traffic would never leave the Tor network.
Discussions about Wikipedia and Tor have recurred since 2006, in particular about allowing users to edit Wikipedia via Tor. The main obstacle is that there is currently no known way to prevent abuse over Tor, in particular by sockpuppets. This proposal touches on that only tangentially, as its objective is not to argue for broadening the existing policies for editing over Tor.
What is your solution?
The solution consists of building a proxy serving Wikipedia for reading and editing as an onion service as part of the Tor network.
What are your goals for this project?
- Give users a new way to access Wikipedia with higher guarantees for their privacy
- Support Tor as a privacy-supporting technology
Eventually, if this project is successful, this proxy would be managed directly by the Wikimedia Foundation, so as to eliminate any unnecessary third party that could act as a man-in-the-middle.
A Tor onion service (or hidden service) is a site a user can visit, or a service, that uses Tor technology to provide security and, if the owner wishes, anonymity to its users. Examples of hidden services are the secure messaging app Ricochet and the Internet Archive onion proxy.
This proposal argues that an onion service for Wikipedia would be potentially useful for readers, with little impact on the current editing process.
At the moment, editing Wikipedia over Tor is blocked and, to my knowledge, there are no .onion services serving Wikipedia. Users wishing to edit Wikipedia using Tor need to ask for a special IP block exemption.
The idea is setting up a service that:
- proxies Wikipedia so that it can be read through the hidden service;
- uses OAuth (which is enabled on Wikipedia) so that a registered user can edit Wikipedia over the service under their own username (only for users who have already received an IP block exemption).
This proposal would not result in any change to the current policies, because users wishing to edit would still need to request the IP block exemption. An additional interesting step would be setting up a procedure to request such an exemption over Tor. In this sense the No open proxies policy remains the same.
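The access rule described above can be sketched in a few lines of Python. This is only an illustration of the policy, not the code of any existing proxy: the `User` class and `is_request_allowed` function are hypothetical names, and the group name `ipblock-exempt` is assumed to match the MediaWiki group that carries the IP block exemption. How the proxy would learn a user's groups after the OAuth login is left out.

```python
from dataclasses import dataclass, field
from typing import Optional

# Assumption: this string mirrors MediaWiki's "ipblock-exempt" group.
# How the proxy obtains group membership after OAuth is not shown here.
IP_BLOCK_EXEMPT_GROUP = "ipblock-exempt"


@dataclass(frozen=True)
class User:
    """A user authenticated through OAuth (hypothetical model)."""
    username: str
    groups: frozenset = field(default_factory=frozenset)


def is_request_allowed(action: str, user: Optional[User]) -> bool:
    """Decide whether the onion proxy should forward a request.

    Reads are open to everyone. Edits require a logged-in user who
    already holds the IP block exemption, so current policy is unchanged.
    """
    if action == "read":
        return True
    if action == "edit":
        return user is not None and IP_BLOCK_EXEMPT_GROUP in user.groups
    # Any other action is rejected in this sketch.
    return False
```

For example, `is_request_allowed("edit", None)` is rejected, while the same call with a user whose groups include `ipblock-exempt` is accepted.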
How an onion service benefits Wikipedia readers
- We know that Wikipedia, i.e. the Wikimedia Foundation's servers, has been subjected to mass surveillance by national agencies. A 4th Circuit court in the U.S. recently ruled unanimously that this activity provides standing to the Wikimedia Foundation in its lawsuit against the NSA.
- Making it possible to visit Wikipedia anonymously reduces the risk that the Wikimedia Foundation could be legally coerced into providing information about its visitors.
- An onion service provides a proxy that, because of the architecture of hidden services, cannot be subjected to censorship by governments, as happened in the recent case of Turkey.
- A hidden service safeguards users against malicious Tor exit nodes. Given how Tor works, exit nodes are in a position to eavesdrop on communications with the destination website that the user wants to reach (in this case, Wikipedia). Research has shown that if a user browses multiple websites and leaks information about their identity on one of those sites, then a malicious exit node could use this information to de-anonymize the traffic to other websites coming from the same user. In the case of Wikipedia, this risk is limited by the fact that Wikipedia is served over HTTPS with HTTP Strict Transport Security.
While the benefits described in the first two points above can be obtained by visiting the wikipedia.org website with a Tor-enabled browser, a hidden service is necessary to prevent the risk associated with malicious exit nodes.
How a Wikipedia onion service benefits our mission in general
It can be argued that making Wikipedia more widely available is well within the mission of our movement. In addition, providing Wikipedia over Tor would promote awareness of Tor itself as a technology for protecting user privacy: making a widely used website such as Wikipedia accessible via an onion service would not only give privacy-concerned users an additional way to browse Wikipedia, but it would also make the ominous "darknet" a little more similar to the "clearnet", i.e. the Internet. This is essential for spreading the usage of Tor among "average" users.
Can this service be exploited by malicious editors (vandals, spammers, sockpuppeteers)?
No. Unregistered users will still be blocked, and registered users (editing under their own username) will still need to apply for an IP block exemption.
If the service becomes widely used, it could obscure some of our analytics, in the sense that we would simply see an increase in the traffic reaching Wikipedia from the Tor network.
Which challenges do we need to overcome for this project to succeed?
What software do we need?
We need software for proxying Wikipedia and providing OAuth authentication to users. Some solutions already exist and could be improved or adapted to this use case:
- MediaWiki-proxy on GitHub written by former WMF employee Chris Steipp;
- iaproxy.php by Hacker Factor, the user who set up the Internet Archive onion proxy.
What about setting the service up and maintaining it?
- we need the expertise to set up and maintain an onion service;
- a similar service set up for proxying the Internet Archive was attacked by various misbehaving bots, so this service should be set up and configured to mitigate these attacks without affecting regular users. There are mitigation steps available here.
- DDoS against the onion/hidden service. This won't impact Wikipedia, but it could make the service unavailable. There are mitigation steps available here.
- Constant upkeep. New versions of Tor are released often (every few months). This software would need to be kept up-to-date.
How does Tor work?
In very general terms, a connection over Tor to a website on the Internet (also known as "the clearnet") works as follows. First, the Tor-enabled browser (e.g. the Tor Browser Bundle) requests a "circuit" over Tor. Second, the browser establishes an encrypted connection with the Tor network; the connection consists of three hops: an entry node (a "guard"), a "middle" node and an "exit" node. At each hop a layer of encryption is removed, so that no node has complete knowledge of the connection. Connections to websites and services on the clearnet emerge only from exit nodes.
A hidden service is set up on a server, which connects to the Tor network, randomly picks some relays in the network and asks them to act as introduction points. Then, the hidden service assembles a hidden service descriptor, containing its public key and a summary of each introduction point. From this process a 16-character string identifying the service is computed; this string followed by .onion is the address of the hidden service. The hidden service and its introduction points are then advertised in a public shared database available through the Tor network.
When a user wants to contact a hidden service, they set up a rendezvous point with the service through one of the introduction points. Finally, the user and the service connect at the rendezvous point within the Tor network. Both the service and the user connect to the Tor network through Tor circuits (as described above), so that their real IPs are unknown to each other.
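To make the link between the public key and the address concrete, here is a minimal sketch of how a 16-character address can be derived in the legacy (version 2) onion address format: the SHA-1 hash of the DER-encoded public key is truncated to 80 bits and base32-encoded. The function name `onion_address_v2` and the placeholder key bytes are assumptions made for illustration; a real service would use its actual RSA public key, and newer onion services use a longer address derived differently.

```python
import base64
import hashlib


def onion_address_v2(public_key_der: bytes) -> str:
    """Derive a legacy (v2) onion address from a DER-encoded public key."""
    digest = hashlib.sha1(public_key_der).digest()
    truncated = digest[:10]  # keep the first 80 bits of the hash
    # 80 bits encode to exactly 16 base32 characters, with no padding.
    label = base64.b32encode(truncated).decode("ascii").lower()
    return label + ".onion"


# Placeholder bytes standing in for a real DER-encoded RSA public key.
example_key = b"not a real key, for illustration only"
address = onion_address_v2(example_key)
print(address)
```

Because the address is computed from the key itself, a client that reaches the service can check that it is talking to the holder of the matching private key.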
With the Tor Browser Bundle you can visit any website on the internet (also called the "clearnet") and onion/hidden services (called the "darknet", since these websites are reachable only via Tor).
A user connecting to the Tor network will be assigned a circuit, i.e. their connections to any website will be routed within the Tor network through multiple nodes (or "hops").
If you want to visit a website on the clearnet (say wikipedia.org), the browser connects to an entry node; the connection is then forwarded to a middle node and finally to an exit node. This connection is set up so that there are multiple layers of encryption that are "peeled off" as the connection is forwarded from one node to the next (hence the name Tor, i.e. The Onion Router).
Exit nodes are the most problematic to operate because they are the point at which traffic leaves the Tor network to reach websites on the internet. They are problematic in two ways:
- if some users abuse the Tor network for nefarious purposes (e.g. they set up a bot inserting spam in the comments of a website, and frequently change circuit so that their IP address changes and cannot be blocked), then the target website will see this traffic coming from the exit node and will mark it as a "malicious server".
- if the user visits a website over HTTP (not HTTPS), then a malicious exit node is able to snoop on the traffic passing through it towards the destination website. Even if the exit node does not know the original IP address of the user, this traffic could contain sensitive information or other data that may help deanonymize the user.
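The "peeling" of layers can be illustrated with a toy model. The sketch below is not Tor's actual cryptography: it uses a simple XOR keystream built from SHA-256 purely for illustration, and the hop keys and function names are made up. It only shows the structure: the client wraps the request once per hop, and each node removes exactly one layer, so the plain request appears only after the exit node has removed the last one. With XOR the order of the layers does not matter; the order is kept here to mirror the description above.

```python
import hashlib
from itertools import count
from typing import Sequence


def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 of the key plus a counter (not secure)."""
    out = bytearray()
    for counter in count():
        if len(out) >= length:
            break
        out.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
    return bytes(out[:length])


def xor_layer(data: bytes, key: bytes) -> bytes:
    """Add or remove one layer; XOR is its own inverse."""
    stream = keystream(key, len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


def wrap(message: bytes, hop_keys: Sequence[bytes]) -> bytes:
    """Client side: apply the exit layer first and the guard layer last."""
    for key in reversed(hop_keys):
        message = xor_layer(message, key)
    return message


# Hypothetical per-hop keys shared between the client and each node.
hop_keys = [b"guard-key", b"middle-key", b"exit-key"]
request = b"GET /wiki/Tor HTTP/1.1"

cell = wrap(request, hop_keys)
after_guard = xor_layer(cell, hop_keys[0])          # guard peels one layer
after_middle = xor_layer(after_guard, hop_keys[1])  # middle peels one layer
after_exit = xor_layer(after_middle, hop_keys[2])   # exit peels the last layer
```

In this model only `after_exit` equals the original request, which is why the exit node is the one position where unencrypted (non-HTTPS) traffic can be observed.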
An onion/hidden service is a website that is served only by the Tor network. When a user connects to an onion/hidden service this connection never leaves the Tor network.
About the idea creator
I am CristianCantoro. I have been a Wikipedian since 2007 and a Wikimedian since 2008, and I served on the board of Wikimedia Italy from August 2010 to April 2017. I have also been operating Tor relays since 2014 (two of them, one an exit node). In real life I am a PhD in computer science in Trento, Italy.
- Advisor I created a similar Tor hidden service for the Internet Archive. The Tor network provides additional threats that cannot be mitigated by IP blocking, since Tor anonymizes the IP. I've spoken with Consonni: if Wikipedia approves this hidden service, then I'll happily port my current code to provide this service and help maintain it. Hackerfactor (talk) 12:14, 10 June 2017 (UTC)
- Volunteer I'm very interested in the Tor project. I have been reading about it because in the future I would like to write a paper for my university; I also think this is a great opportunity to start collaborating with Tor and Wikipedia. 2806:104E:8:238:D00:AD70:A10D:7AF2 02:04, 24 November 2017 (UTC)
- Volunteer It's a great idea! I use both Wikipedia and Tor. I have knowledge of computing, mainly front-end. Unfortunately I have less experience with back-end work, so I don't know if or how I can be helpful. But good luck! --Gratus (talk) 16:55, 24 November 2017 (UTC)
- Volunteer contribute the least i can afford to Darknetekona (talk) 04:59, 21 March 2022 (UTC)
- Yann (talk) 17:42, 5 June 2017 (UTC)
- ProtoplasmaKid (WM-MX) (talk) 17:58, 5 June 2017 (UTC)
- petrohs (gracias) 18:05, 5 June 2017 (UTC)
- Doc James (talk · contribs · email) 05:13, 6 June 2017 (UTC)
- Marcok (talk) 05:37, 6 June 2017 (UTC)
- Pundit (talk) 14:27, 6 June 2017 (UTC)
- Ne0Freedom (talk) 09:07, 8 June 2017 (UTC)
- 陈少举 (talk) 11:38, 8 June 2017 (UTC)
- Atomotic (talk) 15:13, 8 June 2017 (UTC)
- Yawnbox (talk) 04:56, 9 June 2017 (UTC)
- Josve05a (talk) 05:23, 10 June 2017 (UTC)
- Sahaquiel9102 (talk) 23:59, 12 June 2017 (UTC)
- Salvador (talk) 15:29, 19 June 2017 (UTC)
- OwenBlacker (Talk) 21:55, 23 November 2017 (UTC)
- ZoRrO0880 (talk) 02:29, 24 November 2017 (UTC)
- --Alaa :)..! 18:55, 4 December 2017 (UTC)
- --Donald Trung (Talk 🤳🏻) (My global lock 🔒) (My global unlock 🔓) 10:05, 6 December 2017 (UTC)
- it will be easier to access wikipedia in countries where it is censored 18.104.22.168 18:05, 29 March 2022 (UTC)
- ↑ This bug on Phabricator has some references to other discussions
- ↑ «Tor developers use the terms "hidden services" and "onion services" interchangeably», however some argue that the term "hidden service" sounds ominous and prefer to use the term "onion service"
- ↑ From «Nine Questions About Hidden Services»
- ↑ whose address is accessible over Tor at archivecrfip2lpi.onion, see this blog post for more information.
- ↑ Post on the Wikimedia blog: «Victory at the Fourth Circuit: Court of Appeals allows Wikimedia Foundation v. NSA to proceed».
- ↑ "Compromising Tor Anonymity Exploiting P2P Information Leakage" and a blog post on Tor's blog.
- ↑ This concern is also cited in the blog post presenting the Internet Archive proxy.
- ↑ As documented in the «Tor Users» page, there are many reasons why "normal users" should use Tor for browsing.
- ↑ «Attacked Over Tor»