Community Wishlist Survey 2017/Archive/Allow disabling referrer info

From Meta, a Wikimedia project coordination wiki

Allow disabling referrer info

  • Problem: For those unfamiliar with the HTTP Referer header (see w:HTTP referer), an HTTP referrer is a request header that the browser sends when a user follows a link, telling the destination website which page the user came from. For example, if a user goes to the w:en:White House article and then clicks a link to the official website (https://www.whitehouse.gov/), that external website would record that the user arrived from the previous website. The WMF's current referrer policy is "origin-when-cross-origin" (phab:T87276), meaning that an external website, like the White House website, would see just the full domain, like "https://en.wikipedia.org/".

    However, some (if not many) Wikimedian users do not feel secure enough under the current policy whenever they click links to external websites, especially via browsers that support referrer policies, because browsers still give referrer info to external websites. Therefore, this year at the en.WP village pump discussion, the majority favored having their referrer info muted, i.e. "no-referrer", which would conceal the header info from an external website that wants to trace the origin website (i.e. where the user came from). However, one of the WMF staff says that there are no plans to change the referrer policy anytime soon out of concerns over impact on "technical efforts" or aggregated/non-personal statistical data (or something like that).

  • Who would benefit:

    Logged-in editors who are using browsers that (partially or fully) support referrer policies but do not have add-ons/extensions optimized for those browsers to suppress referrer info (i.e. referrer control), e.g. MS Edge;

    those logged-in editors who do not want external websites to find out about their activities in any Wikimedia project (like Wikipedia), even when using a browser that supports a referrer control add-on, like Google Chrome, and even when only the full domain is revealed while using a browser that supports the WMF's current referrer policy;

    logged-in users who want to use a public computer that does not have extensions modifying/suppressing referrer info;

    (if a multiple-option solution is possible) those logged-in editors who would prefer to reveal their full URL to external websites (due to IE11's lack of support for referrer policies, IE11 already reveals the full URL to "https" external websites).

  • Proposed solution: If a multiple-option solution is impossible, how about a gadget instead, as part of the user preferences settings, to opt in to or out of sending referrer info? If a user is using a browser that supports referrer policies, the user would select an option that applies "no-referrer" and/or "never", depending on the browser.

    On the other hand, if a multiple-option solution is possible, how about a user preference setting that allows an individual user to select any type of referrer policy, for browsers that support either version of the referrer policy (newer or older)?

  • More comments: I originally thought I fully understood which browsers supported referrer policies. However, re-reviewing the proposal and comments, I realize that I should have known and thought more. Still, I think it's disheartening that MSIE11 doesn't support referrer policies, including either no-referrer or never. Also, MS Edge and Apple Safari support just an older draft of referrer policy, but that policy would not help those browsers recognize the WMF's current referrer policy. Therefore, during the Proposal phase, a Phabricator task was filed.

    Meanwhile, a user script to suppress referrer info was provided (under the #Discussion section). I'm not sure how it helps whole communities, but it might be useful in case this proposal doesn't pass. If the proposal passes, users should be notified that all versions(?) of Internet Explorer do not support referrer policies at this time, including the "no-referrer" or "never" policy.

    Note that this proposal concerns just the logged-in editors. Unless I'm proven wrong, matters regarding referrer info of logged-out users are out of the scope of the Wishlist and can be discussed at other appropriate venues. Even if those matters were within the scope, any proposal regarding referrer info of logged-out users would not receive huge support, and the WMF would repeatedly rule out a change of referrer policy.

    Referrer control add-ons for browsers supporting them have been suggested as alternatives to this proposal. There have been reviews of those add-ons, mostly good (but some bad). However, they are for users who don't want their Internet activities to be gathered by any website. What Internet users do in general is too broad for and not part of this proposal. Re-edit: I downloaded the extensions and tested them out. See details at the #Discussion section.

    Moreover, the default browser has historically been either Microsoft Internet Explorer, Microsoft Edge, or Apple Safari, depending on which operating system (OS) you installed, like w:Microsoft Windows or w:macOS. If the browser that comes as part of the OS package doesn't support such an add-on, you must use that browser to download another browser that supports such extensions.

    The WMF staff members can explain better than I what referrer info is and how referrer policies are made and which browsers support a referrer policy.

Old version of the proposal

Collapsing messy proposal
  • Problem: HTTP referrer info is detected by simply clicking an external link to a website, risking privacy compromise and spamming which would help external websites (i.e. "https" websites as WMF's current policy has referrer info suppressed from "http" websites) detect where a user was clicking a link from. Even suppressing the https→http referrer info does not make a Wikimedian user feel more secure. Rather, if a user reads a Wikipedia article like the w:White House or w:Central Intelligence Agency and then clicks an external link to an external website, like an official White House website or an official CIA website, another https website would still track a Wikimedian user's "origin"(?) activities, putting a Wikimedian at risk of being tracked. The majority at this year's enWP RFC discussion favored the "silent referrer" option (i.e. never sending domain info to other websites), while some others favored sending either the full domain option, like "en.wikipedia.org" (actually, the current status quo is "origin-when-cross-origin" per phab:T87276), or full URL referrer option. However, one of the WMF staff says that there are no plans to change the policy about referrer info anytime soon out of concerns over impact on "technical efforts".
  • Who would benefit: Users whose browsers (such as Microsoft Internet Explorer and Microsoft Edge) don't disable referrer info and/or don't support add-ons to disable referrer info, those who want to reveal info to external websites (i.e. if they want to enable the referrer info), and other Wikimedian users.
  • Proposed solution: Instead of proposing a policy change, I would like to propose two solutions for users' preferences settings.
    1. Primary solution: multiple option referrer info can offer silent referrer info, full URL, full domain, and partial domain (e.g. "wikipedia.org") and maybe more (according to this site) as options for a user to choose in the Preferences settings.
    2. Alternative solution: a gadget to simply disable/opt-out referrer info by (probably) clicking a box in the settings.
  • More comments: The HTTP referrer is not easy to describe in simple words, and I'm not sure whether the general public knows what "HTTP referrer" is. I was told that browsers (if not websites, including external ones) can still detect IP addresses, regardless of how referrer info is sent.

Discussion

@George Ho: Rephrasing the summary for clarity is welcome. "Referrer info" already exists, so you probably don't ask for it to be added/supported. :) "Allow disabling referrer info" or something like that? --AKlapper (WMF) (talk) 12:27, 7 November 2017 (UTC)

Renamed the title, Andre. BTW, thanks for the clarity suggestion, but I don't know which part of summary to re-clarify. --George Ho (talk) 16:19, 7 November 2017 (UTC)
"Allow disabling referrer info" is already much clearer, thanks a lot! :) --AKlapper (WMF) (talk) 16:24, 7 November 2017 (UTC)

I'll caution commenters that the linked enwiki discussion is very large, disorganized, and full of errors, misinformation, and FUD. A summary of the salient facts:

  • By default, when following a link browsers may include "referrer" metadata in the request. Many sites use this data for various analytics purposes, e.g. to identify external sites that drive traffic to inform decisions about partnerships.
    • For links from sites served over HTTP, this data consists of the full URL of the page containing the link.
    • For links from sites served over HTTPS to HTTPS targets, this data consists of the full URL of the page containing the link.
    • For links from sites served over HTTPS to non-HTTPS targets, no referrer data is sent. This is sometimes called "dark traffic".
  • When Wikimedia wikis were changed to HTTPS-only in 2013, this caused all traffic from our wikis to non-HTTPS sites to become "dark traffic". In some cases this is a significant amount of those sites' traffic.
  • Web standards provide a solution: a site can instruct the browser on which of certain predefined options it would like the browser to use when sending referrer data. Most browsers honor these requests. Popular browsers such as Firefox and Chrome have addons that allow users to instruct the browser to ignore website requests.
    • In 2015, Wikimedia wikis began using this mechanism to request that browsers send only the domain of the page (the "origin", in technical web-standards terminology) as the referrer data in all cases. This was seen as a good compromise between privacy and visibility for the global Internet ecosystem. The Wikimedia Foundation's Security and Legal teams both approved this decision.
  • Then we come to the enwiki discussion. It was driven by a heavily one-sided presentation, inaccurate technical information (e.g. that Wikimedia itself actively sends this referrer data, that some options were even possible, the actual current behavior with respect to HTTPS destinations, that spammers have no workaround for the withholding of referrer data), presentation of options that are more complex and worse for privacy as if they were better for privacy (e.g. collecting data that would be vulnerable to subpoena and data retention orders), irrelevancies, and heavy bludgeoning with hypothetical situations where people affected would be well advised to use other solutions to apply to all their web browsing (e.g. the above-mentioned addons) rather than relying on one website (Wikipedia) to do the "right" thing.

This proposal itself begins with some similar issues, including suggesting impossible options ("partial domain") and irrelevancies (IP addresses, phab:T172009, and many of the linked search results). HTH. Anomie (talk) 14:31, 7 November 2017 (UTC)
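As a rough illustration of the analytics use mentioned in the summary above, a receiving site could group incoming requests by the origin found in the Referer header. This is only a sketch: the function name `tallyByReferrerOrigin` and the "(none)" bucket are hypothetical and not taken from any site's actual code. Requests that arrive without referrer data (the "dark traffic" case) end up in the "(none)" bucket.

```javascript
// Hypothetical sketch: count visits by the origin in each Referer header value.
// Requests with no usable referrer ("dark traffic") are grouped under "(none)".
function tallyByReferrerOrigin(refererValues) {
  const counts = new Map();
  for (const value of refererValues) {
    let key = '(none)';
    if (value) {
      try {
        // Keep only scheme + host (+ port), dropping the path and query.
        key = new URL(value).origin;
      } catch (e) {
        key = '(none)'; // header value that is not a parsable URL
      }
    }
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return counts;
}
```

With the origin-only setting that Wikimedia wikis use, the receiving site could still attribute a visit to en.wikipedia.org, but not to a specific article.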

Struck out the irrelevancies that you mentioned, Anomie. --George Ho (talk) 16:19, 7 November 2017 (UTC)

Anomie and AKlapper, if the multiple-options solution is impossible, I will soon strike through the "primary solution" and revise the proposal to just a gadget to enable/disable referrer info. Sounds good? --George Ho (talk) 23:57, 7 November 2017 (UTC)

I haven't looked at technical feasibility for offering a user preference to make it configurable. The impossibility I referred to is the fact that there is nothing at https://www.w3.org/TR/referrer-policy/#referrer-policies that allows for "partial domain". The options available are:
  • No referrer.[1][2]
  • No referrer for HTTPS→HTTP, full URL for HTTPS→HTTPS.[3]
  • Full domain (e.g. "en.wikipedia.org").[4][5]
  • No referrer for HTTPS→HTTP, full domain (e.g. "en.wikipedia.org") for HTTPS→HTTPS.[6][7]
  • Always send the full URL.[8]
In all cases it's possible to make an exception, sending the full URL for local links (e.g. links from one page on enwiki to another page on enwiki). The current setting on Wikimedia wikis uses this exception, FYI. Anomie (talk) 15:09, 8 November 2017 (UTC)
Modified the "primary solution" proposal. May I leave in struck text, or must I remove it? --George Ho (talk) 19:06, 8 November 2017 (UTC)
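The options listed above can be sketched as a small function that returns what a browser would send as the referrer. This is a simplified sketch, not the browser algorithm itself: the policy keywords are the ones defined in the W3C Referrer Policy document linked above, the mapping of keywords to the bullets is an editorial assumption, and the function name `referrerFor` is hypothetical. "Downgrade" is approximated here as HTTPS→non-HTTPS.

```javascript
// Hypothetical sketch of the referrer value sent under each policy option.
// Policy keywords come from the W3C Referrer Policy document; mapping them to
// the bullets above is an assumption made for illustration.
function referrerFor(policy, fromUrl, toUrl) {
  const from = new URL(fromUrl);
  const to = new URL(toUrl);
  // The fragment (#...) is dropped; only path and query are kept.
  const fullUrl = from.origin + from.pathname + from.search;
  const domainOnly = from.origin + '/';
  const sameOrigin = from.origin === to.origin;
  // Simplification: treat HTTPS -> anything else as a downgrade.
  const downgrade = from.protocol === 'https:' && to.protocol !== 'https:';

  switch (policy) {
    case 'no-referrer': // no referrer at all
      return '';
    case 'no-referrer-when-downgrade': // nothing for HTTPS->HTTP, else full URL
      return downgrade ? '' : fullUrl;
    case 'origin': // full domain in all cases
      return domainOnly;
    case 'strict-origin': // nothing for HTTPS->HTTP, else full domain
      return downgrade ? '' : domainOnly;
    case 'unsafe-url': // always the full URL
      return fullUrl;
    case 'origin-when-cross-origin': // full domain, with the local-link exception
      return sameOrigin ? fullUrl : domainOnly;
    default:
      throw new Error('Policy not covered by this sketch: ' + policy);
  }
}
```

For instance, under "origin-when-cross-origin" a link from https://en.wikipedia.org/wiki/White_House to https://www.whitehouse.gov/ yields "https://en.wikipedia.org/", while a link to another enwiki page yields the full article URL.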

(Note that the "one of the WMF staff" mentioned above is the CTO, so that answer is as authoritative as it gets.)

It seems fairly pointless to provide something like that as a user option. Installing the appropriate browser extension is not any harder than enabling the appropriate user option, provides real privacy (why would you want to suppress referrers for Wikipedia but nothing else?), does not require you to be logged in to Wikipedia, does not require splitting the cache (would not be a problem right now but eventually we want to serve cached pages to logged-in users as well) and does not consume WMF development resources that can be put to better use. --Tgr (WMF) (talk) 00:41, 19 November 2017 (UTC)

I sincerely disagree, Gerzo. The availability of a browser add-on is not a valid reason to be reluctant to create a user option to suppress referrer info. Moreover, offering such a user option is not pointless, even when add-ons suppressing referrer info are provided for browsers and even when browsers offer settings to enable/disable referrer info, like the Opera browser. Installing an add-on/extension is a concern for users who browse multiple websites via the Internet. However, what other websites do to users does not and should not affect this proposal. If it did, I would have agreed with your arguments.

Also, we should not be like other websites that do not allow disabling referrer info, especially ones that don't offer user options to suppress referrer info. True, the user option is for Wikimedia projects, like Wikipedia, Wiktionary, and Wikidata, not non-Wikimedia websites; however, we shouldn't be too dependent on add-ons or Opera, should we? True, logged-out users would still be tracked by going to "https" external websites; moreover, they would not obtain this option. Also, they can download referrer controls for their own browsers. However, this proposal does not concern logged-out users. If it were, the proposal would have been totally different and out of the scope of the Tech Community because it would require policy change (right?), which the WMF's Chief Technology Officer (CTO) ruled out.

Also, this proposal is about allowing logged-in users to modify/disable referrer info, not about using add-ons for browsers during web surfing. Why else deny logged-in users something that would benefit them, besides "consum[ing] WMF development resources"? (I don't know what you mean by "better use", but I have heard complaints about VisualEditor and MediaViewer. Are they still examples of "better use"?) Furthermore, this proposal (I hope) can apply to other projects, like Wiktionary, if that's possible. There are reasons to create a user option: 1) many people in the enWP discussion chose "silent referrer info", even when the CTO says that policy change is unneeded; 2) this proposal would benefit users who would be reluctant to download a non-Microsoft browser; 3) the proposal would benefit users who do not want their activities in Wikimedia projects to be tracked. If there were no reason to create a user option, I would not have proposed the option in the first place. I propose it, and I believe that this proposal would work unless the majority opposes it.

Speaking of add-ons, as said before, Internet Explorer and MS Edge neither support add-ons to suppress the info nor have browser settings to disable the referrer info. Google Chrome, Opera, and Mozilla Firefox can suppress referrer info via an add-on or setting, but a user must download one of those browsers via the initially installed browser, like MS Edge. Also, having more than one browser would take up more hard disk space. BTW, I found a referrer control add-on for an old version of Apple Safari, not the latest version of it. I am unsure whether another add-on, created almost a decade ago, supports the latest version. –George Ho (talk) 08:44, 19 November 2017 (UTC)

IE does not support referrer policies at all. Edge and Safari only partially support them (we probably should do something about that; filed T180921). The other browsers all have plugins. (And if you care about privacy you should use a browser which cares about privacy. We are not helping anyone by giving them a false pretense of privacy when real options exist.) Anyway if you feel strongly about weakening your own privacy and really want a user script for this it's as simple as $(function() { $('head').prepend('<meta name="referrer" content="no-referrer" /><meta name="referrer" content="never" />'); }); --Tgr (WMF) (talk) 22:13, 19 November 2017 (UTC)
Re-reading the proposal, I see some misleading statements, so I struck out some statements and added some underlined ones. Therefore, this proposal should clearly concern the privacy of logged-in Wikimedians. I appreciate your giving out the user script that can suppress the referrer info. However, would that risk creating a lot of pages of user scripts? If this proposal is turned down, what if 50 or 500 or 5000 users want to put the above script into their own user script pages, leading to 50 or 500 or 5000 such pages? Would that create a lot of bytes of data within the Wikimedia servers? –George Ho (talk) 01:33, 20 November 2017 (UTC)
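For readers without jQuery at hand, the user script above could be approximated with plain DOM calls. This is a minimal sketch under stated assumptions: the helper name `referrerMetaTags` is hypothetical, and unlike the jQuery version it runs immediately instead of waiting for the DOM-ready event, so it assumes the `<head>` element already exists when the script runs.

```javascript
// Hypothetical helper: build the same two <meta> tags as the jQuery snippet.
// "no-referrer" is the current keyword; "never" is the keyword from the older
// draft, kept for browsers that only understand that draft.
function referrerMetaTags() {
  return ['no-referrer', 'never'].map(
    (value) => '<meta name="referrer" content="' + value + '" />'
  );
}

// Only touch the page when running in a browser (assumes <head> exists).
if (typeof document !== 'undefined' && document.head) {
  // 'afterbegin' inserts at the start of <head>, like jQuery's .prepend().
  document.head.insertAdjacentHTML('afterbegin', referrerMetaTags().join(''));
}
```

As noted in the comment above, browsers that do not support referrer policies at all (such as IE) would ignore these tags.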
The enwiki DB is something like 100GB currently. I wouldn't be overly worried about that extra 0.001GB or so.
I feel the description of this task is getting more and more confusing and FUDdish. Some examples of hypothetical harm (e.g. how will I be negatively affected when I click on the White House link and the White House learns I visited them from Wikipedia?) would be more helpful. --Tgr (WMF) (talk) 03:33, 20 November 2017 (UTC)
Can you point out which part of the description is more confusing and FUDish, so I'll modify the info more? Thanks.

Anyway, the CIA's privacy policy and the White House's explain that they collect data of activities. I read how the White House shares info and found out that normally it doesn't share such info to others. However, access is restricted to selected "employees, contractors, and vendors", and info may be shared with other government agencies per request by law enforcement or in case that the security of the official website is threatened. However, I could not find how CIA shares automatically gathered info of visitors who surf from other websites, but I found out why CIA gathers the info. I found another section saying, "Anyone using this Web site expressly consents to such monitoring." While the White House is discreet on disclosing or not disclosing any info, (to me unless I'm wrong) the CIA doesn't tell a user how the CIA shares info about activities of users who visit its official website. Also, the CIA doesn't mention "referrer" or "referer" anywhere in the Site Policies page.

If the two are not enough, how about surfing from the w:National Rifle Association to its official website? Somehow, this revision still contains the "http" URL link; however, the link redirects from the "http" address to the "https" address. The NRA's Privacy Policy says that it gathers info from websites that refer to that website, but it doesn't say how the NRA shares the info. The CIA and NRA would speculate/assume that, even when only the full domain is provided as a header, users were transferred from Wikipedia articles that refer to their own respective official websites, and either organization would or would not share any referrer info with others at its discretion, though their websites do not say how info is shared. Speculations about sharing info would arise, but they are mere speculations, right?

Nevertheless, how does https→http (redirect)→https referrer info work if https→http referrer info is concealed from unsecured "http" websites? --George Ho (talk) 04:30, 20 November 2017 (UTC); edited, 07:00, 20 November 2017 (UTC)

--- If you want speculated harm, I would say that the CIA or NRA would use referrer info of originating websites that refer to their own official websites. I don't know what either organization would do to Wikipedia articles, and I don't know which part of info from those articles either organization would use. Another example is clicking from w:Democratic Party (United States) to its official website. The DNC's privacy policy reveals whether to treat info as either personal or "aggregate only" and how they share any personal info with selected parties. The DNC says, "Nothing herein restricts the sharing of aggregated or anonymized information, which may be shared with third parties without your consent." On the other hand, the GOP's says that it treats referrer info as part of "aggregated data" and as anonymous data, but the GOP's website also relies on Google Analytics. Probably those organizations would promote their own goals to Wikimedians. Hmm... at least PFLAG doesn't collect referrer info unless for special(?) purposes. --George Ho (talk) 04:51, 20 November 2017 (UTC); edited, 04:53, 20 November 2017 (UTC)
--- How does Google share such info of Wikimedian users who click one of its links, like Google Books, used as a source to cite information? If referrer info is non-"personally identifiable", then it can be shared publicly. The 2012-2015 version said that aggregated data may have been publicly shared. One of the 2010 versions doesn't mention(?) how referrer info was shared; let's compare other past privacy policies. --George Ho (talk) 05:37, 20 November 2017 (UTC)
--- I saw some other websites saying that they treat referrer info as part of aggregated data. --George Ho (talk) 05:39, 20 November 2017 (UTC)

Gerzo, I re-read one of your comments and the website that you gave out, which was previously mentioned in the en.WP discussion, and I realized that MS IE11 doesn't support referrer policies. Therefore, I decided to collapse the original (but messy) description of the proposal. I'll re-expand the newer description soon. --George Ho (talk) 08:44, 20 November 2017 (UTC)

Regarding the https→http traffic data, I tested websites with developer tools on Google Chrome and MS IE11. IE11 does conceal "https" referrer info away from "http" websites. On the contrary, according to the browser developer tools, Google Chrome gives referrer info to any external website, either https or http. --George Ho (talk) 12:02, 20 November 2017 (UTC)

@George Ho: Remember when I said the enwiki proposal suffered from bludgeoning of the discussion and irrelevancies? It seems you're starting to do the same thing now in this discussion section, with all this digression into the posted privacy policies of various random websites.

I also fail to see why (as stated in your new proposal) someone would be concerned about websites seeing "https://en.wikipedia.org/" as referrer info but wouldn't be concerned about websites seeing referrer info from any other website, or why someone worried about referrer info would insist on using a browser that does not give them the ability to suppress it globally merely because it happens to be the default browser included with their OS. You seem to be asserting both of these points as givens. Anomie (talk) 14:51, 20 November 2017 (UTC)

At first I was reluctant to download the extensions. However, I guess privacy policies are not convincing enough. Therefore, I downloaded Keepa.com's referrer control extension. I made filters. However, the filters didn't help block referrer info sent to The Guardian. The Guardian site still retrieves the info. They also didn't help while clicking from one article to ICIJ. Nonetheless, it can block any referrer info sent from one Wikimedia/Wikipedia page to another Wikimedia/Wikipedia page. I'll do some more tests on this extension and other extensions soon. --George Ho (talk) 18:26, 20 November 2017 (UTC)
--- Wait... After a few more tests, the filters do work. At first, I thought I had to insert site filters individually. However, I realized that the extension has settings to block referrer info for all sites. However, I have to download the extension for each individual computer, like a private PC. I haven't yet seen public computers filtering out referrer info. If public computers in some or many places (e.g. libraries and cafes) don't use any extension, referrer info can still be transferred. Must I try a public computer? --George Ho (talk) 18:37, 20 November 2017 (UTC)
--- I tested ScriptSafe and found it also useful, but the settings say that selecting an option to suppress referrer info in all domains would lead to issues. I am unsure what the Japanese extension does, since its language is Japanese, which I do not understand. Therefore, I did not download and test it. --George Ho (talk) 19:14, 20 November 2017 (UTC)

Basically, as I understand it, the claim of the current proposal is:

  • Some Wikimedians feel unsafe because of the White House learning that they have arrived from an (unspecified) Wikipedia page when they visit its pages.
  • Said Wikimedians do not feel unsafe about the White House learning their IP address, browser, operating system etc. (if they did, they would be using Tor or other privacy-protection technologies and the issue would be moot); it's specifically the referrer that's making them feel unsafe.
  • Said Wikimedians do not feel unsafe about the White House learning how they arrived (possibly including the full URL) if they visit its pages through any other website (if they did, they would use some browser setting/plugin to disable referrers on all domains and the issue would be moot); it's specifically visiting via Wikimedia sites that's making them feel unsafe.

To be quite frank I have serious difficulty believing this. --Tgr (WMF) (talk) 01:00, 21 November 2017 (UTC)

George Ho: Victoria Coleman, the WMF's Chief Technology Officer, posted a response in September that said that the Foundation has no plans to change the referrer policy. The Community Wishlist Survey is not an appropriate place to relitigate that decision. I'm going to archive this proposal. -- DannyH (WMF) (talk) 01:20, 21 November 2017 (UTC)