Talk:Requests for comment/User site behavior collection

From Meta, a Wikimedia project coordination wiki

Preliminary thoughts

Hi. Thanks for starting this page. Have you read/digested this page: experiments? --MZMcBride (talk) 02:09, 27 February 2013 (UTC)

I actually hadn't read that page, so thanks for the link. I feel I've gotten almost everything covered except the opt-out part. As this experiment is targeted at non-logged-in users, I welcome any thoughts on how I should go about allowing an opt-out. I feel the most obvious option would be to add some text in a banner -- but that may be disruptive in and of itself. I could perhaps only do this on HTTP access -- as presumably anyone on HTTPS cares more about their privacy?
Addressing the concerns that this experiment commoditizes the reading community -- while there is merit in that argument, my true intent here is to attempt to minimize the disruption that fundraising causes. I, along with the rest of the team, would love to not blast people with banners. Mwalker (WMF) (talk) 06:19, 27 February 2013 (UTC)
Isn't there a "Do Not Track" header now? Maybe something to consider. --MZMcBride (talk) 23:42, 9 April 2013 (UTC)
I can certainly respect the DNT header for browsers other than IE, but unfortunately MS decided to set it by default, thus rendering it pretty useless on that browser. Mwalker (WMF) (talk) 23:56, 9 April 2013 (UTC)
The DNT header is aimed at tracking by third parties, not by the first-party site the user is deliberately visiting. (We could, I suppose, treat it that way anyway, but it wouldn't be spec-compliant.) For more details, see the draft W3C spec. LVilla (WMF) (talk) 23:59, 9 April 2013 (UTC)
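For anyone following the DNT exchange above, here is a minimal sketch of what "respect the DNT header for browsers other than IE" could look like on the server side. It is only an illustration, not the collection code proposed in the RFC: the function names, the plain-dict representation of request headers, the `trust_ie_dnt` flag, and the crude "MSIE" user-agent check are all assumptions made for this example. It also treats DNT as a first-party opt-out, which, as noted above, goes beyond what the draft spec asks for.

```python
def get_header(headers, name):
    """Case-insensitive lookup of an HTTP header in a plain dict (hypothetical helper)."""
    wanted = name.lower()
    for key, value in headers.items():
        if key.lower() == wanted:
            return value.strip()
    return None


def should_collect(headers, trust_ie_dnt=False):
    """Return True if this request may be included in the sample.

    Assumption: a request carrying "DNT: 1" is treated as an opt-out,
    except when the user agent looks like Internet Explorer, where the
    header may have been set by default rather than by the user.
    """
    if get_header(headers, "DNT") != "1":
        # No header, or an explicit "0": the visitor has not opted out.
        return True
    user_agent = get_header(headers, "User-Agent") or ""
    if "MSIE" in user_agent and not trust_ie_dnt:
        # Crude IE detection (an assumption); DNT is ignored for this browser.
        return True
    return False
```

Passing `trust_ie_dnt=True` would honor the header on every browser, which is the more conservative choice if the default-on behavior is not considered a problem.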

Hey Matt. The page MZ pointed out hints at something which I think is extra confusing about this request for comment... if you're talking about generic "user behavior" data collection, WMF has been doing it for years, and the framing of the discussion makes it sound like any data collection (anonymous or not) about users would be something new. If you want more feedback, I'd suggest paring down the page to specifically ask what I think you're asking: can we track and aggregate visitor sessions in general, to support the fundraiser? I see you have a motivation section, but when it comes to testing and data collection, it really really helps clarify what we're doing and why if you ask at least one clear question. Posing a hypothesis or the like would help us figure out whether the data you might collect is really necessary to do so, and whether the community agrees it's worth it. Also, it would help if you made it clear whether you're also asking WMF legal or not. Steven Walling (WMF) • talk 19:50, 5 March 2013 (UTC)

Steven, thanks for your comments. In this case I chose not to ask for generic permission. I wanted to be very clear about what I was asking permission for -- which is the aggregated statistics. I will be updating the page momentarily after feedback from Dario about the length of the study and the collection methodology. I also plan to update the motivation section to include hypotheses and why I need to do this specifically to gather the data. Mwalker (WMF) (talk) 08:18, 8 March 2013 (UTC)

Further thoughts

Hi. I finally had a chance to read this page. Looking at the collected data, I couldn't help but think the whole proposal felt a little like "let's put Google Analytics on the site for fifteen minutes." That doesn't seem to be what's actually being proposed (thankfully), but collecting this type of information and studying it at such a close level... I don't know.

The retention section doesn't seem to actually answer how long this data will be retained. Presumably there will be the raw collected data and then aggregated/summarized data from that, right? This should be clarified.

It would also be great if you could add a sample or example of raw data that's being collected, so that people (including me) can get a better understanding of what we're talking about here.

I've cleaned up some of the page and I'll likely do a bit of further cleanup. What's the timeline (broadly, even, if a more specific one isn't available) for this RFC? --MZMcBride (talk) 00:23, 10 April 2013 (UTC)