Experiments

From Meta, a Wikimedia project coordination wiki
Just brainstorming. More on the talk page.

This is a guiding document on the use of experiments on Wikimedia wikis.

Current practices in Web analytics reflect their commercial origins. For better or worse, the greatest motor behind the use of Web analytics has been the profit interests of online retailers and social networks, for whom the user is a commodity. These profit interests have profoundly shaped the discourse of Web analytics, setting both the tenor and the tone of debate (consider the values implicit in "funnels," a term of art).

A thoughtless application of Web analytics to Wikimedia wikis would import a moral outlook that is incompatible with (and, indeed, rightfully offensive to) their communities. It also wouldn't work well, because neither Wikimedia wikis nor their editing communities are for sale. It is therefore crucial that technical efforts be accompanied by a process of reflection, the goal of which should be to articulate criteria for Web analytics that express and promote the broader ambitions of the Wikimedia movement and the moral commitments that underlie it.

Background

Experiments on Wikimedia wikis, generally conducted by the Wikimedia Foundation, have become increasingly common in recent years. In 2009, the Usability team conducted a number of experiments, though the results were never publicly released. In 2012, there was a dedicated Editor engagement experiments team (E3), which was disbanded in 2014.

Some elements of said experiments are worth noting.

  • They number in the dozens and are usually documented in the Meta-Wiki Research namespace.
  • Their outcomes often do not feed into any concrete deliverable, such as a merged change to MediaWiki core PHP code or a peer-reviewed paper.
  • Sometimes changes that are known to be potentially harmful, and that would never (or hardly) pass standard code review, are deployed as "experiments" to bypass tougher public scrutiny. (The same is true of fundraising banners, whose poor translations since 2011 have often actively damaged public opinion and understanding of Wikimedia projects.)

These experiments have varying goals and motivations, but we've reached a point where clearer guidelines are needed about what is and is not appropriate. Clear communication (like [1]?) is particularly important, to avoid the kind of backlash Facebook faced in June-July 2014 over an experiment that had nonetheless been approved by an ethics board at the researchers' university.

Principles

Dignity and collegiality of all

Anyone working with Wikimedia editors has to treat them as colleagues, not as customers. Experiments are often an attempt to optimize human behavior and workflow. There's nothing inherently wrong with such goals, but Wikimedians are not customers of their wikis in the way that users of Facebook are customers of its site. Would you go into the office of someone you work with and start messing with them to optimize their behavior? Surely not. But this is exactly the type of behavior the Wikimedia Foundation is now engaging in: disrupting the work of long-time editors in the name of questionable experimentation.

Implementation

Mitigation

Experimentation comes with high costs and high risks. Including people in a test group who ought not to be tested adds considerable cost and risk without any benefit.

Smarter code should ensure that only editors who meet specified criteria load extra code (JavaScript). A number of factors can be taken into account when determining whether to load extra JavaScript for a particular user, including:

  • user's logged-in status;
  • user's edit count;
  • user's registration date; and
  • whether the user is using the default skin.
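As a rough illustration, the factors above could be combined into a single eligibility check that runs before any experiment code is loaded. This is only a sketch: the `UserContext` and `ExperimentCriteria` shapes, the function name, the thresholds, and the skin names are all hypothetical and are not taken from MediaWiki or from any existing experiment.

```typescript
// Hypothetical description of the current user; the field names are
// illustrative and do not correspond to any MediaWiki API.
interface UserContext {
  isLoggedIn: boolean;
  editCount: number;
  registeredAt: Date | null; // null for logged-out users
  skin: string;
}

// Hypothetical criteria an experiment might declare up front.
interface ExperimentCriteria {
  maxEditCount: number;      // only users at or below this edit count
  registeredOnOrAfter: Date; // only accounts created on or after this date
  defaultSkin: string;       // assumed name of the wiki's default skin
}

// Returns true only if the user meets every criterion. Any failed check
// means the extra JavaScript should not be loaded for this user.
function isEligibleForExperiment(
  user: UserContext,
  criteria: ExperimentCriteria
): boolean {
  if (!user.isLoggedIn || user.registeredAt === null) {
    return false;
  }
  if (user.editCount > criteria.maxEditCount) {
    return false;
  }
  if (user.registeredAt.getTime() < criteria.registeredOnOrAfter.getTime()) {
    return false;
  }
  // A non-default skin is treated as a signal to leave the user alone.
  return user.skin === criteria.defaultSkin;
}

// Example values (all made up for illustration).
const criteria: ExperimentCriteria = {
  maxEditCount: 100,
  registeredOnOrAfter: new Date("2014-01-01T00:00:00Z"),
  defaultSkin: "vector",
};

const newcomer: UserContext = {
  isLoggedIn: true,
  editCount: 3,
  registeredAt: new Date("2014-06-15T00:00:00Z"),
  skin: "vector",
};

const veteran: UserContext = {
  isLoggedIn: true,
  editCount: 25000,
  registeredAt: new Date("2005-03-01T00:00:00Z"),
  skin: "monobook",
};

const newcomerEligible = isEligibleForExperiment(newcomer, criteria);
const veteranEligible = isEligibleForExperiment(veteran, criteria);
```

With these example values, the newcomer passes every check and the long-time editor on a non-default skin is excluded. Failing closed (returning false as soon as any criterion is not met) keeps the test group as small as the experiment's stated criteria allow.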

Looking at and using smarter metrics is important. If a user is using a non-default skin, it's fairly safe to assume that they don't want their interface messed with. Experiments on these users should undergo the most scrutiny and require the most consideration.

Opt-out

Any and all experiments should have an opt-out feature. However, the existence of an opt-out is not a license to make an experiment more obnoxious. Several years ago, the Usability Initiative provided an opt-out preference ("<vector-noexperiments-preference>") for development of Vector. The current experiments team also respects this opt-out where it is relevant. Further reading: How can I opt-out of experiments?
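One way to honor an opt-out is to check it before anything else, so that an opted-out user never loads any experiment code at all. The sketch below assumes the preference has already been read into a boolean; the function name and the experiment names are hypothetical, and how the preference is actually stored and looked up is not shown.

```typescript
// Returns the experiments whose code may be loaded for this user.
// `optedOut` is assumed to reflect the user's opt-out preference;
// `candidates` is a hypothetical list of experiment names the user
// would otherwise qualify for.
function experimentsToLoad(optedOut: boolean, candidates: string[]): string[] {
  // The opt-out is absolute: it overrides every other criterion.
  if (optedOut) {
    return [];
  }
  // Return a copy so callers cannot mutate the input list.
  return candidates.slice();
}

// Example with made-up experiment names.
const candidates = ["experiment-a", "experiment-b"];
const forOptedOutUser = experimentsToLoad(true, candidates);
const forOtherUser = experimentsToLoad(false, candidates);
```

Checking the opt-out first, rather than inside each experiment, means a new experiment cannot accidentally skip it.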

See also