Thanks for sharing your thoughts! I find this really interesting, and it's something I think about a lot with regard to New Readers. When considering how to structure our work and how to think about readership in the context of the larger Movement/Foundation goals, we've been diving into logic models of different kinds and how they relate to a theory of change. If we can develop an understanding of the impact we hope to have in the world, and how we hope to achieve it in the context of the real world, then we can draw out cause and effect relationships between what we do, what we make, what happens, and outside forces. More than that, we can use evaluation to help us understand how these pieces are related and where our understanding is flawed so that we can revisit those assumptions. I won't get deep into where we've landed on that, but it has gotten me thinking about the broader Movement and how the many pieces and people fit together.
With that framing, I really like how you've pulled out the primary and secondary impact pieces. I've been wondering whether it might be worthwhile to further tease out how the different parts (and therefore measurements) relate to each other and what levers we (as individuals, teams, organizations) can pull that will have a ripple effect... I haven't made the time yet to take a serious pass at it; it's a huge undertaking. I hope that the movement strategy helps us build this kind of understanding. AGomez (WMF) (talk) 21:07, 15 March 2017 (UTC)
- Hi AGomez - thanks for your comments! Out of interest, is your team's work on logic models and theory of change public anywhere? It would be interesting to read it. And yes, the movement strategy process will doubtless think about impact one way or another. I'm hoping that it's explicitly discussed, rather than simply being implicit in some of the conclusions. Chris Keating (The Land) (talk) 14:51, 16 March 2017 (UTC)
- Hey Chris Keating (The Land)! The outcomes from that work are in our annual plan, which follows a similar (pared down) structure. It doesn't include all the measurements we'll be using, but some top-line deliverables and checks. That'll go out April 1 (but I think you should have access to the drafts across the org already). Here's a link to the doc we were working from. It's much more a scratch pad than something intelligible, which is why I haven't published it beyond our team, but it definitely helped us develop our thinking and alignment as a team, and spot the gaps in assumptions. AGomez (WMF) (talk) 15:52, 16 March 2017 (UTC)
Apparently there was a research proposal way back in 2014, which contains a decent list of the (relatively) easily quantifiable factors. For most of them we have actually built the measurement infrastructure since then (content persistence metrics being the big exception), but we have put very little work into making that infrastructure useful (e.g. how do you measure the effect of an editathon that expanded several articles? we have the raw data available, but that is only a part of the story.) --Tgr (talk) 22:09, 30 December 2019 (UTC)
Visual Editor metrics
Important that technology developments are still impact-driven but this still faces challenges of measurement. For instance, Visual Editor should make a big difference to the experience, happiness and productivity of new contributors. Is there a framework to assess how good it is at doing this?
The answer is Yep! This was done.
The WMF did a controlled study. For one week all new users were randomly split into a control group and an experimental group. Half were given just the EDIT link to the wikitext editor, and half were given an EDIT link to Visual + EDIT SOURCE link for wikitext. Research:VisualEditor's_effect_on_newly_registered_editors/May_2015_study
The short answer is that VE completely failed to provide the hoped-for benefits. The results were either no-effect or advantage-wikitext. VE assisted an additional zero percent of new users to make a first edit. There was zero increase in short term retention of new users. (And at my request, a later examination found no change in medium term retention of editors.) There was zero increase in total contributions. There was no change in how many users got blocked. Users given VE were significantly more likely to abandon edits without attempting to save, and when they did attempt to save they were significantly less likely to do so successfully. Making a successful edit with wikitext typically required 35 seconds. Making a successful edit with Visual Editor typically required 2 minutes.
There is (or was) a second proposed 2016 study on the effect of defaulting new users into VE vs defaulting them into the wikitext editor. However, it hasn't been done, at least not yet.
As a follow-up to the 2015 study I suggested/requested a more in-depth examination of the data, for VE usage over time. (Do users begin in a random editor, then settle on the "better" one over time?) Unfortunately that part of the data never got analyzed. However, we don't really need to look at the study data for that. Since the study, we have general global editing data. For years now a substantial percentage of new users have begun in VE, trying the "Edit" link rather than the "Edit Source" link. When we look at global editing statistics, approximately 95% of all edits are made using wikitext and about 5% are done using VE. Either people who start out with VE overwhelmingly shift to wikitext as the primary/better tool for the job, or people who used/preferred Visual just don't stick around as significant long-term contributors.
Visual editor was built on the theory that new users were being scared off by wikitext. It was built on the theory that VE would make editing easier, that it would bring in more new people, that it would improve new editor retention. That theory turned out to be wrong.
Now that we do have VE, some people do prefer it and many people (including me) find it valuable at least for occasional specialized edits. Ok, that's fine. However, I've been seeing serious problems with the WMF's direction regarding Visual. The original theory was very appealing: it offered an actionable plan that VE would make things easier for new editors, that VE would be the holy grail of bringing in more people. A lot of people put a lot of hard work into building VE. They invested a lot in trying to make that vision into a reality. The problem is that general WMF-culture doesn't seem to have accepted that the original theory was wrong. It seems some staff (or at least some of those in charge) have been making well-intentioned but harmful decisions, based on a long term investment in that theory.
If you are interested, I am more than happy to cite specific examples where this is causing problems. But addressing multiple examples would turn this post into a wall of text, and this post is already getting sizable. Alsee (talk) 23:00, 9 May 2017 (UTC)
- Hi Alsee! Hmm, that's interesting - hadn't seen it. Certainly looks like good evidence that VE as of 2015 was not having the effect we'd hoped for. Though I wouldn't yet throw out the hypothesis that wikitext editing puts people off and that visual editor is part of the solution; if between 2013 and 2015 VE went from significant negative to neutral, that might suggest that by 2017 there's a chance the product has further improved to be a net positive; and I'd want to look at other evidence around the overall hypothesis. And on the broader point, good that WMF is rigorously testing the impact of its features on outcomes - though is this really happening only once per 2 years for features this significant, and if so why? (I imagine I could find some of the answers myself with a bit more reading time :) )
- I'd be interested in your views on WMF's overall direction on VE and any specific problems you've seen, though maybe that would be better on my main talk page, to help keep this one focused on impact and metrics. (You've probably noticed I'm standing in the Board election? Just to be clear, if I'm elected I'm not going to start handling feedback on specific features, as that's not an appropriate role for a Board member, but I would nonetheless be interested to hear your thoughts). Chris Keating (The Land) (talk) 09:28, 10 May 2017 (UTC)
This essay largely feels like an overview of criteria to assess how valuable a specific Wikipedia content item is, with an appendix on prerequisites. I liked the mention of uniqueness and the reductio ad absurdum on whether one should only care about English Wikipedia popular articles. --Nemo 16:31, 13 May 2017 (UTC)
- Yes, it sort of is, on the assumption that you can sum the value of all Wikipedia content items and end up with a sum for the whole Wikipedia - and also on the assumption that similar criteria apply to other projects (I had Commons in mind to an extent when I wrote it, but not much else. Wikidata in some ways doesn't fit the model I'm using in this essay). Chris Keating (The Land) (talk) 18:42, 13 May 2017 (UTC)
I think this discussion of the WHO priorities is instructive, in a way: http://www.politico.eu/article/bill-gates-who-most-powerful-doctor/ Note how Bill Gates is "blamed" for putting emphasis on certain actions which have measurable outcomes, because other people perceive that other problems would be more important to address. --Nemo 16:57, 13 May 2017 (UTC)
- Quite - a large part of my point with this essay is that there are things we should (and probably could) measure but so far don't measure, or at least not well (this has been particularly informed by conversations about FDC bids and those global metrics ;)). Certainly worth noting that not everything desirable has a neat measure associated with it. Chris Keating (The Land) (talk) 18:46, 13 May 2017 (UTC)
That's a great essay, thanks for writing it! I only found out about it after working on a movement strategy recommendation which arrived at very similar conclusions.
One thing that I think is underemphasized is accessibility (in the wide sense). It's not enough for the article to contain the relevant information; the reader must be able to learn that information from it for the article to have impact. (The criteria for that are of course going to be very different for different readers.) For example, mathematics is pretty well covered in English Wikipedia according to its own metrics, but most participants of this recent Twitter thread seemed to agree that that coverage is not actually useful to the average reader. --Tgr (talk) 22:01, 30 December 2019 (UTC)