Wikimedia monthly activities meetings/Quarterly reviews/Editor engagement experiments/2013-01-15

The following are notes from the Quarterly Review meeting with Editor Engagement Experiments (E3) team on January 15, 2013.

Present: Ryan Faulkner, Dario Taraborelli, S Page, Terry Chay, Maryana Pinchuk, Steven Walling, Ori Livneh, Howie Fung, Sue Gardner, Tilman Bayer (taking minutes)
Remotely participating: Erik Moeller, Matthew Flaschen, Munaf Assaf

Please keep in mind that these notes are, at this point, closer to a rough transcript of the meeting than a polished record of what was discussed.

Rough agenda proposed

The following agenda was proposed by Erik prior to the start of the meeting.

Brief team intro and recap of team's activities through the quarter, compared with goals
Drill into goals and targets: Did we achieve what we said we would?
Review of challenges, blockers and successes
Discussion of proposed changes (e.g. resourcing, targets) and other action items
Buffer time, debriefing

Introduction

Howie: explaining what this meeting is about... review of E3 team's work during the last quarter, and what we learned about editor engagement. Also: possible blockers.

Presentation of team goals and activities

Steven Walling and the rest of the team guided us through a presentation of the team's activities, goals, and related data:

The plan for the previous quarter (October-December 2012) was to focus on low-hanging fruit: -1 to 5 (potentially 10) edits. Projects worked on: Post-edit feedback, account creation (ACUX), Onboarding (aka GettingStarted), Thank you campaign.

Post-edit feedback

Minimal discussion, but a reminder of where the feature is deployed. This experiment had already finished by the start of the quarter, but was productized in the Oct-Dec period.

Account creation (ACUX)

1.5% increase in conversion for v1, 2% for v2, but 1.4% decrease in users who edit in first 24h. Howie: Overall impact on editors is neutral at this point. For v3 (most engineering effort): +4% conversion, 14% decrease in errors (form submit errors of various types).

Erik: Overall impact on editors?

Dario: ACUX is mainly just about "acquisition"; participation afterward was not a focus

Howie: less detailed analysis on v3 impact, but likely similar to v2

Steven: Even for an acquisition experiment, impact on retention is something we need to be aware of, to control for unintended effects.

Onboarding

"Get started with editing", presents a list of tasks

After launch (on enWP), increase in number of users who attempt to edit

Statistically significant increase with launch of GettingStarted (21.7% vs. 16.5% editing mainspace in first 24h), but can't establish a causal relationship yet

Howie: Still, a very encouraging result (confounding factors seem unlikely to have caused such a large change). We can materially move the active editors number

Erik: Proven we are able to get new editors into a mood where they are inclined to make first edits, but not established yet that we can move the needle on sustained active editors

Steven disagrees

Dario: Lots of opportunities, employed a crude SuggestBot-based hack

Terry: We all agree - a potential goldmine, but yet to realize potential

Steven: Conversion funnel analysis: Task list --> view task --> edit --> save

Howie: Funnel analysis allows us to break this up into different UX problems. (e.g. edit/save: problem for VisualEditor)
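
As a rough illustration of the funnel analysis described above, the sketch below computes the conversion rate at each step (task list, view task, edit, save), both relative to the previous step and relative to the top of the funnel. The function name and the counts are hypothetical and chosen only for illustration; they are not data from the meeting.

```python
from typing import List, Tuple


def funnel_rates(steps: List[Tuple[str, int]]) -> List[Tuple[str, float, float]]:
    """Return (step name, rate vs. previous step, rate vs. first step)."""
    if not steps:
        return []
    first = steps[0][1]
    prev = first
    rows = []
    for name, count in steps:
        # Guard against empty steps so a zero count does not raise.
        step_rate = count / prev if prev else 0.0
        overall_rate = count / first if first else 0.0
        rows.append((name, step_rate, overall_rate))
        prev = count
    return rows


# Hypothetical counts, for illustration only (not figures from the review).
example = [("task list", 1000), ("view task", 400), ("edit", 200), ("save", 100)]

for name, step_rate, overall_rate in funnel_rates(example):
    print(f"{name:10s} step={step_rate:.0%} overall={overall_rate:.0%}")
```

Looking at the per-step rate rather than only the overall rate is what lets each drop-off be treated as a separate UX problem, as Howie notes.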

Steven: Some little things we can do pre-VE, e.g. avoid articles with big infoboxes on top

Maryana: Onboarding editor quality: Vandalism ratio is about the same

Steven: 3 levers: task type, read mode, edit mode...

Howie: What would be needed to halt the overall decline? about 2000 more active editors YoY on enWP (Annual Plan : flat +1k)

Over 1 month, ACUX+GettingStarted should yield 5-6000 more 1+ editors

GettingStarted much bigger lever than ACUX (ACUX works "around the margins", GS applies to all new users)

Thank you campaign

Dec 27-Jan 1

1107 signups, vs. 20,958 organic ones, edit rate: 20% vs 29% (but: this was a bad time of year for new signups)

Conclusion: Integrating contribution campaign in FR not a "silver bullet"
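
For readers who want to check whether a gap like 20% vs 29% is larger than chance alone would explain, a two-proportion z-test is one common approach. The sketch below is not the analysis the team ran. It assumes the 20% figure belongs to the 1107 campaign signups and the 29% figure to the 20,958 organic ones, and it reconstructs editor counts from those rounded percentages, so the inputs are approximations.

```python
import math


def two_proportion_z(successes_a: int, n_a: int, successes_b: int, n_b: int):
    """Two-sided z-test for the difference between two proportions."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    # Pooled proportion under the null hypothesis of equal rates.
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Signup counts are from the notes; editor counts are ASSUMED, reconstructed
# from the rounded edit rates (20% campaign, 29% organic).
campaign_n, organic_n = 1107, 20958
campaign_editors = round(0.20 * campaign_n)
organic_editors = round(0.29 * organic_n)

z, p_value = two_proportion_z(campaign_editors, campaign_n,
                              organic_editors, organic_n)
print(f"z = {z:.2f}, two-sided p = {p_value:.2g}")
```

A small p-value only says the difference is unlikely to be sampling noise. It does not address the seasonal caveat raised above, which is a confound rather than a sample-size problem.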

EventLogging

mw:Extension:EventLogging

"reliable, real-time data on how users interact with MediaWiki's interface" (vs. legacy ClickTracking). Talked about advantages of schemas (e.g. Schema:AccountCreation). Data model living on wiki.

Ori:

EventLogging is a platform for modelling and collecting metrics on how readers and editors interact with Wikipedia.
It allows us to define a set of datapoints that should be sent to our servers whenever users perform some action.
Single definition of data structure used for experimental design, collection, data validation, storage and analysis.

Design principles
- Nudge users toward good experimental design.
- Validate and sanity check data early and often.
- Minimize chance of programmer error.
- Provide tools and infrastructure to support a workflow based on peer-review and sign-off.
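
To make the "single definition of data structure" and "validate early and often" ideas concrete, here is a minimal sketch of checking an event against a schema before it is stored. It does not use the EventLogging API. The schema layout, field names, and the `validate_event` helper are all hypothetical, loosely inspired by the idea of Schema:AccountCreation.

```python
from typing import Any, Dict, List

# Hypothetical schema: field name -> rules. Not the real Schema:AccountCreation.
ACCOUNT_CREATION_SCHEMA: Dict[str, Dict[str, Any]] = {
    "userId": {"type": int, "required": True},
    "action": {"type": str, "required": True,
               "enum": ["impression", "submit", "error"]},
    "bucket": {"type": str, "required": False},
}


def validate_event(event: Dict[str, Any],
                   schema: Dict[str, Dict[str, Any]]) -> List[str]:
    """Return a list of validation errors; an empty list means the event is valid."""
    errors: List[str] = []
    for field, rule in schema.items():
        if field not in event:
            if rule.get("required"):
                errors.append(f"missing required field: {field}")
            continue
        value = event[field]
        # Exact type match, so that e.g. True is not accepted as an int.
        if type(value) is not rule["type"]:
            errors.append(f"wrong type for field: {field}")
        elif "enum" in rule and value not in rule["enum"]:
            errors.append(f"value not allowed for field: {field}")
    # Reject fields the schema does not define, to catch programmer typos.
    for field in event:
        if field not in schema:
            errors.append(f"unknown field: {field}")
    return errors


print(validate_event({"userId": 42, "action": "submit"}, ACCOUNT_CREATION_SCHEMA))
print(validate_event({"action": "click", "extra": 1}, ACCOUNT_CREATION_SCHEMA))
```

Because the same schema object drives both the check and the documentation of what is collected, analysts and engineers can review one definition instead of two.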

Steven, Terry, Dario: It facilitated cooperation between analysts and engineers.

Dario: also, adds transparency

User Metrics API

Research:Metrics

(work with Ryan F and Aaron) "comparing cohorts in a standardized and automated way". Ryan gave a demo of the API which is currently only for internal use.

Ryan: Based on cohorts (user groups from various experiments)

Motivation: E3 jobs...

Steven: Facilitated cooperation

Howie: e.g. used for GettingStarted

Not yet available on non-enwiki.

Howie: what Grantmaking and Programs needs for evaluation e.g. of outreach events

Dario: metrics would need to work across wikis

Steven: enwiki restriction is implicit in E3 approach - experiment on enwiki, then roll out to other wikis (but not necessarily testing there)

Ryan: Standardizing metrics was important part of work

[break]

Roadmap / discussion

To wrap up the presentation, Steven and the team then presented the results of two product planning meetings for the upcoming quarter (Jan-March 2013).

Productizing GettingStarted

work on task types offered

Apart from copyediting, other easy tasks: e.g. add links

Sue: Takeaways from Jack's work at WikiHow?

Steven: 1. .. 2. doesn't quite cover diversity of possible tasks people might be interested in 3. teaching with fake ("clean") tasks decreases conversion afterward when users go into "real" world

Improve article selection: collaboration with SuggestBot was effective but can be optimized

GuidedTour

mw:Guided tours

Admins can create tours in MediaWiki namespace

Useful application of Echo?

Launch on other wikis this month: frwiki, ..

Ori, Steven: has also led to improvements in MW core and Translate extension (by Matthew)

Howie: should make GS task list into a persistent space

Howie: e.g. task notifications could use Echo platform

Terry: E2 builds levers, E3 figures out how to pull them

Some discussion about the extent to which E2/E3 focuses on new vs. experienced editors, and how the work of E2 and E3 connects.

Ori: among achievements in past quarter: getting some community buy-in among experienced editors

Sue: E3 focus is still on increasing number of new editors, as of now

User Metrics

Dario: EventLogging:

more robust campaign support
real time monitoring and alerts (Ori: created new mailing list for such alerts, also: integrate data integrity issues)
automated analysis
documentation

Ori: work with analytics team (Dan) on integration with Limn

Terry: Client side (calling bits..) vs. MW hooks

User Metrics API (Dario)

with analytics team: Cohort metrics viz (Limn)
Public release
User tags repository

Erik: Proud of E3's work, important. We do need to increase velocity on user interventions that get new users editing. What I'm missing is an overall hypothesis on how the user experience will change for new editors. Worry that without it, some aspects slow us down, e.g. testing for metrics we don't need.

Steven: Agree velocity on UI side could be higher, but developer time was limited. But not lack of comprehensive concept of UX

Dario: Narrowing focus to low-hanging fruit was good

Work on building infrastructure made sense, but slowed iteration of experiments (recall earlier idea of 15 experiments/year)

Erik: priorities on data/analytics side made sense, but on feature side priorities may need more sharpening. New editor workflow will be most important game changer, so would be great to zoom in on that.

Steven: agree on goals, but disagree on lack of grand vision

Sue: that's not what was said

Howie: we didn't close the loop on how E3 works with e.g. Echo

Sue: did not assert integration with E2 should have happened sooner, but that it can happen now. Rather than lack of vision, Erik said hypotheses on numbers were missing

Howie: metrics etc just provided the scaffolding for the conversation you are suggesting

Sue: lot of ground was laid, lots of good things happening. but: historically lot of flailing around, lacking tools and knowledge. now a lot of that work done. serious promise, game changer potential. Do we feel success is in sight?

Howie: can't be answered right now (perhaps in 2-4 weeks). missing: how many go from 1+ editor --> 5+ editors (right now assume ca. 30%)
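
To show how the assumed 1+ to 5+ conversion rate feeds into the targets mentioned earlier, the sketch below applies the roughly 30% figure to the 5-6000 additional 1+ editors estimated for ACUX+GettingStarted, and works backward from the figure of about 2000 more active editors. Pairing these numbers is an assumption made here for illustration. The notes do not state that they cover the same time window, and the function names are hypothetical.

```python
import math


def projected_active_editors(new_one_plus_editors: int, conversion_rate: float) -> float:
    """Editors expected to reach 5+ edits, given a 1+ -> 5+ conversion rate."""
    return new_one_plus_editors * conversion_rate


def required_one_plus_editors(target_active_editors: int, conversion_rate: float) -> int:
    """Smallest number of new 1+ editors needed to hit a 5+ editor target."""
    return math.ceil(target_active_editors / conversion_rate)


ASSUMED_RATE = 0.30  # "right now assume ca. 30%" per the notes; not yet measured

for n in (5000, 6000):
    print(n, "new 1+ editors ->", round(projected_active_editors(n, ASSUMED_RATE)), "5+ editors")

print("needed for 2000 more 5+ editors:", required_one_plus_editors(2000, ASSUMED_RATE))
```

The result is sensitive to the conversion rate, which is exactly the number Howie says is still missing.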

New registrations: Historically seasonal

Opening up this funnel maybe not that promising (currently self-selecting, i.e. additional users might be less qualified/motivated). Sue: maybe after VE

Dario: how does increasing one part of the funnel affect other parts of the funnel

Erik: how does GuidedTour fit into the flow?

Ori: don't have infrastructure right now to support large...

Sue: active editor def (5+ edits) not set in stone

Erik: nothing wrong with it as measure for WMF's success, but need more granular metrics for team-level results

Sue: This was the 1st quarterly review (ever), overall super useful. Will give some feedback to Howie and Erik. E3 team seems in good shape. Seems authoritative in a way it wasn't before. Go to board 2 weeks from now. Would like to be able to say we actually have needle-moving stuff. Board and community want to be able to be optimistic. Started caring about the problem around 2009. Want us to be honest about these numbers, laser-focused on turning them around.

Ori: not sure about way numbers are positioned as incentives of org's/team's work

Howie: Supports data-driven approach, but feels like massive acceleration for this team. Feel enwiki alone not enough, need at least top 5 wikis. Need sharper focus around other teams' work, e.g. Echo: how will it work with this team's work?

Q to team: assume (hypothetically) that you are the only team who can achieve this WMF goal, what do you need?

Steven: at least one backend engineer, and..

Dario: Seen a lot of investment in projects that are deployed while not fully optimized / explored

Steven: Feel OK with focus on active editor #, but it must be a conscious decision

Sue: agree it's a hack

Erik: Howie & team did good work on narrowing focus on the part of the funnel where we can meaningfully gain new editors. (We get many new accounts every day who never edit, so let's start by trying to convert more of them.) But we're focused on the user experience that exists today, and incrementally adding to it (GettingStarted). How will the user experience for new editors change as a whole? For example we might have some components that are relatively fixed, e.g. a task assignment system, while we do a lot of experimentation around what kinds of task work best. Want to better understand the future scaffolding.

Terry: re what is needed in next 6 months (aside from new hires): There's a large opportunity cost for all we do, e.g. section edit link work. It's not that ACUX or PEF don't move the needle, but probably more value elsewhere. So we have to make very careful choices

Howie: we make a bet, are foregoing the unknown unknowns

Sue: Focus on numbers. Base for optimism on Guided Tours?

Howie: Sue and Erik OK with team thinking more about resources needed?

Sue & Erik: OK, of course :-)

Post-meeting notes and conclusions

Notes from Howie

Published verbatim from team email list, with permission

Everyone,

First of all, I wanted to say everyone did a great job at the E3 Quarterly Review. Erik and I (or both) will provide more specific feedback on the meeting itself, but at a high level, both Sue and Erik told me that they found it useful. Like I mentioned to some of you, I think the discussion we had on Tuesday was the type of discussion Sue has wanted to have for a long time [1]. The team should be proud of the work it's done. Though there is a significant amount of work in synthesizing data, interpreting it, drawing conclusions, writing the presentation, etc., the review itself is only the tip of the iceberg. It's built on top of all of the work the team has done over the past 3 months, e.g., feature development, analytics work (both event logging & metrics api), user testing, etc.

I want to give my take on what this means for the next quarter, drawing on the review (and specifically the results we've seen so far) and the thinking done during the quarterly planning session. As you can tell from the review discussion, both Sue and Erik want us to be more focused on the active editors target [2]. It’s a hard number to move, and we can do almost everything right and still not move it -- but it still needs to drive how we focus our work.

I'm currently working on revising the Foundation-level Annual Plan targets with Diederik and Erik. I'll also need to work with this team (Dario, Steven and others) to come up with E3-specific targets. But for now, I think we should assume that by end of June 2013, we will need to reverse the decline, which means contributing on the order of an additional low-thousand digit Active Editors per month. Again, this is a hard number to move, but we need to set our sights towards a goal.

Given that, I think we should look at focusing our work over the next quarter along the following goals:

Getting registered users from 0 to 1: Increase the number/percentage of registered users that make at least one edit (1+ ns0 in first 24 hours).
Getting editors from 1 to 5: Increase the number/percentage of 1+ edit users that make it to 5+ edits.

These goals would then align with the feature-level ideas we talked about during the quarterly planning session:

Getting from 0 to 1: Getting Started Page: Continued development on the Getting Started page, experimenting with different UX implementations, task types, etc.
Getting from 1 to 5: The Next Task: We discussed presenting the user with another task once the first task is completed, maybe through PEF, maybe through some other means. This is related to the Jack Herrick Flow concept. The problem right now is that after the user completes their first edit, we send them to a contributory dead end. The more clever ones appear to back-back-back to the Getting Started page, and they shouldn't need to do that.
When the user returns: We discussed creating a persistent space where users can find more tasks upon returning to WP.

I would suggest that all the feature ideas, both front-end and back-end, be viewed through the lens of those two goals. Maybe not exclusively viewed, but I do think as a sanity check, it would help if one of the evaluation criteria for a candidate feature is whether it helps meet the two goals (which includes our ability to measure). Specifically:

Productization of task generation: How do we scope the feature in a way so that in the near term, it helps move users from 0-1 and from 1-5?
Guiders: How important is Guiders when it comes to these goals? How can we scope the feature/implementation so that it helps move users from 0-1 and from 1-5? etc

Also, there are two other areas that come to mind:

Monitoring of these user groups for vandalism, and if things get out of hand (hopefully they won't) thinking of ways to counterbalance. Given the folks that are on this team, I don't think this will be a problem, but I did want to call it out because it will take time/energy.
Starting to think about what it would take to roll these features out to non-enwp projects. This could easily balloon, and is related to the productization of task generation, but I think a focused session on at least identifying the potential issues would be worthwhile.

We should talk about this IRL, but I wanted to lay out some of my thoughts for everyone to consider/discuss.

Howie

[1] Quick feedback -- both Sue and Erik did want to spend more time on connecting the work of the team to targets, but that's something we'll talk about more.
[2] I don't think this applies to just the E3 team, though you're the first team with the ability to reasonably start measuring progress against this target, imperfect as those measurements may be.

Other notes/thoughts: Even though we're using targets to guide our work, I think the team should continue to balance the short term and long term outlook. It appears as though the team has been walking this line appropriately, but I think Terry would have more insight/guidance into that question. Also, regarding the path forward, I'm not suggesting that we revisit the results of the Quarterly Planning. Rather, I’m suggesting a way by which we sharpen our thinking as we go through the quarter.