Wikimedia monthly activities meetings/Quarterly reviews/Editing/January 2015

The following are notes from the Quarterly Review meeting with the Wikimedia Foundation's Editing team, January 30, 2015, 11:00 - 12:30 PST.
Please keep in mind that these minutes are mostly a rough transcript of what was said at the meeting, rather than a source of authoritative information. Consider referring to the presentation slides, blog posts, press releases and other official material.

Present (in the office): Ori Livneh, Erik Moeller, Trevor Parscal, Rummana Yasmeen, Timo Tijhof, Jared Zimmerman, Gabriel Wicke, Terence Gilbey, Erica Litrenta, Rachel diCerbo, Abbey Ripstra, Ed Sanders, Lila Tretikov, Roan Kattouw, Tilman Bayer (taking minutes), James Forrester, Tomasz Finc, Alex Monk, Moriel Schottlender, Rob Moen, Damon Sicore, Rob Lanphier, Geoff Brigham, Daisy Chen, Kaity Hammerstein, Arlo Breault, Subbu Sastry, Elena Tonkovidova, Marko Obrovac; participating remotely: Kevin Leduc, Sherry Snyder, Marielle Volz, Dan Andreescu

Welcome, agenda, team intro

James: welcome

[slide 2]

Agenda

[slide 3]

Team intro
Erik: This includes Community Liaisons, designers, quality assurance, etc.
and some part-time

What we said

[slide 5]

(James:)
Last quarter, VE was not a quarterly priority for WMF as a whole; instead we supported these two other areas

[slide 6]

auto-filled citations a very big usability improvement
table editing was long-time request, experienced editors often identified it as blocker for wider deployment
language support: problems with input methods

[slide 7]

editing perf, front-end improvements
optimistic saving (by Ori etc), now reimplemented for wikitext editor

What we did

[slide 9]

mixed picture
language support: green in the sense that it got better, but ultimate goal is still quite a bit away
ACTION: Lila: in the future, please set clear attainable goals for that timeframe
Toby, Erik: (on clearer goals in Product in general)

[slide 10]

(JamesF:)
-20% load time is neither good nor bad
Lila: again, should specify goal

What we learned

[slide 12]

(James:)
got very positive reviews from users, which hadn't happened (in that way) before
Lila: we should highlight these things when we get them. separately from measurements
Erik: how much design research went into table editing?
JamesF: quite a lot, e.g. we compared many existing table editors
Lila: when you get a win, should think about how to highlight that

Metrics & other key accomplishments

[slide 14]

Performance
Erik: these numbers are from controlled environment
James: without Internet speed etc.
Lila: this is client-side load time?
Roan: yes
Ori: even more granular, CPU load, i.e. controlling for network latency
because that's not within control of VE team
goal is to monitor perf regressions
Lila: overall user speed is not within control of team, but team can optimize to mitigate latency issues
Lila: Why did it get slower in some cases?
James: e.g. adding table editing meant increasing amount of processing that is done
Lila: why not load incrementally?
Ori: not done yet, one of the biggest todo items on roadmap
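
The goal of monitoring perf regressions from controlled benchmark runs, as discussed for this slide, could be sketched roughly as follows. This is a minimal illustration, not the team's actual tooling: the function name `has_regressed`, the 10% tolerance, and all timing values are hypothetical.

```python
from statistics import median


def has_regressed(baseline_ms, current_ms, tolerance=0.10):
    """Return True if the median of the current benchmark runs is more
    than `tolerance` (as a fraction) slower than the baseline median.

    Both arguments are lists of load times in milliseconds, assumed to
    come from a controlled environment (no network latency involved).
    """
    base = median(baseline_ms)
    cur = median(current_ms)
    return cur > base * (1 + tolerance)


# Hypothetical timings, for illustration only.
baseline = [410, 395, 402, 420, 398]
current = [455, 470, 449, 462, 458]

print(has_regressed(baseline, current))
```

Comparing medians rather than means keeps a single slow run from triggering a false alarm; the tolerance would need tuning against the real noise level of the benchmark.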

[slide 15]

(James:)
3.5s median save time vs. 2.5s for wikitext editor
Lila: this is great

[slide 16]

Erik: rationale for optimizing for load time is that it is what users notice first
James: to some extent it's not completely fair, because wikitext editor [loads less for empty page]
Damon: ...
Ori: it's VE initializing, from clicking "edit" until when VE is ready to receive input
Lila: should bucket (typical North American vs. Indian network connection, or such)
James: yes, can do synthetic benchmarks like that
HHVM improvements for wikitext users actually impacted these charts too
Ori: sorry ;)
Damon, James: (discussion about meaning of these percentiles)
Lila: should think about showing time indicator during load, or showing some useful tips
Ori: problem: text is blurred, i.e. the one thing that could hold your attention is decreased in visibility
perhaps instead show instructions?
Ed: could guess progress and display the guess
Lila: could optimize via ...?
Roan: (depending on Parsoid, RESTbase)
Ori: could guess when user is likely to edit (e.g. view from watchlist, ..) and preload in such situations
James: blue line refers to people who clicked "edit" but didn't get further
Lila: so abort rate is about 8%? yes
should work on this
Roan: have some minimal data on how they abort
James: ...e.g. by clicking back button
Lila: how many people start typing?
James: bounce rate is around 40%
green and yellow are other end of funnel
difference: attempted save that did not result in actual save, e.g. when triggering captcha for inserted external links, or edit conflicts, or some of the abuse filters
Ori: also due to network latency.. and so people give up after pressing save?
JamesF, Roan: insignificant
Lila: so we need yellow and green to be one line?
James, Erik: except e.g. abusefilter, spamblacklist, …
James: heard from Wikia their VE has similar numbers
Roan: I thought our save numbers for wikitext were around 25%?
James: about that, but those numbers are from a few years ago
legal/social issues about incremental saving: currently authorize publication only when pressing save button
Ori: will need support from all of engineering for this
James, Lila: (discussion about difference between ptwiki and plwiki regarding attempted save rate vs save rate, perhaps due to link spammers)
Lila: can we try the new graphic CAPTCHAs and see if vandalism rate goes down?
Erik: readability of our captchas is about as good as the rest of the industry
Roan: have some architecture issues about captchas
for failed captchas, have to resend whole data again
Toby: why do we use captchas?
Ori: link spam
Lila: do we have a learning algorithm for these?
James: no, simple rule based system
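
The funnel figures discussed for this slide (abort rate after clicking "edit", attempted saves versus actual saves) could be derived from per-stage session counts along these lines. This is a rough sketch under stated assumptions: the stage names, the `FunnelCounts` type, and the counts themselves are hypothetical and are not taken from the slide or from the actual instrumentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FunnelCounts:
    """Hypothetical per-period session counts for an edit funnel."""
    init: int            # sessions where the user clicked "edit"
    ready: int           # sessions where the editor became ready for input
    save_attempts: int   # sessions where the user pressed "save"
    save_successes: int  # sessions where the save actually went through


def funnel_rates(c: FunnelCounts) -> dict[str, float]:
    """Compute funnel rates as fractions of sessions."""
    if c.init <= 0:
        raise ValueError("init must be positive")
    failed = c.save_attempts - c.save_successes
    return {
        # Clicked "edit" but never reached a ready editor.
        "abort_rate": (c.init - c.ready) / c.init,
        # Share of all edit sessions that pressed "save".
        "save_attempt_rate": c.save_attempts / c.init,
        # Share of all edit sessions that ended in a saved edit.
        "save_success_rate": c.save_successes / c.init,
        # Share of attempted saves that did not result in a save
        # (e.g. captcha, edit conflict, abuse filter).
        "failed_save_share": failed / c.save_attempts if c.save_attempts else 0.0,
    }


# Made-up counts, chosen only so that the abort rate comes out at 8%.
counts = FunnelCounts(init=1000, ready=920, save_attempts=300, save_successes=250)
rates = funnel_rates(counts)
for name, value in rates.items():
    print(f"{name}: {value:.1%}")
```

Keeping attempted and successful saves as separate counts makes the gap between the two lines directly measurable, so it can be compared across wikis.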

[slide 17]

Lila: so bug fix rate is 20-30/week? yes
Lila: incoming bug rate?
James, Erik: higher, but includes feature requests
James: probably 80:20 bugs:features [?]
But focus on bugs means that the backlog is mostly features and enhancement requests
Lila: how much of that is parity features? [regarding wikitext editor]
James: maybe a third, and a third totally new ideas, and one third things that make sense in wikitext but not in VisualEditor (and so we need to find different ways to do them)
Lila: first third is most important, the others can be triaged
Damon: those [first third] will go into the blockers list we are compiling
Lila: need to watch that you don't accumulate debt on bugs
Erik: (on stats in the slide:) important that we get a lot of edits now, i.e. real life usage

What's next

[slide 19]

James: this is a tough deadline
Lila: should never think of this kind of deadline as "turn it on now"
instead phases (prompt users to try out first, then..), incremental rollout
Trevor: intentionally not set goal as "deployed everywhere"
instead, focusing on getting the product ready for deploy
Lila: with complexity/magnitude of our community, one size fits all approach is not feasible
Trevor: goal is to have it shown by default to new users (defined as new account creations)
Lila: can we run test on percentage of them?
Trevor: will rely on the Community Dept to design the metrics, but we have ideas about how it could work
James: confident that mobile performance has improved
Erik: did we test switching phones over to VE?
James: not yet, tablets only
Ori: This will be a *huge* engineering effort to make this work on mobile
Kaity: also, UX needs work
Roan: there are browser issues
ACTION: Lila: want to see completion rates VE vs. wikitext on mobile [?]
James: roll out to some smaller wikis first (after some small needed fixes)
Toby: clarify: launch means desktop
Lila: OK :/ should be gathering numbers on both though
Erik: have to start somewhere
Tomasz, Lila: start discovery work now though

(James:)
Planned process adjustments:
process so far has been reactive in terms of user engagement
"file a bug and we will respond"
Damon suggested to switch to holding weekly triage meeting with community
Lila: advertise how?
James: village pumps, mailing lists
Lila: suggest standard paragraph for invite, that people can reuse
multilingual outreach is important
Damon: this is not so much about new ideas, but about bugfixes and what is critical to fix
Lila: set a weekly schedule, expectation on timing
Erik: also pulling in engineering community team, to involve tech community too
Damon: shows VisualEditor Q3 blockers list on Phabricator
"unbreak now" basically means someone is working on it right now
Roan: e.g. "use parsoid html for read views too" has been worked on for months
Erik: it's an epic
Lila: in a list like this, expect items to be of comparable size (not epics mixed in with smaller tasks)
otherwise can't parse it (as management, community, interested public)
as product manager, should break them up
Damon: I'm actually pretty pleased with this result of our first meeting
Lila: agreed
Toby: share concern about sizing
Trevor: already talked and made plans about sizing, granularity
Damon: always adjust processes for team, not saying it needs to go fully Agile
Lila: any process is about outcome [not an end in itself], need to find what works for you
doesn't matter whether using points or other system
Erik: this is an opportunity to look at process toolkit
Trevor: not the first time that we successfully changed process for team
Toby: concern about load on James as team has grown
Roan: there is [going to be] a JD on release management...
Trevor: James has scaled very well, but there are limits
RobLa: is this more scrummaster?
Erik: recall all the non-VE stuff in earlier slides - these things were handled by James too

What's next

[slide 21]

James:
(draft for) shippable criteria
Lila: need to sharpen some of these
Damon: how do I know if a bug blocks a release? (acts out mock example)
James: (ask questions like "is it a security bug" etc.)
"data corruption" also encompasses cases where VE does (saves) differently from what user expected
Damon: ok, ...
Erik: small UX annoyances are difficult to classify in this way
link editor/inspector worries me most
comes up often as example where users get confused
Trevor: ...
Kaity: links are one of the most common tasks for new users
Damon: are there bugs for this?
Kaity: there are some Phabricator items, not necessarily bugs
ACTION: Damon: could we get bugs filed?
James: languages - stretch, IMEs
Korean/Japanese deployment is not the goal for this quarter
Roan: many Chrome bugs preventing this
Damon: it's about quality, not speed of shipping
Trevor: link editor works already, we would rebuild it just for these UX issues
James: (Demos citation autofill)
Ori: this is awesome
Trevor: this was a lot of work technically
James: first as beta feature opt-in

Asks

[slide 23]

James: CL need might conflict with other teams
June 2013 pre-launch A/B test on enwiki was flawed
(Lila etc.: discussion about need for controlled test)
James: cleanup after bad VE edits
Erik: might have Guillaume lead task force on this
last time saw community complain about things we can't do anything about
(e.g. newbies making mistakes they would have made on wikitext editor too)
so must keep scope clearly defined
Lila: ...
Toby: (question to assess effort for Analytics team about this)
Erik: carrying out these tests is stretch goal for this q
(James:)
Parsoid is key dependency, especially for perf improvement
Lila: (question to Damon about prioritization)
Gabriel: want to reduce HTML size drastically
Lila: hear you need proj management support for this group and for external dependencies
maybe from Arthur's team
no time for hiring new people for this
Erik: can be done by one person, but yes, it's two areas
Damon: resource allocation will be enforced when we go into triages
Toby: will be difficult for TPG to allocate resources for this
Erik: yes, will need to talk