Wikimedia monthly activities meetings/Quarterly reviews/Editing, October 2015

Notes from the Quarterly Review meeting with the Wikimedia Foundation's Editing team, October 7, 2015, 9:00am - 10:30am PDT.

Please keep in mind that these minutes are mostly a rough paraphrase of what was said at the meeting, rather than a source of authoritative information. Consider referring to the presentation slides, blog posts, press releases and other official material.

Presentation slides from the meeting


Present (in the office): Neil Quinn, James Forrester, Rachel diCerbo, Leila Zia, Guillaume Paumier, Toby Negrin, Joel Aufrecht, Terry Gilbey, Luis Villa, Wes Moran, Jon Katz, Lila Tretikov, Elena Tonkovidova, Tilman Bayer (taking minutes), Trevor Parscal, Roan Kattouw, Danny Horn, Kevin Leduc, Nirzar Pangarkar; participating remotely: Subbu Sastry, Runa Bhattacharjee, Niklas Laxström, Mark Holmquist, Santhosh Thottingal, Tomasz Finc, Bartosz Dziewonski (until 9:28), Greg Grossmeier, Kartik Mistry, Geoff Brigham (until 9:23), Amir Aharoni, David Lynch, Ed Sanders (from 9:25), Katherine Maher (from 9:28), Prateek Saxena (from 9:29), Pau Giner (later); Couldn't join due to Hangouts attendee limits: Prateek (until 9:29), Ed (until 9:25), Pau (until later)

James opening the meeting

Collaboration

[slide 2]

Danny:
stopped Flow feature development
released opt-in feature
Terry: # of total Flow users now?
Trevor: a lot of such questions are answered in the appendix

[slide 3]

Danny:
Portuguese Wikibooks was most recent
Roan: Portuguese-specific UI translation issue (namespace conflict)
Agreed on a fix for that yesterday
Lila: so will be done this month? yes

[slide 4]

Danny:
other goals: OOjs.UI
as first team to use it, we encountered lots of problems that now other teams won't have ;)
Lila: are we running estimates on [server] capacity for Flow and VE
James: for VE have 2 clusters... more than enough for 100%
Roan: roughly the same situation for Flow, our use of Parsoid is even less

[slide 5]

Danny:
design of Echo popup: now two icons instead of one
in response to feedback, separated more/less urgent messages
Terry: catch performance bugs before deployment?
Roan: this one could only be detected once it had hit English Wikipedia; performance team has a Phabricator task for making it detectable earlier
--> Toby: let's discuss having alerts for this kind of perf regression (rather than having to monitor dashboard)
--> Terry: general takeaway: need to look at it earlier in the pipeline

Language Engineering

[slide 6]

Amir: ...
Lila: be conscious of the downstream effects of editing tasks
if we increase editing by 10% say, what does it do to increase workload [of patrollers etc.]
James: as part of our work to get more solid KPIs for Editing, started looking at editing workflow
demand / supply
e.g. making things easier in VE increases demand to patrol downstream
--> Lila: flagged to me at Wikimania: pressure on those community members [who do curation/patrolling] is going to increase
need to make it easier for them to patrol (like Dario's team's work on automated revision quality scoring)
but also, make it more fun
e.g. easy mobile patrol tool
James: looking at easy yes/no tool for apps already
Trevor: first we need to instrument it to give it a baseline, before making plans
work with community tech team since they're also working in that space

[slide 7]

Amir: get people to translate more
had steady growth
Lila: do we have early indicator for impact of article suggestions?
Amir: yes, see chart at the bottom of the slide
Leila: we already know that it works
rate of content creation increased (clearly), we should just do it if the decision is that we want to work in this direction.
Trevor: to clarify, we are doing it this quarter ;)
Toby: just because it works in email recommendations doesn't mean it'll work in CX interface; just something to be aware of.
Trevor: also, there may be fatigue
Lila: still over email, or is this an app?
Amir: right now, suggests featured articles from other langs
needs to be integrated with interface
Lila: if we find out it works better with email, could start asking people to opt in
Amir: We're also thinking of processes like task lists (campaigns); we have designs for that already.
custom processes for groups, wikiproject

[slide 8]

[slide 9]

Lila: who's the audience for the parallel corpora item?
Trevor: 3rd party involved in machine translation. We're giving back to them, maybe as part of agreements etc. Rosetta stone, so they can improve their own tools.
Amir: important to contribute to the ecosystem of machine translation
explored mobile options (on request by Lila)
Lila: about mobile: is any of the mobile stuff making it to the roadmap? are you working on it in the next quarter?
Amir: We want to understand who we're making this for first. Experienced WP editors? Global North/Global South? etc. We'll need to sit down with you (Lila) & management to determine the direction we want to go in.
Lila: agree we need to understand the users first, and build a light prototype first.
but want us to keep pushing on this
Trevor: also, integrate VE in CX

[slide 10]

CX server: had some issues last q (added by Runa: some gaps surfaced in cxserver and Services infrastructure workflow. We are rectifying this to align with Services infra in the same way other consumers like VE or Flow do)

Multimedia

[slide 11]

James: ability to upload to Commons directly from the VisualEditor window (and the wikitext editor as well)
Lila: is it drag and drop?
James: no, but if you open "insert media" there is now both search and upload options
but planned
Lila: awesome
if you detect a feature that could have negative downstream effect, do you roll it out to everyone or gradually?
want to have mental model
Trevor: this particular feature is virtually unnoticeable
so we can essentially dark-launch it, and then make it more visible
Guillaume: is it a separate workflow in terms of licensing etc?
James: yes, very much simplified (thanks to input from Legal), assumes simplest case (user owns image...) and refers user to more complicated standard workflow on Commons otherwise
Lila: performance measured, e.g. amount of code shipped to browser?
do we have goals on this?
CI system should flag it
Trevor: typically gatekeeping mechanism is that Ori says the number of requests should be fewer, and the payload should be less, and if any of those is going up you should have a really good reason
Ed: (demo of media upload feature)
for typing in caption, it segues directly into existing media dialog
James: can drag file into upload box, but not (yet) into article in normal VE view
Lila: am I able to drag and drop images within editor?
James, Ed: already possible
Lila: great. when did this roll out?
Trevor, Roan: a while ago
Ed: (demo of graph editing feature)
Lila: can I import e.g. Excel files?
James, Trevor: not yet, GSoC student (now under contract) working on such things
Lila: can I build a new visualization type (as community member)?
James: we use Vega 1, limited options (also security consideration), will upgrade to Vega 2
Trevor: so answer is, probably as member of the Vega community, not as Wikipedia editor
Lila: general question: how can we empower community (to add new formats) - we can't do all of it ourselves
Trevor: other directions where we could work on this: data format, currently JSON
James: one exciting opportunity to use it is e.g. articles about US places with population data
--> Lila: take this discussion offline, also about use of Wikibase etc.


Parsing

[slide 12]

Subbu: reduce nowiki insertions (where Parsoid thinks differently from human editors)

[slide 13]

parsing latency matters in some scenarios e.g. when opening an article from recent changes list in VE, or when the HTML <-> wikitext switching feature comes to VE.
sometimes really hard to correlate deployments with performance changes
Terry: if we fork that library, are we going to merge it back or will we have to maintain it?
Subbu: The current maintainer has other plans for performance improvements and so our fixes won't be accepted right now. So, we have to maintain it, but it's not a problem per se

[slide 14]

make sure Parsoid HTML can also be used for read views
did see improvement in % of pages rendering HTML from Parsoid perfectly identical to the PHP rendering
confident we are in a good place for moving to use it for read views
Terry: final goal?
James: serve (read) HTML pages directly from Parsoid, and not from PHP Parser output
Trevor: and this (= visual diffs) is one way we measure progress towards that long-term goal

[slide 15]

Subbu: April 2015 perf problem with a page that had lots of images; exploring batched requests to the MediaWiki API was one of the action items.
Lila: perf is important
James: HTML used for VE edits is cached usually
Trevor: we do not add a second to every edit
James: we did a while ago, but we fixed that ;)
Subbu: also long term goal: incremental parsing (inefficient and unnecessary to reparse the entire article whenever someone fixes a typo on a page)
Trevor: ... which will dramatically reduce overhead

[slide 16]

replace Tidy

VisualEditor

[slide 17]

James:
since 1 September, VE enabled for all new users on enwiki (after community discussion)
Lila: so they still see two tabs for wikitext/VE? yes
so 16% is the rate where it is typically plateauing? yes
Why do we think that is?
Trevor and others: be aware this is per edit, not per user
many of the most productive users are [used to wikitext and are] not using VE (which is OK)
so this percentage should not be seen as discouraging
Lila: we should track this number - the better VE becomes, the more this percentage will rise
Trevor: we will also see people who start with VE graduate [to become power users]
ability to switch from wikitext to VE will be useful for existing power users (e.g. do most of the editing in wikitext, but then want to use VE to edit a table or add a citation)
James: also want to call out that Portuguese WP (with high VE rate) is mostly edited from the Global South, in contrast to earlier concerns that VE would only be usable for Global North people with state-of-the-art computers
Lila: I would also guess that people who came online later might be less tolerant of outdated interfaces
(discussion about possible community effort for VE campaign on Dutch Wikipedia, coinciding with Erasmus award)
Subbu/Ed (comment in chat): https://phabricator.wikimedia.org/T51400 is a blocker to a deployment to nl.wikipedia.org
Rachel: Dutch Wikipedia has some technical constraints around templates, working on that

Lila: congratulations, this is what you worked on for years
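
A minimal sketch of the per-edit versus per-user distinction raised above, using made-up numbers. The user names, edit counts and function names are hypothetical and are not figures from the slides. It only shows how a few prolific wikitext editors can keep the per-edit VE share low even when most individual users edit with VE.

```python
def ve_share_per_edit(edits):
    """Fraction of all edits that were made with VE.

    `edits` is a list of (user, editor) tuples, editor being "ve" or "wikitext".
    """
    return sum(1 for _, editor in edits if editor == "ve") / len(edits)


def ve_share_per_user(edits):
    """Fraction of distinct users who made at least one VE edit."""
    users = {user for user, _ in edits}
    ve_users = {user for user, editor in edits if editor == "ve"}
    return len(ve_users) / len(users)


# Hypothetical data: one power user making 8 wikitext edits,
# two newcomers making 1 VE edit each.
edits = [("power_user", "wikitext")] * 8 + [("newcomer_1", "ve"), ("newcomer_2", "ve")]

per_edit = ve_share_per_edit(edits)   # 2 of 10 edits used VE
per_user = ve_share_per_user(edits)   # 2 of 3 users used VE
print(f"per edit: {per_edit:.0%}, per user: {per_user:.0%}")
```

With these invented inputs the per-edit share is far lower than the per-user share, which is why the per-edit percentage alone should not be read as discouraging.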

[slide 18]

James: (code quality)

questions? we can also go through the appendix slides

[slide 19]

Lila, Trevor, others: (discussion about how table format is confusing)
Neil: new editor activation rate rose because registration rate (denominator) dropped
looking to replace this
active editors bump was not sustained
but very active editors looking good recently
--> Lila: now that you track these numbers, look at how to influence them
we should also track all editors (1 edit)
James: total edit number kind of addresses that
Lila: should filter out bots in # of edits
--> Lila: I know we are all text-focused people, but this would be so much better with graphs
Trevor: just followed the (quarterly review deck) template
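
Neil's point about the denominator can be sketched with a small calculation. The numbers below are hypothetical and chosen only for illustration. The definition of activation rate as activated new editors divided by registrations is an assumption based on his remark that the registration rate is the denominator.

```python
def activation_rate(activated, registered):
    """Share of newly registered accounts that became active editors.

    Assumed definition: activated new editors / new registrations.
    """
    if registered <= 0:
        raise ValueError("registered must be positive")
    return activated / registered


# Hypothetical figures, not taken from the slides:
# the number of activated editors stays flat while registrations drop.
before = activation_rate(activated=1000, registered=10000)
after = activation_rate(activated=1000, registered=8000)

print(f"before: {before:.1%}, after: {after:.1%}")
```

The rate rises even though no additional editors were activated, which is the reason given for wanting to replace this metric.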

[slide 20]

--> Lila: # of people on Flow is interesting, but what we really want to measure is how useful it is
e.g. are users talking more? or something like that
Trevor: yes, but these metrics were more meant to capture adoption rate
Toby: the acq, activation, retention model is pretty generic, can be used here too [cf. https://meta.wikimedia.org/wiki/Research:Metrics_standardization ?]
Roan: James and I have plans for other possible metrics for notifications, perhaps next q
Lila: there's a lot of things I'd like us to experiment with in Flow

[slide 21]

Amir: since the beginning of FY, rate of new articles created with CX has been stable, around 1000
(discussion about low deletion rate)
Luis: how does number of articles created with CX relate to the overall number of new articles?
Amir: don't know precise ratios off-hand
James: also, on some small wikis CX might be bringing in a much more significant ratio of new articles
Pau: remember it's not widely visible as a beta feature, and there are constraints (e.g. the user needs to be able to read/write in two languages) so the % compared to /all/ created articles isn't an apples-to-apples comparison.

[slide 22]

James: general feedback?
Lila: format much better, focus of team has improved greatly
encourage you to focus on impact
--> Neil: many people tried to get in on hangout and couldn't due to attendee limit (including members of Editing, Language and Collaboration teams), should use BlueJeans next time