Wikimedia Foundation metrics and activities meetings/Quarterly reviews/Services/January 2015

From Meta, a Wikimedia project coordination wiki

The following are notes from the Quarterly Review meeting with the Wikimedia Foundation's Parsoid and Services teams, January 29, 2015, 11:30 - 12:00 PST.

Please keep in mind that these minutes are mostly a rough transcript of what was said at the meeting, rather than a source of authoritative information. Consider referring to the presentation slides, blog posts, press releases and other official material.

Combined quarterly review of:

  • Parsoid
  • Services (RESTBase, Auth service, OCG, Mathoid, Citoid etc)

Present (in the office): Roan Kattouw, Ori Livneh, Jared Zimmerman, Marko Obrovac, Rachel diCerbo, Andrew Otto, Ellery Wulczyn, Yuri Astrakhan, Trevor Parscal, Toby Negrin, Gabriel Wicke, Subbu Sastry, Erik Möller, Marc Ordinas i Llopis, James Forrester, Rob Lanphier, Bryan Davis, C. Scott Ananian, Tilman Bayer (taking minutes), Damon Sicore, Tomasz Finc, James Douglas, Arlo Breault; participating remotely: Greg Grossmeier, Chris Steipp

Agenda:
Team intro - 1-2 minutes
Parsoid - 10-12 mins
Services - 10-12 mins
Discussion - 5-10 mins

Presentation slides from the meeting

Welcome, team intro

Gabriel:
welcome

[slide 3]

Team:
James and Marko joined; previously the Services team had been just Gabriel

Parsoid

What we said + What we did

Subbu: ...

[slide 6]

main goal: work towards supporting VisualEditor; after that, Parsoid HTML views

[slide 7]

CSS customization: done, needs some additional work before deploy

[slide 8]

improved on <nowiki> issues, got a lot of community complaints there

[slide 9]

What we learned

[slide 11]

CSS customization of Cite extension not controversial among devs

Metrics and key accomplishments

[slide 13]

What's next

[slide 15]

[slide 16]

stable ids: stalled, but picking up again this quarter

[slide 17]

Asks

[slide 19]

stretched thin, need help from either other teams or new hires
did a lot of CSS work within team, but we are not experts
ErikM: ...
Damon: this about CSS rendering?
yes
Trevor: also needs to be familiar with how Wikipedia uses CSS
CScott: ...
Damon: OK, was wondering if this is about special engineering skillset
Gabriel, Roan: not rocket science
Ori: could farm this out to mobile team, they have lots of CSS experience
CScott: lot of false positives in visual regression tests; tweaking the CSS to get rid of these would help a lot
secondly, check our output is consistent with mobile
Damon: (question about long tail)
CScott: don't necessarily aim to reduce long tail to 0, but remove as much as possible that creates issues for testing
ErikM: map out the skillset you need, and Damon and I will look at it
Damon: designate VE blockers
does successful VE launch have Parsoid blockers? yes
Damon: are these designated clearly? yes. See VE Q3 and VE Q3-stretch / Q4 columns on https://phabricator.wikimedia.org/project/board/487/

Services

Gabriel:

What we said & what we did

[slide 22]

Ops told us hardware not yet ready for RESTBase v1 deploy in Q2
So we used that time for frontend improvements instead

[slide 23]

Tooling, infrastructure: long term thing
depend on RelEng and Ops
Damon: objective?
Gabriel: make it easy to deploy stuff
and e.g. that someone who develops new service knows what to do
right now, it's one-off, pinging people on IRC etc.
Toby: come up with...
CScott: have 4-5 deploys so far
RESTBase, Parsoid, Mathoid, Citoid
so now is a good time to [document/specific workflows]

[slide 25]

(Gabriel:)
"RESTBase is like Varnish, but with storage, and richer interaction with backend"
Damon: so it's like a cache?
Roan, Gabriel: yes, basically a cache that never expires
ErikM: what is actually going to become available in February?
Gabriel: content API (HTML + metadata for each current revision and those requested)
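
The "cache that never expires" description can be illustrated with a minimal sketch. This is only a toy model of the idea discussed above, not RESTBase's actual code or API: the class name, the `render` callback and the in-memory dictionary are all hypothetical stand-ins for a persistent store sitting in front of a backend renderer.

```python
from typing import Callable, Dict, Tuple


class ReadThroughStore:
    """Toy model: rendered HTML is stored per (title, revision) and never evicted."""

    def __init__(self, render: Callable[[str, int], str]) -> None:
        # `render` is a hypothetical stand-in for the backend (e.g. a parser service).
        self._render = render
        # A dict stands in for persistent storage; nothing is ever expired from it.
        self._stored: Dict[Tuple[str, int], str] = {}
        self.backend_calls = 0

    def get_html(self, title: str, revision: int) -> str:
        key = (title, revision)
        if key not in self._stored:
            # Miss: ask the backend once, then keep the result indefinitely.
            self.backend_calls += 1
            self._stored[key] = self._render(title, revision)
        return self._stored[key]


def fake_render(title: str, revision: int) -> str:
    """Placeholder renderer used only for this example."""
    return f"<html data-title='{title}' data-rev='{revision}'></html>"


store = ReadThroughStore(fake_render)
first = store.get_html("Example", 1)   # rendered by the backend
second = store.get_html("Example", 1)  # served from storage
```

A revision is immutable once saved, which is why a store keyed by revision does not need an expiry policy in this model.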

What we learned

[slide 27]

collaboration with Ops, Sean very interested, but has lots of other things on his hands
Dev Summit has cleared the air a bit re SOA
ErikM: consensus is basically about moving forward with new services, not necessarily about converting existing code

[slide 28]

(Gabriel:)
RESTBase as of now reduces compressed enwiki Parsoid HTML from 160G to 100G by storing data-parsoid on the server
can improve things further on template-heavy pages like Obama: from 3.5MB to current mobile HTML size - 950KB (https://phabricator.wikimedia.org/T78676)
microcontributions should be really fast (ideas at https://phabricator.wikimedia.org/T87556)
HTML rewrite needs for apps: e.g. they want to move infobox around
Demand from third party users
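
For a sense of scale, the size figures quoted above can be turned into relative reductions. This is a back-of-the-envelope sketch only; it assumes decimal units (3.5MB taken as 3500KB), which the notes do not specify.

```python
def reduction_percent(before: float, after: float) -> float:
    """Relative size reduction, in percent of the original size."""
    return (before - after) / before * 100


# Compressed enwiki Parsoid HTML: 160G -> 100G
enwiki_reduction = reduction_percent(160, 100)

# Template-heavy page: 3.5MB -> 950KB (assuming 1MB = 1000KB)
page_reduction = reduction_percent(3500, 950)

print(f"enwiki storage: {enwiki_reduction:.1f}% smaller")
print(f"template-heavy page: {page_reduction:.1f}% smaller")
```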

Metrics & other key accomplishments

[slide 30]

RESTBase latency
Toby: this is out of cache probably?
Damon: this looks pretty good

[slide 31]

Gabriel: test coverage
Damon: I like that

What's next

(Gabriel:)

[slide 33]

[slide 34]

pretty much ready for mid-Feb deploy, waiting for hardware
section editing can also speed up VE by cutting down save POST to edited section(s) only & serializing only that section instead of entire DOM
Ori: it's still a bit further away, need to address other tasks first
Roan: (wary about implementing new methods)
Gabriel: section editing *API* has dual benefit of enabling micro-contribution experimentation on mobile & faster editing in VE
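
The section-editing idea discussed above (send and serialize only the edited section instead of the whole page) can be sketched roughly as follows. This is an illustration under strong simplifying assumptions: the page is modelled as a flat list of HTML strings rather than a DOM, and `SectionEdit`, `full_page_payload` and `apply_section_edit` are hypothetical names, not part of any existing API.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SectionEdit:
    """What a client would send when saving a single section (hypothetical shape)."""
    index: int
    html: str


def full_page_payload(sections: List[str]) -> str:
    """Whole-page save: every section travels in the request body."""
    return "".join(sections)


def apply_section_edit(sections: List[str], edit: SectionEdit) -> List[str]:
    """Replace only the edited section; all other sections are reused unchanged."""
    if not 0 <= edit.index < len(sections):
        raise IndexError(f"no section {edit.index}")
    updated = list(sections)
    updated[edit.index] = edit.html
    return updated


page = [
    "<section>Intro</section>",
    "<section>History " + "x" * 1000 + "</section>",
    "<section>References</section>",
]
edit = SectionEdit(index=0, html="<section>Intro, revised</section>")
new_page = apply_section_edit(page, edit)

# The section payload is much smaller than re-sending the whole page.
print(len(edit.html), "bytes vs", len(full_page_payload(new_page)), "bytes")
```

In this model the untouched sections never leave the server, which is where both the smaller save POST and the cheaper serialization would come from.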

Asks

(Gabriel:)
need more bridging between Ops and dev. Had to turn candidates with strong DevOps skills away
Ori: one takeaway from this quarter: should not discount internal architecture expertise, see example of (Cassandra?) RfC re databases
Gabriel: disagree about that example
Ori: more involvement of Ops/architecture input from beginning would mean less blockage now. There is an NIH syndrome, rejecting existing solutions
Gabriel: actually Ops people like Sean agree with [move to Cassandra], just don't have time
Ori: fear we will have same issue with monitoring
ErikM: don't entirely agree with Ori ...
(more general discussion about architecture leadership structure)
ErikM: distribution..?
Greg: need consistent release (learn our lessons from MW releases, the more different they are from what WMF uses, the less they are supported), e.g. does releasing by images make sense? This is tied in with how we do deployments for services (containers?); complicated
ErikM: (question about VE blocker re template editing)
JamesF, Subbu: not a blocker
Roan, Erik: ...
Ori: as the person responsible for VE performance, I am willing to rely on RESTBase being available, if you are confident. But could also look into alternatives
ErikM: by end of Feb, all revs available?
Gabriel: not all in storage right away (old ones will be generated on demand), but all since the turn-on date
have resources, 6TB replicated storage provisioned (18TB total, 3-way replication)
Roan: on Labs currently?
Gabriel: actually already in production, but only on three 250G boxes
Roan: cool, then I propose to start VE testing right now
Ori: worry about the Services team missing the deadline through no fault of its own, due to external factors
Damon: that's a normal situation actually
Gabriel: in worst case you can start testing using the current test boxes
Ori: OK
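
As a footnote to the storage figures Gabriel quotes above, the relationship between raw and usable capacity under replication is simple division. The sketch below is illustrative only; applying the same 3-way replication to the three 250G boxes is an assumption, since the notes do not state their replication factor.

```python
def usable_capacity(raw_total: float, replication_factor: int) -> float:
    """Usable capacity when every piece of data is stored `replication_factor` times."""
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    return raw_total / replication_factor


# Provisioned cluster: 18TB raw with 3-way replication
provisioned_tb = usable_capacity(18, 3)

# Current boxes: three 250G machines (3-way replication assumed here)
current_gb = usable_capacity(3 * 250, 3)

print(f"provisioned: {provisioned_tb:g}TB usable")
print(f"current boxes: {current_gb:g}G usable (assumed replication)")
```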