Wikimedia monthly activities meetings/Quarterly reviews/Reading and Community Tech, January 2016

Notes from the Quarterly Review meeting with the Wikimedia Foundation's Reading and Community Tech teams, January 20, 2016 10:30 - 11:30 AM PST.

Please keep in mind that these minutes are mostly a rough paraphrase of what was said at the meeting, rather than a source of authoritative information. Consider referring to the presentation slides, blog posts, press releases and other official material.

Attendees: Toby Negrin, Tilman Bayer, Leila Zia, Guillaume Paumier, Anne Gomez, Kevin Leduc, Bryan Davis, Stephen Niedzielski, Danny Horn, Zhou Zhou, Nirzar Pangarkar, Adam Baso, Ryan Kaldari, Niharika Kohli, Quim Gil, Joshua Minor, Roan Kattouw, Lila Tretikov, Katherine Maher, Michael Holloway, Volker Eckl, Dario Taraborelli, Trevor Parscal, Nuria Ruiz, Lisa Gruwell

Reading

Slide 1

Lila: the +1.3% is from Q1 to Q2?

Toby: Yes. It's nice to see this going in right direction, generally pleased

Pleased to see collaboration this quarter, especially with people outside the department. None of the reds were trainwrecks.

Lila: A lot of people ask me about the reds and worry about getting a red. If you don't have red, you're not learning and not stretching yourself. When you define your goals at the beginning of the quarter, we assume as leadership team that people /teams will need to make changes/adjustments, will have learning for next Q.

Toby: In many places, we pick quality over time in the iron triangle. (<https://en.wikipedia.org/wiki/Project_management_triangle>) Will be assertive about stopping discussion on stats this meeting, if we need another mtg to discuss in detail we can do that.

3 teams: Reading, Community tech, UX standardization. UXS will report next quarter. This will reflect headcount: 75% reading, 25% ??

Slide 2

Adam: PO for Web. Link preview on Android: good feedback overall, though some people complained that it was a hindrance.

We're not ready to actually do this. Metrics may not be right to measure engagement. May revisit in the future, when doing hovercards.

Slide 3

Adam: That was downside, the upside: managed to roll out the Read more feature on mobile web beta and desktop web beta. Desktop web beta wasn't received very well; visual treatment was weird. Not surprising, desktop is different layout from mobile web.

When people saw read more panels on mobile web, they clicked through quite a bit...

Goal: to ship this later this quarter.

Toby: Want to touch on the difficulties of bolt-on design on desktop. Nirzar and the designers have done a great job, but we need more thinking about how to [design features between mobile/desktop interfaces]

Lila: We'll discuss more offline

Lila: are we measuring the total session length?

Adam: We need to look at some of the last access data

Slide 4

Adam: Last quarter we discussed API-driven prototype to modernize content loading. Ultimately in the service of loading content faster for users. Did a prototype...

Slide 5

[video - Joaquin loading the Barack Obama article on a 2G connection in Spain]

It takes quite a while for content to load. Browser is locking up. This is the problem we're trying to solve. Discussed this at dev summit, got some feedback. Not going to use single page app approach, going instead to take incremental approach, lazy loading, do some hacks around how we load images.

plan is to address this in mobile web beta this quarter
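
The incremental approach described above can be illustrated with a small model of scroll-driven lazy loading. This is a minimal sketch and not the team's implementation: the `Image` record, the `images_to_load` helper and the 600-pixel prefetch margin are all hypothetical, and a real page would do this in browser JavaScript rather than Python.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Image:
    """Hypothetical record for an image placeholder on a long article page."""
    src: str
    top: int      # vertical offset from the top of the page, in pixels
    height: int   # rendered height, in pixels


def images_to_load(images, scroll_y, viewport_height, margin=600):
    """Return the images that intersect the viewport extended by `margin` pixels.

    Images further away stay as placeholders until the reader scrolls
    close to them, which is the core idea of lazy loading.
    """
    window_top = scroll_y - margin
    window_bottom = scroll_y + viewport_height + margin
    return [
        img for img in images
        if img.top + img.height >= window_top and img.top <= window_bottom
    ]


# Hypothetical page with three images spread over a long article.
page = [
    Image("lead.jpg", top=100, height=300),
    Image("middle.jpg", top=4000, height=300),
    Image("footer.jpg", top=9000, height=300),
]

# At the top of the page only the lead image is in range.
initial = images_to_load(page, scroll_y=0, viewport_height=640)

# After scrolling down, the middle image comes into range instead.
after_scroll = images_to_load(page, scroll_y=3500, viewport_height=640)
```

The margin trades bandwidth for smoothness: a larger value fetches images earlier so they are ready when they scroll into view, a smaller one saves data on slow connections.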

Lila: Why did you decide not to go with SPA?

Adam: Feedback was that going with the SPA was a bit risky, let's try something a bit simpler, and if we want to implement SPA in the future we still have that option.

Lila: Do you know that the new (distributed) wiki that Ward is building is an SPA?

Toby: Two issues, ...Had envisioned single service layer (next-generation), but ...

If users want to browse our content using the apps, might make sense to take that architectural step.

Dario: I started a discussion with Ori about performance. There might be some interesting research in that area.

Slide 6

Adam: side-by-side comparison with prototype on Labs host (lower-power virtual machine)

Lila: playing with chrome and all that kind of stuff?

Adam: goal is to have look and feel very similar to the site but do it like a SPA.

Toby: We can make this responsive.

Lila: Do you know the performance improvements between the two?

Adam: Barack Obama page: 60 seconds for 1st paint; right side: ~5 secs [Note: SPA available at http://reading-web-research.wmflabs.org/]

Lila: What's your yardstick with the target?

Adam: have to check exact #s but 90th percentile to achieve first paint in 5s or less. Using more lazy loading, use scroll behavior as an indicator of when to load things
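
A target phrased as "90th percentile first paint in 5s or less" can be checked against a set of timings roughly as follows. This is a sketch under stated assumptions: the nearest-rank percentile definition is one common choice and not necessarily the one the team uses, the function names are hypothetical, and the sample timings are invented for illustration.

```python
import math


def percentile_nearest_rank(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    `pct` percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


def meets_first_paint_target(first_paint_seconds, target_seconds=5.0, pct=90):
    """True if the `pct`-th percentile first paint is within the target."""
    return percentile_nearest_rank(first_paint_seconds, pct) <= target_seconds


# Hypothetical first-paint timings in seconds, not measured data.
samples = [1.2, 1.8, 2.1, 2.4, 2.9, 3.3, 3.8, 4.2, 4.9, 12.0]
p90 = percentile_nearest_rank(samples, 90)
ok = meets_first_paint_target(samples)
```

With a percentile target, a single very slow outlier (the 12-second sample here) does not fail the check on its own, whereas a target on the maximum would.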

Lila: it's going to be an incredible achievement, but if you look at industry practices, ~1 sec is okay, less than that is better. blink of an eye, <200ms is best of class

Slide 7

Josh: Goal last Quarter was to release major overhaul, was a pretty aggressive goal given scope of changes and that I was just starting. We missed the goal but we made it into the public beta channel, finished vast majority of feature work planned.

Over 1,100 in beta test right now (>40 responses on email as of this morning). Most feedback very positive on visual design, feed concept. Some concerns on flow between feeds, lists, articles; working on feedback from that.

Prioritized quality over schedule. Reviews are really important, have to focus on feedback.

Toby: Working with Katherine's team on a launch strategy. Going with other markets (Canada / NZ) vs going with the US

Stephen (chat): on paper, a miss. on the phone, a huge win! really great work

Slide 8

Josh: Product goal is to increase user retention, we are able to get users to app but not keep them. Focus to prioritize discovery, casual reading. Use feed mechanism to do that, concept called a hook, dealing with making habit-forming products. Reason for someone to reopen the app and discover something. Reason to come back. Not just look like a web browser packaged in an app.

Lila: Are you seeing improvements in numbers? I'm really excited to hear the way you talk about it.

Josh: Yeah, I find myself looking at it voluntarily.

Josh: It's in beta.

Feedback has been pretty positive

Slide 9

Toby: explicit new user experience

Lila: Who's the designer on this?

Josh: Nirzar is primary/lead designer though Kaity has contributed significantly.

Toby: Can we talk about trending articles? One cool thing that shows the power of services and the ability of our team to move quickly is: Josh and I were using the feed, it was done locally based on browsing habits, and we were running out of content, it was getting stale. Talking about ways to find more stuff, knew about Pageview API ___ team built, Monte wired it up over the weekend and now you can find out what's going on all over the world, it's super cool.
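
As an illustration of how a client might ask for trending articles, the sketch below builds a URL for the "top" endpoint of the Wikimedia Pageview API. The notes do not say which endpoint the iOS feed actually used, so that choice is an assumption, as are the helper name and the example date. The block only constructs the URL and makes no network request.

```python
from datetime import date

# Base path of the Pageview API "top articles" endpoint.
BASE = "https://wikimedia.org/api/rest_v1/metrics/pageviews/top"


def top_articles_url(project, day, access="all-access"):
    """Build the URL listing the most-viewed articles of `project` on `day`."""
    return f"{BASE}/{project}/{access}/{day.year}/{day.month:02d}/{day.day:02d}"


# Hypothetical example: most-viewed English Wikipedia articles for one day.
url = top_articles_url("en.wikipedia", date(2016, 1, 19))
```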

Wes: Adding the video to the iOS app page might be a good idea

Lila: What Leila was doing for translation recommendations. Are you guys thinking about registering interests, like if I go down a rabbit hole on something, take me to the subject area?

Josh: The first thing we're looking at is what type of cards people are looking at. Later need to figure out whether to provide this as service layer, or locally on device.

Lila: Possibly social connection

Josh: If one person looks like another person, show them similar topics. But since we respect privacy, don't want to profile users, have to be careful about this approach.

Toby: Josh has used local machine learning algorithms. interesting take on privacy. Might be way of respecting our users privacy while still being able to make recommendations.

Slide 10

Dmitry: PO for Android

Dmitry: Goal for Android app Q2 more of a soft goal w/o specific #s. Goal is to enrich the presentation and UX in the app with more types of content. Strengthen user engagement and retention.

Integrated new maps service in nearby screen, increased engagement as well as maps traffic and hits on the maps tile server.

integrated with Wiktionary: highlight any word and get definition from Wiktionary in streamlined way, made sure to support newer Android version (6)

Contributed to app being featured in Best Apps of 2015 list, a promotion in the app store, led to sizable number of new users.

Pretty aggressive with regard to following latest design guidelines

App is now fully integrated with RESTBase and the mobile content service, though this is a beta feature only for now.

Slide 11

Dmitry:

Lila: Which particular feature drove this increase, is it Nearby?

Dmitry: Yes

Lila: Is it editable? Can people add annotations to the map?

Dmitry: Not yet

Toby: We want to start thinking about contribution features in a mobile-appropriate way. Started to talk to Trevor's team

Lila: We'll have to be very careful but it's really important.

Toby: The apps are both really starting to leverage services and you can see that in how quickly they are able to ship new features. modern architecture, gratifying to see

Lila: What is the shipping cycle?

Dmitry: Different for Android and iOS. Android is ~once a month for a new production release. Depends on the major feature we're currently working on, we also do some maintenance releases in between.

Josh: for iOS we're kind of in a pause because our revision is so large. Goal is at least once every 6 months.

Adam: for mobile web: technically it gets deployed every week

Slide 12

Bryan Davis: One of our targets for Q2 was to increase visibility of measurements of MediaWiki API usage. This is missed at this point. The pipeline that the Discovery team built to measure data and get it into Hadoop had some issues. But we do have some results (see Appendix).

Really big news is that action API handles 450 million req/day. Nearly twice the page views handled by English Wikipedia.
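
For scale, the daily figure can be converted into an average per-second rate. This is plain arithmetic on the number quoted above; it is an average only, and the notes give no information about peak load.

```python
REQUESTS_PER_DAY = 450_000_000   # figure quoted in the meeting
SECONDS_PER_DAY = 24 * 60 * 60   # 86,400

average_rps = REQUESTS_PER_DAY / SECONDS_PER_DAY
print(f"{average_rps:,.0f} requests per second on average")
# prints "5,208 requests per second on average"
```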

GOTO slide 31

Slide 13

Bryan: Another objective was consider whether to encourage people to migrate API traffic to OAuth. If migration desirable, whether to present some notifications.

Turns out mass migration to OAuth not advisable at this time.

OAuth not really suitable for some API consumers
Authenticated access reduces what we can do for caching on the front-end

Right now no automatic caching process in the API; clients can specify that in the request ("it's ok if the data you give is 5 minutes old")
Purging expired content -- API requests shaped v. differently based on which client asking for which pieces of data, kind of batch oriented. Our current Varnish purging isn't sufficient to clear out content that we know has become stale.
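
The client-specified freshness mentioned above ("it's ok if the data you give is 5 minutes old") can be sketched with the Action API's `maxage` and `smaxage` parameters, which set the corresponding `Cache-Control` values on the response. Whether these are the exact parameters Bryan had in mind is an assumption, and the helper name is hypothetical. The block only builds a URL and does not contact the API.

```python
from urllib.parse import urlencode


def cacheable_query_url(api_url, params, max_age_seconds=300):
    """Build an Action API URL that tells caches a stale response is acceptable.

    `maxage` applies to the client's own cache, `smaxage` to shared caches
    such as a front-end proxy.
    """
    query = dict(params)
    query["format"] = "json"
    query["maxage"] = max_age_seconds
    query["smaxage"] = max_age_seconds
    return f"{api_url}?{urlencode(query)}"


# Example: page info that may be up to 5 minutes (300 seconds) old.
url = cacheable_query_url(
    "https://en.wikipedia.org/w/api.php",
    {"action": "query", "prop": "info", "titles": "Barack Obama"},
)
```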

Ops is looking at some updates for the Varnish servers that may allow us to tag individual responses with multiple articles it's related to. "Anything related to article X, drop it from cache."
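
The tagging idea ("anything related to article X, drop it from cache") can be modelled with a toy in-memory cache. This is a conceptual sketch only: the `TaggedCache` class and its methods are hypothetical and say nothing about how Varnish implements or would implement such a feature.

```python
from collections import defaultdict


class TaggedCache:
    """Toy cache where each response is tagged with the articles it relates to."""

    def __init__(self):
        self._entries = {}                        # cache key -> response
        self._keys_by_article = defaultdict(set)  # article title -> cache keys

    def put(self, key, response, articles):
        """Store a response and index it under every related article."""
        self._entries[key] = response
        for title in articles:
            self._keys_by_article[title].add(key)

    def get(self, key):
        return self._entries.get(key)

    def purge_article(self, title):
        """Drop every cached response tagged with `title`; return the count."""
        dropped = 0
        for key in self._keys_by_article.pop(title, set()):
            if key in self._entries:
                del self._entries[key]
                dropped += 1
        return dropped


cache = TaggedCache()
# A batch-style API response can relate to several articles at once.
cache.put("batch-1", {"pages": 2}, articles=["Barack Obama", "Spain"])
cache.put("batch-2", {"pages": 1}, articles=["Spain"])
cache.put("batch-3", {"pages": 1}, articles=["Wikipedia"])

# An edit to "Spain" invalidates both responses that mention it.
dropped = cache.purge_article("Spain")
```

Indexing by article is what makes batch-shaped responses purgeable: the purge no longer needs to know the request URL, only the title that changed.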

Slide 14

Bryan: Our big-ticket item for the quarter was to get AuthManager released to the cluster. It is a project to revamp the authentication structure of MediaWiki, ongoing since Q3 last year.

We made a lot of progress towards it in Q2, but CR turnover, 1.5 FTEs staffing made us not quite make it.

Made a lot of progress, several components released (OAuth, session management feature, bot passwords feature for bot operators that for one reason or another can't use OAuth).

We're going to continue on this during Q3. We're optimistic we'll be fully live in production by late February.

One thing that will help, the security team has decided to make two-factor authentication a reality for (mediawiki projects?) Maybe more motivated to help us with code review.

Community tech

Slide 15

Danny: This quarter was really about establishing credibility with the community.

Results aren't that interesting but it's great to establish a baseline.

People on Russian Wikipedia very happy with the support they're receiving b/c a couple of staff members very engaged there.

Commons is very unhappy with us, feel like an afterthought.

Lila: it makes sense, they do need help

Danny: Wishlist survey Nov/Dec had 634 respondents.

Third work: while we do this survey, do some dev work as well. Goal was to have 3 mid/large side projects this quarter, and we met that goal.

Lila: Was the community happy?

Danny: We've seen positive response in any kind of communication we've been having. We're going to publish a first preliminary status report, and we'll continue to publish those regularly.

Lila: So once a quarter?

Danny: Yes, and also stuff on wiki.

Lila: Even though you get those requests first, doesn't mean your team has to do all of them.

Danny: Yes, absolutely

Slide 16

Danny: This is the list we're committing to evaluating and responding to. Sometimes our team builds it, sometimes the work is to be done by other teams, e.g.:

#3 needs shadow namespaces, which Kunal & Mz are working on;
#6, translating Commons categories, will prob be done by Wikidata.

Lila: These are kinds of maintenance tasks that need to be continually run, right?

Danny: We're looking at the pieces we can put in place... A couple different approaches.

Lila:

Toby: To establish credibility, we need to ask what needs to be done, and iterate

Lila: what are you committing to for this quarter? Top 3?

Danny: Wayback machine, pageview stats

Ryan Kaldari: We're also looking into numerical sorting, but we have not decided if we're taking it up this quarter.

Toby: some of those tools not the marquee things, but these are things that the community uses every day.

Slide 17

Kaldari: We also completed work on long-tail small issues. Example: fixing the citation bot. #1 request from the 2014 All Our Ideas survey; it had actually been completely broken for 3 months (it fully broke in the https-only API change-over).

Complicated piece of software, 3 different interfaces for it. It was completely broken. we fixed the gadget code, fixed the API, made security fixes, simplified the code for easier maintenance, fixed a lot of bugs. The community was very happy about that. Very popular gadget. 11,000 users on English Wikipedia use it.

Slide 18

Kaldari: HotCat gadget. Also from the All Our Ideas survey (#5). Used on almost all Wikipedia sites. Actually the most-used gadget. Unfortunately, on most of the small WPs, it was broken because of several API changes, gadget loading through RL, scoping changes, so many sources of gadget breakage.

We fixed HotCat on over 100 wikis by setting local configuration and having it import HotCat from the Commons installation. So it will no longer need to be individually maintained on each wiki. Much better for smaller wikis that don't have gadget maintainers.

Lila: This is great. As you fix/update this, are you also looking at the UI improvements that could be done?

Kaldari: We looked at a few design ideas to improve the interface for HotCat, and discussed them with HotCat users, but there was a lot of skepticism about updating the user interface. Didn't get buy-in from the community, so we decided to keep it as it was for now.

Toby: Danny and I working on getting additional design resource.

Lila: this team is so focused on current, super experienced editors, so we need to be very sensitive about changing UI. But generally, categories are challenging

Slide 19

Kaldari: Also completed longstanding request for usage statistics on gadgets. Created page for gadget usage stats per wiki.

Lets local admins know which gadgets to maintain, which to retire because they're not being used, gives the Foundation engineering teams an idea of which gadgets might be worth productizing.

Appendix: Strategy

Slide 20

Slide 21

Slide 22

Toby: We've taken a pillar and assigned it to a team

Slide 23

Slide 24

Toby: Of the 6 approaches for reach that the WMF strategy process has identified, we're actually working on 6. That means we also have some data on this as well.

Appendix: Community

Slide 25

Toby: Moushira has been awesome.

Slide 26

Appendix: User understanding

Slide 27

Toby: We've had a series of projects to help us get a better understanding of our users

Slide 28

Toby: Going to have proxy for readers by wiki, country...

Considering a new metric 'MAUD': monthly active users [or?] devices

Lila: We want people coming back. we want that continuous connection.

Toby: Retention is something we have on the apps. I talked to ? about it to have the ? Goal is to have engagement, retention and reach across all of our platforms.

Lila: Do you think you can reach retention goals without having people get user profiles / more identity features?

Toby: I think there's a significant gain from adding content that's refreshed in a more consistent way.

Ethnographic research: see how people approach knowledge in emerging communities

Finally, on the analytics side, worked with Analytics to install Piwik; Josh will have a good dashboard. Really making progress on understanding our users.

Slide 29

Toby: Collaboration with the R&Data team to understand reader motivation
about 1/3 split: fact / overview / in-depth. this is enwiki mobile+desktop. People who are familiar or not with the information

Research is continuing to run this on other projects.

Lila: I love the "bored" category. this is something we want, instead of something something mindless

Appendix: API engagement

Slide 30

Slide 31

Bryan: We don't have a good breakdown atm. Slide 33 has user agent data

Slide 32

Slide 33

Bryan: It's recommended to use a specific UA when accessing the MediaWiki API. These are the top 10 UA by volume for December.
5 of those are major web browsers.
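
A client that identifies itself with a specific User-Agent, as recommended above, might set the header like this. It is a minimal sketch: the tool name, URL and contact address in the header are placeholders, and the block only builds the request object without sending it.

```python
from urllib.request import Request

# Placeholder identity; a real tool would use its own name and contact details.
USER_AGENT = "ExampleTool/0.1 (https://example.org/exampletool; exampletool@example.org)"


def build_api_request(url):
    """Create a request that carries a descriptive User-Agent header."""
    return Request(url, headers={"User-Agent": USER_AGENT})


request = build_api_request(
    "https://en.wikipedia.org/w/api.php?action=query&meta=siteinfo&format=json"
)
```

A descriptive header lets usage reports like the top-10 list attribute traffic to a tool instead of lumping it in with generic library or browser defaults.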

When we get the new pipeline in place this will be easier.

Lila: How much of the traffic is internal vs external?

Bryan: Majority of activity at Varnish servers is external clients or tools/bots running in Labs. 9% are IP addresses we can associate with WMF cluster
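
One way to estimate an internal share like the 9% figure is to match client IP addresses against known network ranges. The notes do not describe how the number was actually derived, so this is only an illustrative sketch: the function names are hypothetical and the "internal" network is a placeholder taken from the RFC 5737 documentation ranges, not a real WMF range.

```python
import ipaddress

# Placeholder range (RFC 5737 documentation block), standing in for real
# cluster networks that are not given in the notes.
INTERNAL_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]


def classify(ip_string):
    """Label an address 'internal' if it falls inside a known network."""
    ip = ipaddress.ip_address(ip_string)
    if any(ip in network for network in INTERNAL_NETWORKS):
        return "internal"
    return "external"


def internal_share(ip_strings):
    """Fraction of requests whose client address is internal."""
    addresses = list(ip_strings)
    if not addresses:
        return 0.0
    internal = sum(1 for ip in addresses if classify(ip) == "internal")
    return internal / len(addresses)


# Hypothetical request log with one client address per request.
sample = ["192.0.2.10", "198.51.100.7", "203.0.113.5", "192.0.2.99"]
share = internal_share(sample)
```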

Lila: I wonder how much server capacity this is using.

Toby: I believe little of this is cached.

Bryan: Our current pipeline doesn't give us a lot of detail about what is cached or not. MMV uses caching but type-ahead suggestions don't and aren't really cacheable.

Go back to slide #13

Appendix: Q3 goals

Slide 34

Slide 35

Slide 36

Slide 37

Appendix: Health metrics

Slide 38

Slide 39

Tilman: We worked a bit on getting comparison data.
data starts in April because that's when we got new pageview data

Lila: What's the bottom scale?

Tilman: That's just quarters.

Toby: If there's one takeaway from all this, we need to work with fundraising to get mobile fundraising up. We did manage to get some fundraising into the apps.

Slide 40

Tilman: the mobile singularity has arrived!

Slide 41

Slide 42

Slide 43

Tilman: Android app got quite a boost from being featured on Google Play

Google also ran an experiment where they showed app install links next to search results, and we were in a lot of those search results. Now they've switched it off and our installs have dropped.

Lila: biggest concern: our difference between install and uninstalls needs to start increasing. App needs to be essential enough not to uninstall.