Wikimedia monthly activities meetings/Quarterly reviews/Discovery, January 2016

From Meta, a Wikimedia project coordination wiki

Notes from the Quarterly Review meeting with the Wikimedia Foundation's Discovery team, January 21, 2016, 8:45 - 9:30 AM PT.

Please keep in mind that these minutes are mostly a rough paraphrase of what was said at the meeting, rather than a source of authoritative information. Consider referring to the presentation slides, blog posts, press releases and other official material.

Presentation slides from the meeting

Discovery

Slide 1 (KPIs)

KPIs

  • User satisfaction: doubled in Q2 (15% to 28%)
  • The doubling is the result of implementing a change to the stats
  • Zero results rate: 33% to 26%, which is normal variance
  • Began collecting last fiscal year; still collecting comparative data

Slide 2 (Search language detection)

  • Ran an A/B test to adjust search queries based on detected language
  • Quantitative analysis showed that people search in non-English languages on English Wikipedia
  • Surveys showed that users actively do this
  • Ran the A/B test, but the impact was not statistically significant
  • The language detection library was not good at detecting the language of short queries
  • Lila: Are we looking at other language detection options?
  • Yes, we have a couple of leads on new libraries that could be used
  • Lila: We could prompt the user if we're not sure of the language
  • Lila: How common is this problem?
  • On enwiki, about 7% of queries that get no results are not in English
  • Generally, each change we make seems to have a large impact for a small number of people
  • Making the experience better is a LOT of small changes
  • Lila: ?
  • We could index all wikis in one place, but would need more hardware
  • Wes: How many A/B tests were run?
  • Ran one test for this specific goal
  • Since the A/B test results were disappointing, the team refocused on the completion suggester
  • Dropped zero results by 10%
  • Feedback has been very positive
  • This was an absolute minimum product, so it can't become the default yet
  • Wes: Will completion suggester become default this quarter?
  • Yes. Incremental rollout.
  • Should improve zero results; earlier tests showed maybe 10%
  • Hopefully would improve satisfaction, but we don't have qualitative evidence on that yet
  • Based on user feedback, we are optimistic
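The notes above say the language detection A/B test had no statistically significant impact. As a rough illustration of what such a check can look like, here is a minimal sketch of a two-proportion z-test on zero results rates. The counts are hypothetical and the function name is invented for this example; the notes do not say which statistical method the team actually used.

```python
import math


def two_proportion_z_test(successes_a, total_a, successes_b, total_b):
    """Return (z, two-sided p-value) for the difference between two proportions."""
    p_a = successes_a / total_a
    p_b = successes_b / total_b
    # Pooled proportion under the null hypothesis that both groups share one rate.
    pooled = (successes_a + successes_b) / (total_a + total_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    z = (p_b - p_a) / std_err
    # Two-sided p-value from the standard normal CDF, written with math.erf.
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value


# Hypothetical counts: queries with zero results in control (a) and test (b).
z, p = two_proportion_z_test(successes_a=330, total_a=1000,
                             successes_b=310, total_b=1000)
print(f"z = {z:.2f}, p = {p:.3f}")
```

With these made-up counts the p-value is well above the conventional 0.05 threshold, so a two-point drop in the zero results rate could not be distinguished from noise at this sample size.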

Slide 3 (Portal)

  • Measure usage of the portal, perform an A/B test, and decrease the time users spend searching on the page
  • Portal has been migrated to git/gerrit
  • Used to be a static page with some CSS and JS on Meta-Wiki, plus a script named extract2.php that hit the API and pushed out the page
  • Lila: Is this typical for other MediaWiki pages?
  • Yes, version control is not common for pages like this
  • We engaged with primary maintainer (mxn) who makes 90%+ of edits to the portal
  • He was very supportive of the migration
  • Other (mostly uninvolved) people objected; took a couple months to work through that
  • We did launch the first A/B test right before the end of the quarter, but too late to hit this goal
  • Early results are showing a 5% increase in conversion with no loss of interaction
  • Lila: What is 5% in actual pages?
    • ___ million per week
  • Wes: Be sure to specify bot/non-bot traffic
  • In this case, it's virtually all non-bot
  • After migrating to gerrit, we were able to add instrumentation

Slide 4 (improve satisfaction metric)

  • Iterate on user satisfaction by running a QuickSurvey
  • Lila: How did we create the User Satisfaction metric? Is it industry standard?
  • Yes, Google uses something similar
  • Planned to run a survey on the result page users ended up on, asking if they were satisfied, and compare the answers with our quantitative satisfaction data
  • Quick survey was delayed, and then the deployment freeze and fundraiser cut into the available time to do a survey like this
  • Lila: If it is an industry standard, then let's make sure to benchmark against it
  • Wes: Quick survey is being used by multiple teams
  • We had a couple extra requirements. It just wasn't ready
  • Lila: When will we know?
  • After we have several weeks of data, so might not have answers until Q4
  • Lila: On all these slides, we're not showing impact as clearly as we should. To align with FDC format, the template should bring out impact more clearly
  • Wes: Trevor has started a page to explore different formats to improve the template
  • Tomasz: Distinguish between outputs and outcomes
  • Greg (via Etherpad): That sounds good - let's connect offline to discuss possible revisions before next QRs

Slide 5 (evaluate maps/WDQS)

  • We had done beta-level deployments, so wanted to review user feedback
  • Maps was rolled out to some Wikivoyage wikis; got good feedback
  • We made some improvements based on feedback
  • Both WDQS and maps: request rate spiked on launch and has since settled to a normal level that is growing organically
  • We have <1 engineer on WDQS and <1 on maps
  • Android app "nearby" was a flat list; now is a map using our tile server
  • Lila: No map for San Diego. Are we going to have a bot add maps?
  • Any place that already had a map got the new one automatically. Adding new maps is a manual process
  • We have prototyped adding maps via Visual Editor, interactively/visually (Thanks to Ed Sanders, who did prototyping before a )
  • Lila: How are we measuring success/impact?
  • We measure number of tiles served, and number of users who see tiles (pageviews on wikivoyage, android)
  • Eventually we want to measure discovery: getting from a map to somewhere else. There are technical challenges to that.
  • Wes: What about page performance?
  • Lila: It needs to be an automatic way of cross-connecting projects
  • WDQS: Some of the most prominent consumers are planning to switch to our service.
  • We plan to continue to support
  • We will upgrade to Blazegraph 2.0 when released (this quarter)
  • Add geospatial searches
  • Lila: Who are the main consumers of WDQS?
  • Mostly previous users of Magnus's WQS, which was on Labs
  • Our initial goal was to move these users, and they have
  • We are enabling people to build more tools on top of this
  • WDQS has beta-level support, which is a step up from labs
  • Lila: Are we looking at adding write features?
  • Discovery is not; Wikidata folks are looking into it
  • Lila: Will we get Wikidata folks into this process?
  • Wes: That is the intent

Slide 6 (referrer metrics)

  • We have a dashboard for referrer traffic (Google, Bing, etc)
  • Lila: I was told that 2/3 was from Google, which is different from this graph
  • Mental note for Greg: Bring attn to that stat in public display
  • Lila: Interesting trends, like the Google holiday dip
  • Sylvia: Who are the "non-search"?
  • Includes clicks on internal links; any direct link from anywhere on the web
  • Lila: Can we separate internal page views?
  • Getting to Wikipedia directly vs. article queries (clicks from within vs. from without)
  • Wes: ? (missed what he said that Sylvia +1'd)
  • We got a lot done, but didn't quite complete the goal as written
  • Lila: Want to follow up with more questions later
  • Tomasz: Shifting our analyst was the right choice, but was painful. Threw off our planning. Consider as a systemic issue moving forward.
  • Wes: Yes, we're aware and working on it
  • Kudos to Dario for writing up a rationale, and sharing appropriately. Process went well.
  • Lila: Not just looking at Wikipedia; looking at all projects
  • Non-Wikipedia projects may be hard to find. Maybe that explains the difference above related to "2/3 of traffic"
  • Lila: Can we surface Commons, Wikisource, etc. at a much higher rate? It's coming up in the strategy consultation too
  • Yup
  • Lila: Goal is not to cannibalize any project

Slide 7 (portal migration)

  • Mostly delayed due to uninvolved users
  • mxn corrected this by saying that he was involved
  • Two factors exacerbated the problem
    • No CL support for Discovery
      • CL hired
    • Not enough product support
      • PM hired
  • In the end, nobody objected. But it took 2+ months.
  • Met with Wikidata folks in Germany
  • Wikidata folks are pushing ahead, with our support
  • Trip was incredibly worthwhile

Slide 8 (improving analysis)

We now have a standard set of metrics and reporting that PMs can use, thanks to our data analysts' work.

  • Wes: Editing is also using Discovery's A/B testing processes?
  • They are at least interested
  • Hired our PM on the first working day of Q3
  • Dedicated Discovery CL starts next week

Slide 9

Slide 10

  • Completion suggester is still fundamentally a prefix search using Elasticsearch, but it compensates for small errors such as two typos
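To make the idea of a typo-tolerant prefix search concrete, here is a minimal brute-force sketch: a title matches if some prefix of it is within two edits of what the user has typed. This is only an illustration of the concept. The function names and sample titles are invented for this example, and it is not how Elasticsearch implements its completion suggester.

```python
def prefix_edit_distance(query, title):
    """Smallest Levenshtein distance between `query` and any prefix of `title`."""
    # prev[j] = distance between the query characters consumed so far and title[:j].
    prev = list(range(len(title) + 1))
    for i, qc in enumerate(query, start=1):
        curr = [i]
        for j, tc in enumerate(title, start=1):
            cost = 0 if qc == tc else 1
            curr.append(min(prev[j] + 1,          # drop a query character
                            curr[j - 1] + 1,      # skip a title character
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    # The best prefix is whichever title[:j] ends up closest to the full query.
    return min(prev)


def suggest(query, titles, max_edits=2):
    """Return titles with a prefix within `max_edits` of the query, best first."""
    scored = []
    for title in titles:
        distance = prefix_edit_distance(query.lower(), title.lower())
        if distance <= max_edits:
            scored.append((distance, title))
    return [title for distance, title in sorted(scored)]


# Hypothetical titles, for illustration only.
titles = ["Albert Einstein", "Alberta", "Algebra", "Berlin"]
print(suggest("albrt ein", titles))
```

Comparing the query against every title is far too slow for a real index; the sketch trades speed for readability to show why a mistyped prefix can still find the intended page.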

Appendix (portal screenshots)

  • Shows an A/B/C test we ran on the portal
  • "This draws people to the search box, but the search feature isn't great"
  • "This improves the search process"
  • We plan to push the improved search dropdown to production
  • Wes: When presenting results, distinguish between desktop and mobile
  • Apps team did qualitative surveys, and people liked images
  • Lila: We should create lists of missing images that community could help fill in
  • Lila: Are the descriptions coming from Wikidata? (top 100, top 1000)
  • Yes, good idea.
  • Quim: Interesting idea. We'll talk.
  • Lila: Rollout plans?
  • We have plans for a lot of potential changes
  • By the end of this quarter, we would like to push this for everyone. Will bring desktop experience closer to mobile. 5% improvement is substantial (12% of people who left the page without doing anything; millions of people)
  • Lila: Could we prompt users if they search in the wrong language?
  • We are looking to improve the existing language picker
  • Lila: Does anyone click on the language links around the globe?
  • About 10%
  • Placement of languages around the globe is known not to be great (mxn doesn't like it)
  • There is a reason they are arranged as they are, but could be better
  • Lila: Need to dig deeper into SEO/referrals, drive more awareness
  • Wes: we are working with partnerships to understand the full funnel
  • Lila: Discovery's role is not clear. Make mission clear; improve messaging
  • Lila: Recommend sending a monthly update to wikimedia-l (or wherever is appropriate)