Wikimedia monthly activities meetings/Quarterly reviews/Discovery, January 2016
Please keep in mind that these minutes are mostly a rough paraphrase of what was said at the meeting, rather than a source of authoritative information. Consider referring to the presentation slides, blog posts, press releases and other official material.
Slide 1 (KPIs)
- User satisfaction - doubled in Q2 (15-28%)
- The doubling is the result of implementing a stats change
- Zero results rate - (33-26%) - normal variance
- Began collecting last fiscal year - still collecting comparative data
Slide 2 (Search language detection)
- Ran an A/B test to adjust search queries based on detected language
- Quantitative analysis showed that people search in non-English languages on English Wikipedia
- Surveys showed that users actively do this
- Ran A/B test but the impact was not statistically significant
- Language detecting library was not good at detecting short queries
- Lila: Are we looking at other language detection options?
- Yes, we have a couple of leads on new libraries that could be used
- Lila: We could prompt the user if we're not sure of the language
- Lila: How common is this problem?
- On enwiki, about 7% of queries that get no results are not in English
- Generally, each change we make seems to have a large impact for a small number of people
- Making the experience better is a LOT of small changes
- Lila: ?
- We could index all wikis in one place, but would need more hardware
- Wes: How many A/B tests were run?
- Ran one test for this specific goal
- Since the A/B test results were disappointing, the team re-aligned to the completion suggester
- Dropped zero results by 10%
- Feedback has been very positive
- This was an absolute minimum product, so can't become default yet
- Wes: Will completion suggester become default this quarter?
- Yes. Incremental rollout.
- Should improve zero results; earlier tests showed maybe 10%
- Hopefully would improve satisfaction, but we don't have qualitative evidence on that yet
- Based on user feedback, we are optimistic
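As a rough illustration of what "not statistically significant" means for a zero results rate comparison like the one discussed on this slide, here is a minimal sketch of a two-proportion z-test. The query counts are hypothetical and are not taken from the team's test; the minutes do not say which statistical method was actually used.

```python
import math


def two_proportion_z_test(hits_a, n_a, hits_b, n_b):
    """Return (z, two-sided p-value) for the difference between two proportions."""
    p_a = hits_a / n_a
    p_b = hits_b / n_b
    # Pooled proportion under the null hypothesis that both groups share one rate.
    pooled = (hits_a + hits_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value of the standard normal distribution via the
    # complementary error function.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: queries with zero results out of all queries per group.
control_zero, control_total = 2600, 10000
small_zero, small_total = 2540, 10000   # small change in the test group
large_zero, large_total = 2340, 10000   # roughly a 10% relative drop

z_small, p_small = two_proportion_z_test(control_zero, control_total, small_zero, small_total)
z_large, p_large = two_proportion_z_test(control_zero, control_total, large_zero, large_total)

print(f"small change: z={z_small:.2f}, p={p_small:.3f}")
print(f"large change: z={z_large:.2f}, p={p_large:.6f}")
```

With these made-up numbers, the small change cannot be distinguished from noise at the usual 0.05 threshold, while the larger drop can.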
Slide 3 (Portal)
- Measure usage of the portal, perform an A/B test, decrease the time users spend searching on the page
- Portal has been migrated to git/gerrit
- Used to be a static page with some css and js on meta wiki and a script named extract2.php that hit the api and pushed out the page
- Lila: Is this typical for other MediaWiki pages?
- Yes, version control is not common for pages like this
- We engaged with primary maintainer (mxn) who makes 90%+ of edits to the portal
- He was very supportive of the migration
- Other (mostly uninvolved) people objected; took a couple months to work through that
- We did launch the first A/B test right before the end of the quarter, but too late to hit this goal
- Early results are showing a 5% increase in conversion with no loss of interaction
- Lila: What is 5% in actual pages?
- ___ million per week
- Wes: Be sure to specify bot/non-bot traffic
- In this case, it's virtually all non-bot
- After migrating to gerrit, we were able to add instrumentation
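The conversion comparison and the bot/non-bot distinction raised above can be sketched as follows. The event format, group names, and counts are hypothetical, not the portal's actual instrumentation. The minutes also do not say whether the 5% figure is an absolute or a relative change, so the sketch reports both.

```python
from collections import defaultdict


def conversion_by_group(events):
    """Conversion rate per test group, ignoring traffic flagged as bot.

    Each event is a hypothetical (group, is_bot, converted) tuple.
    """
    visits = defaultdict(int)
    conversions = defaultdict(int)
    for group, is_bot, converted in events:
        if is_bot:
            continue  # keep bot traffic out of both numerator and denominator
        visits[group] += 1
        if converted:
            conversions[group] += 1
    return {group: conversions[group] / visits[group] for group in visits}


def lift(control_rate, test_rate):
    """Return (absolute change, relative change) of the test rate over control."""
    absolute = test_rate - control_rate
    relative = absolute / control_rate
    return absolute, relative


# Made-up sample: 10 human visits per group, plus 2 bot visits in control.
events = (
    [("control", False, True)] * 4
    + [("control", False, False)] * 6
    + [("control", True, False)] * 2
    + [("test", False, True)] * 5
    + [("test", False, False)] * 5
)

rates = conversion_by_group(events)
absolute, relative = lift(rates["control"], rates["test"])
print(f"control={rates['control']:.2f} test={rates['test']:.2f}")
print(f"absolute change={absolute:+.2f}, relative change={relative:+.0%}")
```

Reporting the absolute and relative change side by side avoids the ambiguity of a bare "5%" figure.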
Slide 4 (improve satisfaction metric)
- Iterate on user satisfaction by running a QuickSurvey
- Lila: How did we create the User Satisfaction metric? Is it industry standard?
- Yes, Google uses something similar
- Planned to run a survey on the result page they ended up on, asking if they were satisfied, and compare with our quantitative satisfaction data
- Quick survey was delayed, and then the deployment freeze and fundraiser cut into the available time to do a survey like this
- Lila: If it is an industry standard then let's make sure to benchmark against it
- Wes: Quick survey is being used by multiple teams
- We had a couple extra requirements. It just wasn't ready
- Lila: When will we know?
- After we have several weeks of data, so might not have answers until Q4
- Lila: On all these slides, we're not showing impact as clearly as we should. To align with FDC format, the template should bring out impact more clearly
- Wes: Trevor has started a page to explore different formats to improve the template
- Tomasz: Distinguish between outputs and outcomes
- Greg (via Etherpad): That sounds good - let's connect offline to discuss possible revisions before next QRs
Slide 5 (evaluate maps/wdqs)
- We had done beta-level deployments, so wanted to review user feedback
- Maps was rolled out to some wikivoyage; got good feedback
- We made some improvements based on feedback
- Both WDQS and maps: Request rate spiked on launch and has now settled at a normal level that is growing organically
- We have <1 engineer on WDQS and <1 on maps
- Android app "nearby" was a flat list; now is a map using our tile server
- Lila: No map for San Diego. Are we going to have a bot add maps?
- Any place that already had a map got the new one automatically. Adding new maps is a manual process
- We have prototyped adding maps via Visual Editor, interactively/visually (Thanks to Ed Sanders, who did prototyping before a )
- Lila: How are we measuring success/impact?
- We measure number of tiles served, and number of users who see tiles (pageviews on wikivoyage, android)
- Eventually we want to measure discovery--getting from a map to somewhere else. There are technical challenges to that.
- Wes: What about page performance?
- Lila: It needs to be an automatic way of cross-connecting projects
- WDQS: Some of the most prominent consumers are planning to switch to our service.
- We plan to continue to suppor
- We will upgrade to blazegraph 2.0 when released (this quarter)
- Add geospatial searches
- Lila: Who are the main consumers of WDQS?
- Mostly previous users of Magnus's WQS which was on labs
- Our initial goal was to move these users, and they have
- We are enabling people to build more tools on top of this
- WDQS has beta-level support, which is a step up from labs
- Lila: Are we looking at adding write features
- Discovery is not; Wikidata folks are looking into it
- Lila: Will we get wikidata folks into this process?
- Wes: That is the intent
Slide 6 (referrer metrics)
- We have a dashboard for referrer traffic (Google, Bing, etc)
- Lila: I was told that 2/3 was from google, which is different from this graph
- Mental note for Greg: Bring attention to that stat in public display
- Lila: Interesting trends, like google holiday dip
- Sylvia: Who are the "non-search"?
- Includes clicks on internal links; any direct link from anywhere on the web
- Can we separate internal page views?
- Getting to wikipedia direct vs article queries (clicks from within vs. from without)
- Wes: ? (missed what he said that Sylvia +1'd)
- We got a lot done, but didn't quite complete the goal as written
- Lila: Want to follow up with more questions later
- Tomasz: Shifting our analyst was the right choice, but was painful. Threw off our planning. Consider as a systemic issue moving forward.
- Wes: Yes, we're aware and working on it
- Kudos to Dario for writing up a rationale, and sharing appropriately. Process went well.
- Lila: Not just looking at wikipedia; looking at all projects
- Non-wp projects may be hard to find. Maybe that explains the difference above related to "2/3 of traffic"
- Lila: Can we surface commons, sources, etc. at a much higher rate? It's coming up in the strategy consultation too
- Lila: Goal is not to cannibalize any project
Slide 7 (portal migration)
- Mostly delayed due to uninvolved users
- mxn corrected this by saying that he was involved
- 2 factors exacerbated the problem
- No CL support for Discovery
- CL hired
- not enough product support
- PM hired
- In the end, nobody objected. But it took 2+ months.
- Met with wikidata folks in Germany
- Wikidata folks are pushing ahead, with our support
- Trip was incredibly worthwhile
Slide 8 (improving analysis)
- We now have a standard set of metrics and reporting that PMs can use, thanks to our data analysts' work
- Wes: Editing is also using Discovery's A/B testing processes?
- They are at least interested
- Hired our PM first working day of Q3
- Dedicated Discovery CL starts next week
- Completion suggester is still fundamentally a prefix search using Elasticsearch, but compensates for small errors such as two typos
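To illustrate the idea of a prefix search that tolerates a couple of typos, here is a minimal pure-Python sketch. It is not how Elasticsearch implements its suggester; it only demonstrates the behaviour described above. The function names, the sample titles, and the rule that scales the allowed edits with query length are all assumptions made for this example.

```python
def prefix_edit_distance(query, title):
    """Smallest Levenshtein distance between `query` and any prefix of `title`."""
    # prev[j] holds the distance between the query consumed so far and title[:j].
    prev = list(range(len(title) + 1))
    for i, qc in enumerate(query, start=1):
        curr = [i]
        for j, tc in enumerate(title, start=1):
            cost = 0 if qc == tc else 1
            curr.append(min(prev[j] + 1,          # drop a query character
                            curr[j - 1] + 1,      # add a title character
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    # The final row covers the whole query against every title prefix.
    return min(prev)


def allowed_edits(query):
    """Assumed policy: very short queries must match exactly, longer ones get up to 2 edits."""
    n = len(query)
    if n < 3:
        return 0
    if n < 6:
        return 1
    return 2


def suggest(query, titles):
    """Titles whose prefix is within the allowed edits of the query, closest first."""
    q = query.lower()
    limit = allowed_edits(q)
    scored = [(prefix_edit_distance(q, t.lower()), t) for t in titles]
    return [t for d, t in sorted(scored) if d <= limit]


titles = ["Albert Einstein", "Alberta", "Barack Obama"]  # hypothetical index
print(suggest("alb", titles))          # exact prefix of two titles
print(suggest("albret eins", titles))  # two substitutions away from "albert eins"
print(suggest("einstien", titles))     # no title starts with this, so nothing matches
```

The last query shows why this is still a prefix search: a misspelled word from the middle of a title finds nothing, because only the start of each title is compared.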
Appendix (portal screenshots)
- Shows an A/B/C test we ran on the portal
- "This draws people to the search box, but the search feature isn't great"
- "This improves the search process"
- We plan to push the improved search dropdown to production
- Wes: When presenting results, distinguish between desktop and mobile
- Apps team did qualitative surveys, and people liked images
- Lila: We should create lists of missing images that community could help fill in
- Lila: Are the descriptions coming from wikidata? (top 100, top 1000)
- Yes, good idea.
- Quim: Interesting idea. We'll talk.
- Lila: Rollout plans?
- We have plans for a lot of potential changes
- By the end of this quarter, we would like to push this for everyone. Will bring desktop experience closer to mobile. 5% improvement is substantial (12% of people who left the page without doing anything; millions of people)
- Lila: Could we prompt users if they search in the wrong language?
- We are looking to improve the existing language picker
- Lila: Does anyone click on the language links around the globe?
- About 10%
- Placement of languages around the globe is known not to be great (mxn doesn't like it)
- There is a reason they are arranged as they are, but could be better
- Lila: Need to dig deeper into SEO/referrals, drive more awareness
- Wes: we are working with partnerships to understand the full funnel
- Lila: Discovery's role is not clear. Make mission clear; improve messaging
- Lila: Recommend sending a monthly update to wikimedia-l (or wherever is appropriate)