Wikimedia Foundation metrics and activities meetings/Quarterly reviews/Editing, January 2016
Please keep in mind that these minutes are mostly a rough paraphrase of what was said at the meeting, rather than a source of authoritative information. Consider referring to the presentation slides, blog posts, press releases and other official material.
Attendees: Neil Quinn, Trevor Parscal, Leila Zia, Guillaume Paumier, James Forrester, Roan Kattouw, Tomasz Finc, Lila Tretikov, Joe, Joel Aufrecht + many people in BlueJeans
- 1 Introduction
- 2 Collaboration Team
- 3 Language Team
- 4 Multimedia team
- 5 Parsing team
- 6 VisualEditor team
- 7 Other comments
Last quarter: Numbers a little up; more details later
Neil: Monthly active editors (our key metric)
pretty flat over the past 6 years
Neil: New active editors (registered and made 5 edits). New user intake. Trend is mostly flat as well. Dropoff in the past 6 months.
Percentage change over the past 5 years. New active editors (red) generally mirrors the active editors but more strongly
Further context: new active editors with a comparable active editors number. Previous slide: not comparable because we use different methods.
Lila: What changes (up or down) could explain these fluctuations? What are the hypotheses? When are you going to investigate?
Neil: No hypotheses at the moment. In the coming months, I'm not sure how much time I'll have to dive into this because I'll be preparing for the big VisualEditor A/B test.
Trevor: What's really important is to understand the difference between what we did and environmental factors. It's hard for us to point to anything we've done.
Lila: At a minimum we can slice by source and user agent to get some more resolution on what could explain this.
Trevor: We're starting to collect more data. As you start measuring more and more accurately, you start to discover more patterns. These numbers are similar to what we had in 2011 and 2012.
JamesF: In 2011 for example we didn't have Wikidata, so the scope of changes
Lila: When you look at a graph like that, it's such a red flag that you want to dig in, to find the biggest driver.
Tomasz: There are also pretty significant environmental factors and we don't necessarily know what those are.
JamesF: In response to Geoff's question in chat about hypotheses: I don't have a hypothesis at the moment.
(some of this was related to slide 4)
Trevor: We can also get those numbers more regularly and in more automated fashion.
JamesF: and not just to lower our workload, but also to show to our community, because we're partners in this
Lila: Once you identify the cause, you want to design an intervention
Tomasz: Discovery has had a platform for dashboards for a while, can help if needed
Lila: Is there a correlation with forced login on apps/mobile? SSL?
JamesF: That theoretically shouldn't change those numbers. Re: SSL: I'm afraid we're using that as a boogeyman and missing some other change at that time that would explain many changes we've seen across metrics around that time.
Neil: Edits to Wikipedia articles: also flat over the past 2 years
Neil: Edits from mobile increasing slightly
Roan: First goal cross-wiki notifications: done. Now available on test wikis (test.wp.o and test2.wp.o)
This is the kind of feature where we expect that we'll get a lot of feedback once humans use it
Lila: Do we have target for being out of beta in the coming quarter on any wikis?
Roan: Hopefully by the end of the quarter, but better to do it right than do it fast.
Lila: Is there an API for this? e.g. to send a message to all wikis. e.g. replacement for CentralNotice.
Roan: we haven't looked at this for this quarter; focused on existing notifications. Worried about creating a channel for anonymous harassment.
Trevor: The source of notifications has to be public
JamesF: We have MassMessage to send messages to users' talk pages. We need to be careful about avoiding notifications fatigue
Lila: Worth opening a conversation with the community about how to mitigate that.
Tomasz: What about notifications on mobile?
Roan: One issue is that the mobile site implemented notifications from scratch. Which means we're not breaking it but it's not getting anything new.
Lila: I'd like us to think about mobile first, then desktop. Because if it works on mobile...
Trevor: In this case, we have a preexisting implementation that we're building on top of.
Wes: Do you have a CL working on this? What's the reception like?
Roan: Nick Wilson. This hasn't been deployed to real wikis yet.
Lila: My thoughts about mobile first are more an overarching theme for our work.
Trevor: We need to unify the mobile and desktop tools
Pau: Notifications on desktop happen in a small area; on mobile they would take the full view.
Lila: Progress on Flow this quarter has been impressive - thank you.
Roan: Spam attack and unique issues with Chinese Wikipedia and target nations for rollout
Roan: Did do research, but did not achieve goal of finishing research
Joe: Next round of research now planned
Roan: For example, old revert icon was not very understandable
Amir: Team is all distributed, no one in the US. Working mainly on content translation
Amir: Numbers look good in terms of returning translators. Translation suggestions have been a contributing factor. See exact numbers in Appendix slide
Also experimented with topical suggestions. Worked with Doc James on lists of medical articles for English to Persian. Will be more of that in next quarters
Lila: What's the delta between before/after recommendations went live, in percentages?
Amir: 4300 unique users in previous quarter, 5000 this quarter. (So about 16% up.)
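The "about 16% up" figure can be checked with a quick relative-change calculation. This is only a minimal sketch of the arithmetic; the helper name `percent_change` is made up for illustration, and the two inputs are the quarterly unique-user counts quoted above.

```python
def percent_change(previous: float, current: float) -> float:
    """Relative change from `previous` to `current`, expressed in percent."""
    if previous == 0:
        raise ValueError("previous must be non-zero")
    return (current - previous) / previous * 100.0


# Unique users per quarter, as quoted in the meeting.
change = percent_change(4300, 5000)
print(f"{change:.1f}% increase")
```

Note that this measures growth in the number of users, not translations per user, which is the measure Lila asks about next.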
Lila: If your measure of success is # of translations per user: On a per-user basis, how many more translations are happening with the recommendation tool?
Amir: Graph in appendix. Reduction in number of people who make only one translation.
Leila: I don't know the number for the measure of success. But: Around 16% of translations that are being started are started from suggestions. It is gradually increasing (we don't know why yet). Pau did some experiment with UX. Total translations are increasing, and the share coming from suggestions is increasing as well.
Lila: Make sure we focus on our measure of success when presenting.
Runa: We have this measure in the Appendix. We were tracking it during the quarter. Because of the increase in # of users, and reduction of users who do only 1 translation, actual % data isn't very high (maybe 1-2% increase).
Santhosh: uniform architecture speeds maintenance, analytics, etc. By December, we were migrated to service runner.
Amir: Some complaints from users about losing translations, inability to publish translations. We researched that. We started logging all the failures for translation, saving, restoring, publishing.
We fixed some bandwidth issues that were preventing users from publishing an article, and we saw a reduction in the number of failures.
The graph shows all publishing failures in blue, and the red line displays publishing failures excluding the ones that were caused due to AbuseFilter
AbuseFilter failures are now displayed with an error message that helps the translator identify the problem
Going forward we want to handle these errors more gracefully so that the core Content Translation workflow for a user is not interrupted heavily
Also connects to the feature improvements mentioned in the next slide.
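The two lines on the publishing-failures graph described above (all failures, and failures excluding AbuseFilter) could be derived from a failure log roughly as follows. This is a sketch under assumptions: the record fields `week` and `cause`, the cause labels, and the sample rows are all invented for illustration and do not reflect the actual Content Translation logging schema or real data.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class PublishFailure:
    week: str   # hypothetical week label, e.g. "2015-W50"
    cause: str  # hypothetical cause tag, e.g. "abusefilter" or "timeout"


def weekly_counts(
    failures: Iterable[PublishFailure],
    exclude_cause: Optional[str] = None,
) -> Dict[str, int]:
    """Count failures per week, optionally skipping one cause."""
    counts: Counter = Counter()
    for failure in failures:
        if exclude_cause is not None and failure.cause == exclude_cause:
            continue
        counts[failure.week] += 1
    return dict(counts)


# Invented sample rows, only to show the shape of the computation.
sample = [
    PublishFailure("2015-W50", "abusefilter"),
    PublishFailure("2015-W50", "timeout"),
    PublishFailure("2015-W51", "abusefilter"),
    PublishFailure("2015-W51", "abusefilter"),
    PublishFailure("2015-W51", "timeout"),
]

all_failures = weekly_counts(sample)                                  # "blue line"
non_abusefilter = weekly_counts(sample, exclude_cause="abusefilter")  # "red line"
```

The gap between the two series in a given week is the number of publishing attempts blocked by AbuseFilter, which is the part the new error message is meant to explain to translators.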
Amir: We want to be part of the wider translation technology community. We rely on other services for machine translation. Contributing back parallel corpora benefits us by helping machine translation tools improve.
Storing the source text and the translation in an accessible way (paragraph by paragraph) so that machine translation can determine how good the translation was.
Ready to be deployed (DBA schedule)
Lila: We won't be using this ourselves?
Amir: The partners will be using this for their machine learning. We're also using the same infrastructure to store translations for ourselves
Tomasz: Who are the partners?
Amir: Apertium (OSS project), Yandex (deployed in November; Russian company). Probably more in the future.
Trevor: Working with Sheree Chang on more partnerships.
Amir: Listening carefully to feedback every day (notably on unnecessary/ugly syntax). Notably planning to migrate the editing component of CX to use the visual editor. This should get rid of some of the syntax errors.
Amir: Slide says 48000 new articles, we now have 49000.
Translations on the Russian Wikipedia doubled after we enabled Yandex machine translation.
The migration to service-runner should help avoid outages in the future, due to better stability and monitoring
Joe: What does graph at bottom show?
Amir: Weekly translations in Russian.
Lila: The growth looks really healthy and good. I expect that to be the trend for new languages you enable machine translation for.
Amir: We're going to enable several more language pairs in the current quarter.
Runa: We've received requests from communities that were watching the development on the Russian Wikipedia.
Lila: Thanks to everyone involved. Some notes (everything I say here is a bit of a brainstorm):
- It's still hard to find (beta feature). How to figure out that a user (maybe a reader) is bilingual.
- Think mobile (x5). Especially in languages of countries where people are mostly using mobile
- Talk to other people working on Global South issues as well.
JamesF: cross-wiki upload from any wiki to Commons, and then insert it into the article the user is editing.
Now used for about 1000 files a day uploaded to Commons (~10% of all new files)
Mark: Average 527 unique users per day
JamesF: Some issues, which I'll discuss later, because the tool is now more openly available. Deletion rate is about the same for the same type of user on other tools.
James: Interested in investigating whether, for example, alongside/instead of Wikipedia articles readers could want something more like slideshows. Didn't get any interested academic collaborators.
Lila: Keeping goals for next quarter?
James: Yes; not a primary goal, but we're still very interested.
Guillaume (etherpad note): Working on this with J-Mo this quarter.
Lila: Why are the deletions happening?
James: Because the images are out of scope, or copyright violations, or otherwise not appropriate.
James: Ran test over Christmas for several weeks (2 weeks actually, I think). Result of the A/B test: pretty much same levels of uploads and deletion, so we went back to the original and are thinking about what improvements we can make next.
It's a non-trivial addition to the community's workload and we don't want to increase that.
About half of the files uploaded by new users (whether using UploadWizard or the cross-wiki upload tool) are deleted, but because the tool is more prominent, many more of its users are first-time uploaders
Lila: Additional features for the tool (e.g. bulk uploads adding categories)?
James: Additional features could turn it into a monster.
Trevor: More people uploading images is a good thing. It does add more pressure on the community because there's more patrolling to do, but it's a good thing.
James: The feedback loop for uploaders will be better with cross-wiki notifications.
Mark Holmquist: Rough numbers: New uploaders from the beginning of time on Commons (top line), en.wikipedia, and de.wikipedia (bottom).
Getting a lot more uploaders in January based on the cross-wiki upload tool
Lila: Top line is cumulative?
James: Per day.
Lila: Impressive graph.
Mark: All uploaders in a month. Similar to the previous graph.
Lila: Do we know why there's so much spikiness?
Guillaume, Mark: Wiki Loves Monuments, Wiki Loves Earth, mass uploads from museums and other institutions
Mark: All graphs are monthly data.
Lila: Progress looks great. Question about annotations and metadata. We're getting all these amazing images on Commons, but they're not surfaced in our tools. Semantic data surrounding image is going to be important. Are we thinking about that piece?
Trevor: We've made a lot of progress in structured data for Commons. It's probably the project that has the most structured data.
James: We're working with the Wikidata team on this. They're working on it, among other things. It's a huge change from the community's standpoint.
Lila: The community would also have to retroactively add data for old images.
James: Currently, categories are not multilingual. Need to make it so it works in all languages, not just English and German.
Trevor: That's why Wikidata is so important.
James: Search team and Multimedia team have to discuss who will work on improving media search.
Lila: I'm talking more about tagging images so they can be found by machines, not searched for by humans. Longer-term issue.
Tomasz: Mark and I talked about this a bit at the dev summit
Subbu: Improve editability of sections that contain content like gallery tags, currently only editable in wikitext editor.
Subbu: Tidy: cleans up the HTML that the PHP parser creates. Replacement of Tidy has been a long-standing request. We started working on this in Q1.
Trevor: This is part of a strategy we have to help the community to self-fix a lot of the crazy things they've been doing with templates. Some templates open tags that they don't close (unbalanced templates). We can't require templates to be balanced because that would break everything. This is a step in working with the community to improve the corpus of templates.
James: One of the things we want to achieve is partial editing—which mobile really needs. We could build a huge amount of tech to make it possible without changes to templates, but it still wouldn't be a great solution for the user. Better to solve it like this. Paying down tech debt in the content, rather than the software.
Lila: I think this is extremely important. Also a social goal. It's not really fair for the Parsing team to own all of this.
Trevor: First, we need a way to allow users to specify if a template should be balanced or not, and start enforcing it. Then we can make a worklist of templates to make balanced.
James: On the social side: the responsibility is mostly on the head of template issues; while the benefits are mostly for template users.
C. Scott: ... Long term we need a community push... But there's a lot we can do by addressing commonly used templates like Infobox (which already emit balanced HTML), which, as Tim noted, are used in a large number of pages.
Lila: Number 1 priority is to understand where the biggest value here is coming from. Let me know how I can help.
Subbu: This quarter, Scott has been working (e.g. discussions at dev summit) on this, so we expect to make more progress this quarter.
[ skipped remaining parsing slides because of shortage of time ]
James: Parsing team has been a huge support to VE for a long time. Thank you so much.
James: Numbers slightly shifting up. New editors get VE by default. Most new editors leave, but those who stay continue to use VE.
Trevor: Both Thalia and Frederic were Outreachy/GSoC students whom we've retained. It's one of our best ways of getting new talent. Two people early in their careers were able to create VE features: this speaks to the extensibility of the VE platform/infrastructure.
Lila: Is there going to be a library of tools or something?
Trevor: Will probably need to do that soon.
Lila: particularly because community members will probably start wanting to build their own extensions.
James: Also added editing and syntax highlighting of sheet music, because hey, why not?
Leila: The Gadget for link rec is ready, minus its UI. Tests need to come after the UI component of the gadget is completed.
James: Not just providing a dumb box, but guiding the user through the steps.
Tomasz: mobile readers now dominating desktop readers
Lila: ... It's fun to see where VE is evolving, and where you can experiment.
Trevor: We really set up VE as a foundational technology. We've put VE into Flow, and in the future it'll be in CX. We're trying to coalesce around these general solutions instead of one-off things.
Lila: Very smart. I'm in complete agreement. But it makes it that much more important that we get mobile figured out.
Lila: Really good update. Thank you. Every time we meet we make progress on understanding...
James: Yes. Would be nice to have fewer red goals?
Lila: Don't worry about having fewer red goals. You should be pushing yourself. Maybe fewer goals overall.
Trevor: Yes. It'll be a bit different this quarter. Just focus areas.
Lila: You want to understand why certain things are happening. Dig into data.