Grants:IEG/Revision scoring as a service/Midpoint


Welcome to this project's midpoint report! This report shares progress and learnings from the Individual Engagement Grantee's first 3 months.

Summary

  • We've built a generalized scoring system for revisions and pages.
  • We've trained models whose precision and accuracy are comparable to the state of the art.
  • We've documented our project and posted about it in three major Wikipedian news sources.
  • We've begun work on a revision coding service.

Generally, we're on or ahead of schedule, but we have had to reorder some of our deliverables: the revision coding service was scheduled to be finished by now, but we stood up the scoring service on Labs instead.

Methods and activities

So far, most of our work has focused on developing a generalized scoring system for revisions and pages in a MediaWiki installation -- with a focus on Wikipedia. We have implemented state-of-the-art, revert-predicting machine learning models for the English, Portuguese, Turkish and Azerbaijani Wikipedias. These models and their scores are accessible via an instance on Wikimedia Labs. For example, a request to http://ores.wmflabs.org/scores/enwiki?models=reverted&revids=4567890|4567892 returns a prediction and class probabilities for each requested revision:

{
  "4567890": {
    "reverted": {
      "prediction": false,
      "probability": {
        "false": 0.6967103285095867,
        "true": 0.3032896714904134
      }
    }
  },
  "4567892": {
    "reverted": {
      "prediction": false,
      "probability": {
        "false": 0.5479798477219674,
        "true": 0.45202015227803266
      }
    }
  }
}
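As a rough illustration of how a client might consume this output, here is a minimal Python sketch. The helper names (build_score_url, revert_probabilities) are hypothetical and not part of the service, and the URL layout is inferred only from the single example request above, so treat it as an assumption rather than a documented API. To stay runnable offline, the sketch parses a copy of the sample response instead of making a network call.

```python
import json
from urllib.parse import urlencode

# Base address taken from the example request above; assumed, not documented here.
BASE_URL = "http://ores.wmflabs.org/scores"

# Copy of the sample response shown above, so the sketch runs without network access.
SAMPLE_RESPONSE = """
{
  "4567890": {"reverted": {"prediction": false,
    "probability": {"false": 0.6967103285095867, "true": 0.3032896714904134}}},
  "4567892": {"reverted": {"prediction": false,
    "probability": {"false": 0.5479798477219674, "true": 0.45202015227803266}}}
}
"""


def build_score_url(wiki, model, rev_ids):
    """Build a request URL in the same shape as the example (hypothetical helper)."""
    query = urlencode(
        {"models": model, "revids": "|".join(str(r) for r in rev_ids)},
        safe="|",  # keep the pipe separator unescaped, as in the example URL
    )
    return f"{BASE_URL}/{wiki}?{query}"


def revert_probabilities(response_text, model="reverted"):
    """Map each revision ID to the probability that it will be reverted."""
    scores = json.loads(response_text)
    return {
        int(rev_id): result[model]["probability"]["true"]
        for rev_id, result in scores.items()
    }


if __name__ == "__main__":
    print(build_score_url("enwiki", "reverted", [4567890, 4567892]))
    probabilities = revert_probabilities(SAMPLE_RESPONSE)
    # List revisions from most to least likely to be reverted.
    for rev_id, p in sorted(probabilities.items(), key=lambda kv: kv[1], reverse=True):
        print(rev_id, round(p, 3))
```

A patrolling tool could use a ranking like this to review the most suspicious revisions first; the point at which a probability counts as "damaging" is a design choice left to the consumer.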

In parallel with this development work, we have been writing documentation and specifications for our next stage of work. For an overview of our documentation, see Research:Revision scoring as a service. For information about our sub-projects, see also:

Finally, we have engaged in public outreach about the project. We presented the project at the Wikimedia Foundation's metrics meeting in January. We also ran articles on the project in the English Wikipedia Signpost (en:Wikipedia:Wikipedia Signpost/2015-02-18/Special report), the Portuguese Wikipedia technical village pump (pt:Wikipédia:Café dos programadores#Serviço de pontuação de edições) and the Persian Wikipedia. We're currently producing a translation for the Turkish Wikipedia signpost as well.

Midpoint outcomes

Mockup of revision coder interface

Our primary goal of constructing high-signal damage scorers has been achieved. We were also able to get the scoring service hosted via a web API on Wikimedia Labs -- which was one of our end goals. However, we were not able to stand up a revision coder service as planned, so we have pushed that goal back to our final deliverables. Despite this setback, we've made substantial progress iterating on designs and discussing the technical considerations.

Finances

We have spent our funds as planned and have not requested or received additional resources.

Learning

We have been following a lightweight Scrum process with weekly sprints, and it has been serving us very well. If anything isn't going well, it's that we sometimes let weekly reports fall behind by a week. Since we are using Trello to manage our work, we haven't had any trouble "remembering" what was done in past weeks.

What are the challenges

  • Time zones are difficult, but we make do very well.
  • It took a little time to get everyone up to speed and submitting pull requests to the central repositories.

What is working well

Next steps and opportunities

  • We'll be constructing a revision coder service on Labs, with which the revision handcoder gadget will communicate.
  • We'll be constructing new classifiers based on human assessments obtained by means of the revision coder.
  • We are investigating the possibility of including Farsi in our language library.

Grantee reflection

  • It's difficult to manage a full-time job and volunteer work on a project like this that has deadlines and documentation expectations. Luckily, it seems that my work on this project has been sanctioned during office hours. Regrettably, none of my other responsibilities have been reduced, so I still end up working on this in the evenings and on weekends. Otherwise, work on this project has been a true joy. :) --EpochFail (talk) 00:53, 12 March 2015 (UTC)
  • This is something I have been looking for since 2012, when I first attempted to develop AI tools for Wikipedia. Back then I had to buy a spare drive to download something like 550 GB of compressed dumps, and I hit my monthly internet limit twice in a row just to be able to download them. I also had data corruption issues, but I was able to recover from them with the aid of apergos (kudos to her again). This project intends to eliminate such difficulties that researchers face when dealing with the massive size of Wikipedia, and it has been most pleasant to work on - not just at an individual level but as a group as well, which makes this a blast for me. -- とある白い猫 chi? 12:02, 20 March 2015 (UTC)
  • It is really cool to work with people with other experiences and backgrounds, and to collaborate on code development for a project like this. I'm having the chance to put into practice the machine learning theory I was learning last semester. Also, code review is providing a unique opportunity for me to become a better Python programmer. Helder 19:14, 20 March 2015 (UTC)