Grants:TPS/Ladsgroup/WWW 2017/Report

From Meta, a Wikimedia project coordination wiki

Event name

WWW conference 2017

Participant connections

I presented a poster at the Wiki Workshop at the WWW conference, which was held in Perth, Australia. Around 55 people attended the workshop, and I explained my poster to at least ten of them. During the conference, the poster was also displayed in the dining room, and I saw people reading it.


I am covering all of my research outcomes here and sending an email with a link to this page to research-internal or research-l. This way it should reach as many people as possible.

I attended several presentations and noted down ideas that could be useful for Wikimedia.

  • There are now some really good NLP tools for measuring politeness. These tools open up a lot of opportunities, such as finding people who are mean to others.
  • We know that there are maps of the geo-coordinates of places. They are not as useful as they should be for finding places where Wikipedia lacks data. One measure that has been suggested is the ratio of articles to population in any given area.
  • Google's AI for classifying spam email uses an LSTM to output a feature for a simple machine learning classifier. We could use the same approach to boost our efficiency without needing to build PCFG features, which increase our memory footprint so much that they cannot be used in production. They used a "k-dependent Markov chain"; I have no idea what that is, but I'm learning.
  • There has been research on how links in Wikipedia get clicked. It shows that only 4% of links get clicked. A more in-depth analysis could help us change our MoS so that articles contain only useful links, reducing the distraction they cause (especially with the page previews feature around).
  • Some investigations into how sock puppets work have been carried out (on other social platforms, not on Wikipedia), and they are really interesting. We could use these findings to make sock puppetry harder. In short: people usually make two socks, and sock puppets tend to agree with each other (except in 10% of cases!). They write shorter sentences. They start fewer discussions, but they write more self-centered posts and address others directly. Sock puppets tend to look alike and to have an identity rather distant from the puppet master's, i.e. "good sock/bad sock" is not common.
  • Using classifier scores to order things is a very common practice. I think we can use it to find imminent problems in Wikidata/Wikipedia. Some tweaks are needed to filter out items that have already been reviewed, but I think I can do that.
  • It was suggested to me that I use bad words from some languages (English, Spanish, etc.) as a predictive signal for scoring edits in Wikidata.
  • The paper "Wikipedia Verification Check: A Chrome Browser Extension" builds an API, but the authors are running into technical problems. I suggested that they migrate to Wikimedia Labs. I don't know whether they want to do that or not.
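
The idea of ordering things by classifier score while filtering out already-reviewed items can be sketched roughly as below. This is only a minimal illustration, not an existing tool: the ScoredEdit type, its field names, the review_queue function, and the sample scores are all hypothetical and made up for the example.

```python
from dataclasses import dataclass


@dataclass
class ScoredEdit:
    rev_id: int            # hypothetical revision identifier
    damaging_score: float  # hypothetical classifier score in [0, 1]
    reviewed: bool         # True if someone has already reviewed the edit


def review_queue(edits, limit=10):
    """Return up to `limit` unreviewed edits, highest score first."""
    # Drop edits that were already reviewed, then rank the rest by score.
    pending = [e for e in edits if not e.reviewed]
    pending.sort(key=lambda e: e.damaging_score, reverse=True)
    return pending[:limit]


# Made-up sample data, only to show the ordering.
edits = [
    ScoredEdit(101, 0.12, False),
    ScoredEdit(102, 0.97, True),   # highest score, but already reviewed
    ScoredEdit(103, 0.85, False),
    ScoredEdit(104, 0.40, False),
]

queue = review_queue(edits, limit=2)
print([e.rev_id for e in queue])
```

With this sample data, edit 102 is skipped even though it has the highest score, because it was already reviewed; the queue then starts with the highest-scoring unreviewed edit.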

Some useful things I learned during the conference:

  • More than 50% of the content of Wikipedia is written by only 0.04% of its users.
  • 31% of articles created or edited in May 2014 were never visited in June.
  • There is something called second-order bias. Simply put, we have our biases, and when we build an online collaboration (such as Wikipedia) we perpetuate those biases; when we then use that collaboration, it intensifies them. (Note: I don't think we have ever tried to tackle such bias in the Wikimedia environment, but it would be a good idea.)
  • There is now an online tool that checks whether a source is reliable, based on refutations of statements in Wikipedia that cite it as a source. Here's the paper.


  • It was all pre-paid by WMF so nothing I guess?

Amount left over

Anything else