Grants talk:TPS/User:とある白い猫/Presenting at PAN Lab of CLEF 2011

From Meta, a Wikimedia project coordination wiki

Questions

Thank you for this request. Before we move forward, we would like you to:

  1. provide a little more detail on the purpose and expected results of your participation
  2. elaborate on the status of "VandalSense" now, and in particular on its standing and significance within the community (as far as you know).
  3. If you can point to some discussion of VandalSense, that would be good too.
  4. be a little more specific on how participation in this event is going to benefit the development of this tool.
  5. The hotel rate seems a little on the expensive side. Perhaps you can secure some cheaper accommodation in Amsterdam?

Thanks. Ijon 18:34, 30 August 2011 (UTC)

  1. I feel more tools are needed to assist Recent Changes patrollers. While I am just a master's student with a long way to go, I believe that experts in artificial intelligence are not necessarily also experienced Wikipedians. Analysis of data is certainly imperative, but I feel my experience as a Wikipedian would have a positive impact on the development of other automated vandalism detection tools through the discussions at the conference. I'll cover the impact the conference would have on VandalSense in response to your question #4 below. Also, the conference does not cover only automated vandalism detection. The other main topic is "author identification". While this wouldn't be useful for vandalism detection as such, it could be used to detect copyright violations on Wikipedia as well as accounts that are returning sockpuppets of disruptive users. I didn't include this in the request page as it does not directly impact the development of VandalSense; however, if feasible, the capability to detect returning vandals would certainly increase the accuracy and overall usefulness of the tool.
  2. While the tool is available for public use, I am not comfortable with it being used to assess live edits, as it is at an early stage of its development. For academic purposes, accuracy as high as 70% could be deemed acceptable; for live Wikipedia use, however, I'd be more comfortable with accuracy of 95% and up. One problem was that the learning set was about 1000 edits per language, which wasn't nearly enough for the kind of results I hope to get. I'll email you the URL of the tool, which is on a test server (it cannot yet handle too many people using it at once).
  3. Per reason #2, the tool is currently unknown as far as the general community is concerned. Discussions so far have been private and limited. I just feel it is too early for a detailed discussion of the tool.
  4. There are many challenges in automated vandalism detection, as the nature of edits can be unpredictable. For instance, even very offensive or racist words can be welcome in some articles (such as articles discussing the origin of such words) while remaining disruptive on the rest of the site. Also, the reports for the individual projects of groups participating in the workshop are capped at 10 pages or fewer (depending on the project), which leaves out a lot of information and makes it difficult to determine how other people tackled such problems. Discussions would reveal the strengths and weaknesses of VandalSense, giving me the feedback I need from people with a deep understanding of AI to improve it. I am particularly interested in the different methodologies others have used to overcome setbacks such as the one mentioned above.
  5. Unfortunately, Amsterdam is a rather expensive city from what I can tell. The hotel I picked is the one recommended by the organizers, it is also the conference venue, and I am already getting the discounted rate listed on the official conference site. The daily price is 120 euros (without tax), which adds up to 480 euros, and with 5% tax that ends up at a total of 500,76 euros. The other alternatives the venue recommends are similarly priced (except one) and are kilometers away. Staying at the venue has the added benefit of allowing more discussion of the topics covered. So, unfortunately, I lack a cheaper alternative that isn't kilometers away.
-- とある白い猫 chi? 23:25, 30 August 2011 (UTC)

Funded by WMDE

The WMF referred the request to Wikimedia Deutschland (WMDE), which agreed to fund it. This grant request page can still serve as a record of that decision and as the place for the eventual report. Ijon 20:58, 16 September 2011 (UTC)