Grants:IdeaLab/Use Facebook's DeepText to classify if a new post has potential harassment

Use Facebook's DeepText to classify if a new post has potential harassment
Deep neural networks could be trained to classify text and quickly score whether a new post contains potential harassment.
Created on 10:38, 3 June 2016 (UTC)

Project idea

What is the problem you're trying to solve?

New posts may contain harassment or other inappropriate content, and it is hard for human reviewers alone to review and block it, especially if such content is generated by a bot. An AI filter could at least keep pace with a bot: it could identify possibly inappropriate content immediately and suggest it for human review before posting.

What is your solution?

Train an AI based on DeepText to classify potentially inappropriate content. Such content is not allowed to be posted without further approval by a human reviewer. All content identified as appropriate is allowed to be posted immediately.


  1. Set up an automated AI first line of defense (a kind of immune system) against inappropriate content with the help of an already existing technology (deep learning for text).
  2. Partner with the hosts of the technology for a win-win solution. The host contributes the engine and support. In return it gets recognition (advertising) as "The AI which keeps wiki's health!", plus a good volume of traffic to train the system.
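The gating step described above (score a post, hold it for human review if it looks inappropriate, otherwise publish it) can be sketched roughly as follows. This is only an illustration under stated assumptions: DeepText has no public API that this page refers to, so `toy_harassment_score`, `FLAGGED_TERMS`, `route_post`, and the `0.2` threshold are all hypothetical placeholders. A real deployment would swap in a trained classifier as `score_fn`.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical toy word list; a trained classifier would replace this entirely.
FLAGGED_TERMS = {"idiot", "stupid"}


def toy_harassment_score(text: str) -> float:
    """Placeholder scorer (NOT DeepText): fraction of words on the toy list."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    if not words:
        return 0.0
    hits = sum(1 for w in words if w in FLAGGED_TERMS)
    return hits / len(words)


@dataclass
class Decision:
    action: str   # "publish" or "hold_for_review"
    score: float  # score returned by the classifier, kept for the reviewer


def route_post(
    text: str,
    score_fn: Callable[[str], float] = toy_harassment_score,
    threshold: float = 0.2,  # assumed cut-off; would need tuning on real data
) -> Decision:
    """Hold a post for human review if its score reaches the threshold."""
    score = score_fn(text)
    action = "hold_for_review" if score >= threshold else "publish"
    return Decision(action=action, score=score)


if __name__ == "__main__":
    for post in ["Thanks for the helpful edit", "You are an idiot"]:
        print(post, "->", route_post(post).action)
```

Passing the scorer in as a parameter keeps the routing logic independent of whichever model is eventually used, and the threshold controls how much content is sent to human reviewers.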

Get Involved

About the idea creator



  1. I don't think we need to worry about harassment by bot, or rather, if that ever happened, both the bot and its operator would need to be blocked. We do have abuse filters (known as edit filters on the English Wikipedia), and if someone can specify it I'm sure we could have more filters for harassment. The trick is to define the abuse simply enough to code in a filter. Where there may be scope for the WMF to do something is in the hardware that processes those filters in real time; my understanding is that on English at least we are at or close to the point where filters are slowing down the saving of all edits. If so, increasing the hardware capacity that processes them would increase the capacity for anti-harassment filters, including ideally an AI filter that had been trained on previous harassment to spot and stop further harassment. WereSpielChequers (talk) 11:14, 3 June 2016 (UTC)
  • Easy and quick solution, isn't it? Not for everything, but for some forms at least it should work. Rikuti (talk) 11:50, 3 June 2016 (UTC)
  • I think AI and automation are definitely the way to go. It can be hard to achieve, but if it works, problems of this kind will be a thing of the past. Not only that: editors will have to spend less time on community issues, and more quality articles can be created. Biels (talk) 16:15, 3 June 2016 (UTC)
  • I want to see this working. It's a good idea. Victorvic1 (talk) 21:18, 13 June 2016 (UTC)
  • Yes, the WMF has a very good AI team (e.g. the ORES quality ratings). If it could be done as effectively and more simply with bots, then so much the better. As for slowing down the save function, it would only be needed on talk pages. This can be integrated with the similar proposals below. Smallbones (talk) 23:25, 13 June 2016 (UTC)
  • If automation could at least help flag suspect content, it would be an enormous time-saver. Rrusson (talk) 15:37, 25 June 2016 (UTC)

Expand your idea

Would a grant from the Wikimedia Foundation help make your idea happen? You can expand this idea into a grant proposal.


Similar Ideas

Maybe we should merge these ideas.