Grants:IdeaLab/BadWords detector for AbuseFilter/Technical description


Technical description for a BadWords detector for AbuseFilter, although "foul-word detector" might be more accurate. This is a free rewrite of an idea from nowiki.

The problem this tries to solve is that foul words are extremely hard to detect in some languages because those languages are agglutinative and fusional. If a language uses single non-joining morphemes, it is a fairly easy task to assign them values describing their foulness. Very often the morphemes are not changed much, so they can be extracted and assigned values individually, or values can be assigned to the overall constructs.

Agglutinative languages allow constructs like "hestpeis"/"hæstpeis" and "hestkuk"/"hæstkuk" (literally "dick of a horse") to be used as new foul words, even if the word isn't defined as such in the language.[1] In this case the word comes from the dialect spoken in Nordland in Norway. Some search engines have serious problems understanding such words.[2]

Similar foul words can be built, like "måspeis"/"måskuk" (literally "dick of a seagull") and "hestskjit"/"hæstskjit" (literally "horse manure"). The previous foul words can be broken down into some kind of animal word plus a genital or fecal descriptive word. Because the set of possible constructs is very large, it is difficult to assign correct tags to complete words.

Basic algorithm

Assume that we can describe a limited set of words, for example animal names, using JSON (a nice format because many tools are already in place):

"animal": [
  "hest", "hæst", "mås"
]

We can also describe genital names

"genital": [
  "pikk", "peis", "kuk"
]

Or fecal names

"fecal": [
  "dritt", "skjit"
]

In addition we define rules for how we join words from those sets

"rule": [
  "<animal><genital>",
  "<animal><fecal>"
]

It is fairly easy to see what is going on here: this is a description of a state engine where our morphemes form states and the rules describe state transitions. Such a state engine can be built as a finite-state machine (a finite-state automaton), and only when it has accepted the complete pattern has it detected a foul word. This works for plainly agglutinative languages, but fusional languages pose additional problems.
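The state engine above can be sketched by compiling each rule into a regular expression; a regex engine is itself a finite-state automaton, so a word is flagged only when the complete pattern is accepted. The morpheme sets and rules come from the JSON above, while the function names are illustrative.

```python
import re

# Morpheme sets and rules, mirroring the JSON description above.
LEXICON = {
    "animal": ["hest", "hæst", "mås"],
    "genital": ["pikk", "peis", "kuk"],
    "fecal": ["dritt", "skjit"],
}
RULES = ["<animal><genital>", "<animal><fecal>"]

def rule_to_regex(rule):
    """Expand each <category> into an alternation over its morphemes."""
    def expand(m):
        return "(?:" + "|".join(map(re.escape, LEXICON[m.group(1)])) + ")"
    return re.compile("^" + re.sub(r"<(\w+)>", expand, rule) + "$")

MATCHERS = [rule_to_regex(r) for r in RULES]

def is_foul(word):
    """A word is flagged only when some rule accepts it completely."""
    return any(m.match(word) for m in MATCHERS)
```

With this sketch `is_foul("hestkuk")` and `is_foul("hæstskjit")` hold, while the individual morphemes "hest" and "peis" are not flagged on their own.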

Extended algorithm

In addition to plain chaining of words it is necessary to do limited inflection, usually only by applying affix rules. Such rules could look like:

"plural": [
  {
    "type": "suffix",
    "match": "([sxc])$",
    "replace": "$1'"
  },
  {
    "type": "suffix",
    "match": "([sxc])$",
    "replace": "$1a"
  },
  {
    "type": "suffix",
    "match": "([sxc])$",
    "replace": "$1e"
  }
]

The previous rules will then turn into something like

"rule": [
  "<plural><animal><genital></plural>",
  "<plural><animal><fecal></plural>"
]

This will not be a plain finite-state machine (more accurately, a finite-state automaton) but a finite-state transducer. In the example it matches on the state diagram and tries to build the source, but it is easier to match against the source while building a normalized form, and then use that as input to the previous FSM.
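The easier direction described here, normalizing the source before matching, can be sketched as suffix stripping: run the plural rules in reverse and collect every candidate stem, then feed each candidate to the previous FSM. The suffix patterns below are hypothetical reversals of the affix rules above, not rules from the proposal.

```python
import re

# Hypothetical reversals of the plural affix rules: strip a trailing
# "a" or "e" after a consonant to recover a candidate stem.
SUFFIX_RULES = [
    (re.compile(r"(.+[kst])a$"), r"\1"),
    (re.compile(r"(.+[kst])e$"), r"\1"),
]

def normalize(word):
    """Undo inflection: return the word plus every candidate stem."""
    candidates = {word}
    for pattern, repl in SUFFIX_RULES:
        if pattern.match(word):
            candidates.add(pattern.sub(repl, word))
    return candidates
```

For example, `normalize("hestkuka")` yields both the inflected form and the stem "hestkuk"; if any candidate is accepted by the FSM, the original word is flagged. A full implementation would be a transducer rather than this two-pass sketch.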

Learned weights

The learned weights can in many cases be calculated by hand, but they must somehow be assigned to the rules. To make this possible our simple morphemes could be turned into objects. Often the probability that an individual morpheme is used as a foul word is pretty low; it is the merged word that is used as a foul word. Because of this we switch between a few preset values. The previous rules would be expressed as:

"rules": [
  {
    "rule": "<animal><genital>",
    "weights": {
      "foul": 0.9
    }
  },
  {
    "rule": "<animal><fecal>",
    "weights": {
      "foul": 0.9
    }
  }
]

In the previous example the weight for "foul" is added as one dimension; there could be other such dimensions, and they could be learned by various techniques. Learning more accurate weights would be similar to other techniques for sentiment analysis.
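Scoring against the weighted rules can be sketched as follows: a word's score along a dimension is the highest weight among the rules that accept it. The lexicon and weights are taken from the JSON above; the helper names are illustrative.

```python
import re

LEXICON = {
    "animal": ["hest", "hæst", "mås"],
    "genital": ["pikk", "peis", "kuk"],
    "fecal": ["dritt", "skjit"],
}
# Rules carry per-dimension weights, as in the JSON above.
RULES = [
    {"rule": "<animal><genital>", "weights": {"foul": 0.9}},
    {"rule": "<animal><fecal>", "weights": {"foul": 0.9}},
]

def compile_rule(rule):
    """Expand <category> placeholders into alternations of morphemes."""
    def expand(m):
        return "(?:" + "|".join(map(re.escape, LEXICON[m.group(1)])) + ")"
    return re.compile("^" + re.sub(r"<(\w+)>", expand, rule) + "$")

def score(word, dimension="foul"):
    """Return the highest weight among rules that accept the word."""
    best = 0.0
    for entry in RULES:
        if compile_rule(entry["rule"]).match(word):
            best = max(best, entry["weights"].get(dimension, 0.0))
    return best
```

Here `score("hestkuk")` gives 0.9 while a bare morpheme like "hest" scores 0.0, matching the idea that the merged word, not its parts, carries the foulness.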

Some of the morphemes can in fact be foul words by themselves. Examples from the previous sets are "pikk" and "kuk" from the "genital" section, while "peis" on its own can't be classified as a foul word. This leads to the idea that each morpheme can be given weights, and that morphemes without weights simply do not act on their own and must be part of a rule.
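Per-morpheme weights could be sketched as an optional weight object on each morpheme; a morpheme without one contributes nothing on its own. The 0.8 values below are illustrative assumptions, not weights from the proposal.

```python
# Hypothetical per-morpheme weights: morphemes with a weight object are
# foul on their own; morphemes without one only count inside a rule.
MORPHEMES = {
    "pikk": {"foul": 0.8},
    "kuk": {"foul": 0.8},
    "peis": None,   # not a foul word by itself
    "hest": None,
}

def standalone_score(word, dimension="foul"):
    """Score a bare morpheme; unweighted morphemes score zero."""
    weights = MORPHEMES.get(word)
    if weights is None:
        return 0.0
    return weights.get(dimension, 0.0)
```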

Notes

Moderate morphemes/words

In some cases there are more moderate morphemes that aren't offensive in nature. That could be solved with similar rules, but with one section for offensive rules and one for moderate rules. In a moderate section, words like "møkk" or "lort" (literally "feces") could be added, and weights set or learned separately.
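The split into offensive and moderate sections could be sketched as parallel rule lists with independent weights; the moderate weight of 0.4 below is an illustrative assumption.

```python
# Hypothetical offensive/moderate split; each section keeps its own
# morpheme set and its own weights, so they can be learned separately.
SECTIONS = {
    "offensive": {"fecal": ["dritt", "skjit"]},
    "moderate": {"fecal": ["møkk", "lort"]},
}
RULES = [
    {"rule": "<animal><fecal>", "section": "offensive", "weights": {"foul": 0.9}},
    {"rule": "<animal><fecal>", "section": "moderate", "weights": {"foul": 0.4}},
]
```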

References

Literature

  • Jurafsky, Daniel; Martin, James H. (2000). Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. Prentice Hall. ISBN 9780135041963.