Research:Automated classification of edit types/Taxonomy

From Meta, a Wikimedia project coordination wiki

This page documents a complete and inclusive taxonomy of edit types. The goal is to capture all potential change types that describe editing activity on Wikipedia. A practical subset will be used for the automated classification system, but identifying that subset is left to a separate discussion.


These classes describe "what" was done during an edit (as opposed to "why").

Mechanical operations

These types of changes can be detected with simple regular expressions.

  • wiki links
    • insert/delete
    • modify
      • disambiguate
  • inter-wiki links
    • insert/modify/delete
  • external links
    • insert/modify/delete
  • category
    • insert/modify/delete
  • headers
    • insert/modify/delete
  • table
    • insert/modify/delete
  • image
    • insert/modify/delete
  • references
    • insert/modify/delete
  • content move / refactor
  • redirect
  • cleanup
    • punctuation
      • insert/delete
    • whitespace
      • insert/delete
    • formatting -- css/style/bold/italics
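
As an illustration of the regular-expression approach, the following is a minimal sketch that compares the old and new wikitext of a revision and reports insert/modify/delete labels for a few of the element types above. The patterns, the `extract` and `classify_mechanical` names, and the rule "something of the same type was both added and removed, so call it a modify" are all assumptions made for this example, not part of the taxonomy. Real wikitext has edge cases (nested links in image captions, templates, `nowiki` sections, interwiki prefixes) that these patterns do not handle.

```python
import re
from collections import Counter

# Hypothetical, simplified patterns for some element types in the list above.
PATTERNS = {
    "category": re.compile(r"\[\[\s*Category\s*:[^\]|]+(?:\|[^\]]*)?\]\]", re.IGNORECASE),
    "image": re.compile(r"\[\[\s*(?:File|Image)\s*:[^\]]+\]\]", re.IGNORECASE),
    # Any other [[...]] link; interwiki links are not separated out here.
    "wiki_link": re.compile(r"\[\[(?!\s*(?:Category|File|Image)\s*:)[^\]]+\]\]", re.IGNORECASE),
    # Bracketed external links only; bare URLs are ignored.
    "external_link": re.compile(r"(?<!\[)\[https?://[^\s\]]+[^\]]*\]"),
    "header": re.compile(r"^(={2,6})[^=\n].*?\1[ \t]*$", re.MULTILINE),
    "reference": re.compile(r"<ref\b[^>/]*>.*?</ref>|<ref\b[^>]*/>", re.IGNORECASE | re.DOTALL),
}


def extract(text):
    """Count every (element_type, matched_text) pair found in the text."""
    found = Counter()
    for name, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            found[(name, match.group(0).strip())] += 1
    return found


def classify_mechanical(old_text, new_text):
    """Return labels such as 'reference:insert' for one revision."""
    before, after = extract(old_text), extract(new_text)
    inserted = after - before   # elements present only in the new text
    deleted = before - after    # elements present only in the old text
    labels = set()
    for name in PATTERNS:
        n_ins = sum(c for (kind, _), c in inserted.items() if kind == name)
        n_del = sum(c for (kind, _), c in deleted.items() if kind == name)
        if n_ins and n_del:
            labels.add(f"{name}:modify")   # crude assumption, see lead-in
        elif n_ins:
            labels.add(f"{name}:insert")
        elif n_del:
            labels.add(f"{name}:delete")
    return labels


if __name__ == "__main__":
    old = "== History ==\nParis is in [[France]].\n[[Category:Cities]]\n"
    new = "== History ==\nParis is in [[France]].<ref>Some source</ref>\n[http://example.org Site]\n"
    print(sorted(classify_mechanical(old, new)))
```

Comparing multisets of matched elements, rather than running patterns over a line diff, keeps the sketch short, but it cannot tell a moved element from an untouched one. Detecting "content move / refactor" would need a real diff.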

Abstract/probabilistic operations

These classes can't be detected trivially with regular expressions; they would require some form of machine prediction.

  • Grammar (word-level)
    • punctuation, whitespace
    • spelling error, typo
    • capitalization
    • tense change
  • Rephrase (word-level)
    • synonym
    • remove redundant words
  • Sentence (sentence-level)
    • insert/modify/delete (substantive)
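
Since these classes call for prediction rather than pattern matching, one plausible first step is to turn an edit into numeric features that a classifier could consume. The sketch below is only an assumption about what such features might look like: the function name `word_level_features`, the feature names, and the 0.8 similarity threshold are invented for illustration. It uses Python's standard `difflib` to align the words of the old and new text. The counts are signals, not labels. A near-match replacement may be a spelling fix, a tense change, or something else entirely.

```python
import difflib


def word_level_features(old_text, new_text, near_match_threshold=0.8):
    """Compute rough word-level signals for one edit (hypothetical feature set)."""
    old_words, new_words = old_text.split(), new_text.split()
    features = {
        # Text differs, but the word sequence is identical.
        "whitespace_only": old_text != new_text and old_words == new_words,
        "capitalization_changes": 0,
        "near_match_replacements": 0,   # possible typo, spelling or tense fix
        "other_replacements": 0,        # possible synonym or rephrase
        "words_inserted": 0,
        "words_deleted": 0,
    }
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "replace" and (i2 - i1) == (j2 - j1):
            # Same number of words on both sides: compare them pairwise.
            for old_word, new_word in zip(old_words[i1:i2], new_words[j1:j2]):
                if old_word.lower() == new_word.lower():
                    features["capitalization_changes"] += 1
                elif difflib.SequenceMatcher(a=old_word, b=new_word).ratio() >= near_match_threshold:
                    features["near_match_replacements"] += 1
                else:
                    features["other_replacements"] += 1
        elif tag != "equal":
            # Unequal replace, pure insert, or pure delete.
            features["words_deleted"] += i2 - i1
            features["words_inserted"] += j2 - j1
    return features


if __name__ == "__main__":
    print(word_level_features(
        "The goverment of france was formed.",
        "The government of France was formed.",
    ))
```

Separating "synonym" from "substantive modify" would need more than string similarity (for example a lexical resource or a trained model), which is why the sketch stops at counting.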


These classes describe "why" an edit was made. They usually amount to subjective applications of policy.

  • NPOV
  • Vandalism
  • Notable?
  • External link policy
  • Manual of style
  • New topic (article creation)

Complex operations

These classes describe changes that are part of a multi-edit operation.

  • Merge
  • Archiving


These classes describe actions relevant to a discussion.

  • New topic
  • Reply
  • !Vote (Support/oppose)
  • Comment signing
  • Suggestion
  • WP tagging/assessment
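
Some of the discussion actions above leave recognisable traces in wikitext, so a rough first pass could again rely on patterns. The sketch below is a hedged illustration only. It assumes English Wikipedia talk-page conventions (a signature timestamp ending in "(UTC)", bolded "Support"/"Oppose" at the start of a !vote, leading colons for an indented reply), and the function name `discussion_labels` is hypothetical. "Suggestion" and "New topic" are not covered.

```python
import re

# Assumed conventions from English Wikipedia talk pages.
TIMESTAMP = re.compile(r"\d{2}:\d{2}, \d{1,2} [A-Z][a-z]+ \d{4} \(UTC\)")
VOTE = re.compile(r"^[*#:\s]*'''\s*(support|oppose)\b", re.IGNORECASE)


def discussion_labels(added_line):
    """Guess discussion-action labels for one line added to a talk page."""
    labels = set()
    if TIMESTAMP.search(added_line):
        labels.add("comment signing")
    if VOTE.match(added_line):
        labels.add("!vote")
    if added_line.startswith(":"):
        labels.add("reply")   # indentation is only a hint of a reply
    return labels


if __name__ == "__main__":
    line = ":'''Support''' per nom. [[User:Example|Example]] 12:34, 5 January 2012 (UTC)"
    print(sorted(discussion_labels(line)))
```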