Research:Automated classification of edit types/Taxonomy
This page documents a complete and inclusive taxonomy of edit types. The goal is to capture all potential change types that describe editing activity on Wikipedia. A practical subset will be used for the automated classification system, but we leave the identification of that subset to a separate discussion.
These classes describe "what" was done during an edit, as opposed to "why".
These types of changes can be detected with simple regular expressions:
- wiki links
- inter-wiki links
- external links
- content move / refactor
- formatting -- css/style/bold/italics
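As a rough illustration of regex-based detection for the link and formatting classes above, the following sketch counts each kind of markup before and after an edit and reports which counts changed. The patterns, the `INTERWIKI_PREFIXES` set, and the function names are assumptions made for this example, not part of the taxonomy. Real wikitext has many edge cases (templates, `<nowiki>`, files, categories) that these patterns ignore, and content moves are not covered.

```python
import re
from collections import Counter

# Illustrative subset of language prefixes (an assumption for this sketch).
INTERWIKI_PREFIXES = {"de", "fr", "es", "ja"}

# [[Target]] or [[Target|label]]
LINK_RE = re.compile(r"\[\[([^\[\]|]+)(?:\|[^\[\]]*)?\]\]")
# [http://example.org optional label], but not the inner bracket of [[...]]
EXTERNAL_RE = re.compile(r"(?<!\[)\[https?://[^\s\]]+[^\]]*\]")
# ''italic'', '''bold''', and a few inline HTML styling tags
FORMAT_RE = re.compile(r"'{2,}|<(?:b|i|span|div|font)\b[^>]*>", re.IGNORECASE)


def count_features(wikitext):
    """Count occurrences of each regex-detectable markup class."""
    counts = Counter()
    for match in LINK_RE.finditer(wikitext):
        target = match.group(1).strip()
        prefix, sep, _ = target.partition(":")
        if sep and prefix.lower() in INTERWIKI_PREFIXES:
            counts["inter-wiki link"] += 1
        else:
            counts["wiki link"] += 1
    counts["external link"] += len(EXTERNAL_RE.findall(wikitext))
    counts["formatting"] += len(FORMAT_RE.findall(wikitext))
    return counts


def changed_types(before, after):
    """Return the classes whose counts differ between two revisions."""
    old, new = count_features(before), count_features(after)
    return {name for name in set(old) | set(new) if old[name] != new[name]}


before = "Berlin is a city."
after = "'''Berlin''' is a [[city]] in [[de:Deutschland]]. See [http://example.org site]."
print(sorted(changed_types(before, after)))
```

Comparing counts is a deliberate simplification: an edit that replaces one wiki link with another leaves the count unchanged and would go undetected, so a fuller version would compare the link targets themselves.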
These classes can't be detected trivially with regular expressions; they would require some form of machine prediction.
- Grammar (word-level)
  - punctuation, whitespace
  - spelling error, typo
  - tense change
- Rephrase (word-level)
  - remove redundant words
- Sentence (sentence-level)
  - insert/modify/delete (substantive)
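Before any machine prediction is applied, a word-level diff can isolate the tokens an edit actually touched. The sketch below uses Python's standard `difflib` to do that and applies one simple heuristic: an edit whose changed tokens contain no word characters is a candidate for the punctuation/whitespace class. The tokenizer and function names are hypothetical. Distinguishing typo fixes, tense changes, and rephrasing would still need a trained model on top of features like these.

```python
import difflib
import re

# Split into words, single punctuation marks, and whitespace runs.
TOKEN_RE = re.compile(r"\w+|[^\w\s]|\s+")
WORD_RE = re.compile(r"\w+")


def tokenize(text):
    return TOKEN_RE.findall(text)


def token_changes(before, after):
    """Return (operation, removed_tokens, added_tokens) for each changed span."""
    a, b = tokenize(before), tokenize(after)
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    changes = []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op != "equal":
            changes.append((op, a[i1:i2], b[j1:j2]))
    return changes


def is_punctuation_or_whitespace_only(before, after):
    """True if the edit changed something, but no word tokens."""
    touched = [
        tok
        for _, removed, added in token_changes(before, after)
        for tok in removed + added
    ]
    return bool(touched) and not any(WORD_RE.fullmatch(tok) for tok in touched)


print(is_punctuation_or_whitespace_only("Hello world", "Hello, world"))
print(is_punctuation_or_whitespace_only("Hello world", "Hello there world"))
```

The changed token spans returned by `token_changes` could also serve as input features for a classifier covering the remaining word-level and sentence-level classes.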
These classes describe "why" an edit was made. They usually amount to subjective applications of policy.
- External link policy
- Manual of style
- New topic (article creation)
These classes describe changes that are part of a multi-edit operation.
These classes describe actions relevant to a discussion.
- New topic
- !Vote (Support/oppose)
- Comment signing
- WP tagging/assessment