Research:Revision scoring as a service/Word lists/vo

From Meta, a Wikimedia project coordination wiki


ISO code: vo
Language: Volapük (Wikipedia)
Generated list: 250
Badwords: -
Informal words: -
Stopwords: -
Dictionary: -
Stemmer: -
Contact person: See: Word lists requested
Wiki labels interface: no
Wiki labels forms: no
Wiki labels campaign: no
Needs: -

Generated list [1]

Words in the generated list commonly appear in reverted revisions but not in others. This list is generated using a TF-IDF approach.

  1. about
  2. according
  3. african
  4. age
  5. ain
  6. aise
  7. all
  8. allswingersclubs
  9. alone
  10. american
  11. any
  12. apps
  13. are
  14. area
  15. arrondissement
  16. arrondissementsgard
  17. article
  18. articles
  19. asian
  20. auto
  21. average
  22. background
  23. bauchecommunefran
  24. begriffsklärung
  25. below
  26. belödanis
  27. berkeley
  28. bezirk
  29. big
  30. binon
  31. black
  32. bold
  33. border
  34. bot
  35. both
  36. bots
  37. botspam
  38. category
  39. cdp
  40. cellpadding
  41. cellspacing
  42. children
  43. city
  44. class
  45. clear
  46. com
  47. communities
  48. coor
  49. couples
  50. cutie
  51. delete
  52. demographics
  53. density
  54. departamant
  55. departament
  56. departamentas
  57. departement
  58. departementet
  59. desambigua
  60. designated
  61. dipartiment
  62. disambiguation
  63. div
  64. dms
  65. dona
  66. donaziläks
  67. doorverwijspagina
  68. doton
  69. efefef
  70. ela
  71. every
  72. external
  73. families
  74. family
  75. female
  76. females
  77. ffdddd
  78. font
  79. for
  80. franciae
  81. fransäna
  82. fterran
  83. fuck
  84. fucking
  85. garas
  86. gard
  87. gardon
  88. geban
  89. geographic
  90. geography
  91. geschützt
  92. greatthings
  93. had
  94. haha
  95. harald
  96. harrison
  97. hart
  98. has
  99. have
  100. hintergrundfarbe
  101. hispanic
  102. homonymie
  103. household
  104. householder
  105. households
  106. housing
  107. http
  108. husband
  109. ilmap
  110. image
  111. important
  112. including
  113. income
  114. individuals
  115. infoboxcommunedefranc
  116. information
  117. interwikis
  118. islander
  119. its
  120. kaed
  121. kantons
  122. koordinats
  123. krichel
  124. labon
  125. land
  126. languedoc
  127. latino
  128. license
  129. light
  130. line
  131. links
  132. living
  133. located
  134. location
  135. low
  136. lödana
  137. made
  138. makeup
  139. males
  140. margin
  141. marriage
  142. married
  143. median
  144. mes
  145. mile
  146. montgomery
  147. more
  148. motherfuckers
  149. msmap
  150. name
  151. native
  152. nbsp
  153. nci
  154. net
  155. non
  156. none
  157. novul
  158. nuvola
  159. olan
  160. older
  161. org
  162. other
  163. out
  164. over
  165. pacific
  166. padilon
  167. pads
  168. partement
  169. pennsylavnia
  170. people
  171. pha
  172. place
  173. places
  174. pontdugard
  175. population
  176. position
  177. poverty
  178. praefectura
  179. present
  180. pulished
  181. quality
  182. race
  183. races
  184. racial
  185. rahmenfarbe
  186. red
  187. references
  188. region
  189. relative
  190. residing
  191. roussillon
  192. santacatarina
  193. sign
  194. size
  195. solid
  196. someone
  197. spam
  198. span
  199. spread
  200. square
  201. states
  202. stem
  203. stop
  204. stub
  205. style
  206. sucks
  207. sup
  208. sürfat
  209. taibjonik
  210. the
  211. them
  212. there
  213. this
  214. those
  215. title
  216. together
  217. topam
  218. topons
  219. total
  220. town
  221. two
  222. type
  223. täpsustus
  224. ujednoznacznienie
  225. under
  226. united
  227. units
  228. user
  229. utc
  230. versus
  231. vigan
  232. violation
  233. voirhomonyme
  234. vorlage
  235. was
  236. water
  237. weight
  238. were
  239. which
  240. white
  241. who
  242. wikipedias
  243. with
  244. www
  245. years
  246. your
  247. zifs
  248. ziläkanüm
  249. ziläks
  250. älifölis
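
The page does not spell out the exact TF-IDF procedure, so the following is only a minimal sketch of one way such a list could be produced. It assumes each revision is available as plain text and labelled as reverted or kept; the function names `tokenize`, `reverted_term_scores` and `top_terms` are hypothetical and not taken from the revision scoring codebase. Term frequency is counted over reverted revisions only, while inverse document frequency is computed over all revisions, so terms that also appear in most kept revisions are pushed down.

```python
import math
import re
from collections import Counter

# Runs of letters only (no digits or underscores); handles ä, ö, ü.
TOKEN_RE = re.compile(r"[^\W\d_]+")


def tokenize(text):
    """Lowercase the text and split it into letter-only tokens."""
    return TOKEN_RE.findall(text.lower())


def reverted_term_scores(reverted, kept):
    """Score each term seen in reverted revisions.

    score = (count of the term in reverted revisions)
            * log(total revisions / revisions containing the term)
    This weighting is an assumption made for illustration.
    """
    docs = [set(tokenize(text)) for text in reverted + kept]
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in doc)
    term_freq = Counter(term for text in reverted for term in tokenize(text))
    return {
        term: count * math.log(n_docs / doc_freq[term])
        for term, count in term_freq.items()
    }


def top_terms(scores, n):
    """Return the n highest-scoring terms, ties broken alphabetically."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:n]]
```

Calling `top_terms(reverted_term_scores(reverted, kept), 250)` on a labelled sample would give a candidate list of the same size as the one above; the real list may have been built with different tokenization, weighting, or filtering.
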
Generated common words

Common words appear in all revisions, reverted or otherwise. In English, this would include words like 'the' or 'is', which carry little meaning on their own. This list is generated using a TF-IDF approach.

  1. alignbars
  2. and
  3. areas
  4. bai
  5. bal
  6. bala
  7. belödanis
  8. bevüresod
  9. bidäda
  10. bidädas
  11. blue
  12. blägans
  13. bottom
  14. brazil
  15. bureau
  16. bäldot
  17. bäldoti
  18. bäldotü
  19. calif
  20. californi
  21. capita
  22. category
  23. census
  24. cilis
  25. coa
  26. color
  27. colors
  28. commons
  29. commonscat
  30. county
  31. dabinons
  32. dabinöl
  33. dateformat
  34. dekul
  35. del
  36. demü
  37. densit
  38. dis
  39. disambiguation
  40. donaziläk
  41. dono
  42. doton
  43. ela
  44. end
  45. famül
  46. famüla
  47. famülas
  48. famüls
  49. file
  50. fransänik
  51. from
  52. front
  53. gads
  54. geilot
  55. gemeente
  56. gen
  57. gray
  58. height
  59. highlighted
  60. himatan
  61. horizontal
  62. http
  63. iamap
  64. iaus
  65. illinois
  66. image
  67. imagesize
  68. imtmetis
  69. incorporated
  70. increment
  71. insee
  72. inwonertal
  73. italia
  74. jean
  75. jonon
  76. jpg
  77. justify
  78. kalifornia
  79. kalifornija
  80. kanton
  81. kela
  82. keninükamü
  83. klad
  84. kommun
  85. komot
  86. komotanem
  87. koord
  88. koordinats
  89. koräkami
  90. kot
  91. labon
  92. labü
  93. lamerikän
  94. lamerikänik
  95. latinans
  96. layer
  97. left
  98. leigodü
  99. lemesed
  100. lemesedi
  101. les
  102. lifayelas
  103. lifayels
  104. lindiyans
  105. linedata
  106. lomanef
  107. lomanefa
  108. lomanefas
  109. lomanefs
  110. lunetü
  111. luv
  112. län
  113. läs
  114. lödanadensit
  115. lödanas
  116. lödanef
  117. lödanefa
  118. lödans
  119. lödöp
  120. lödöps
  121. lölik
  122. magod
  123. mans
  124. map
  125. matans
  126. mens
  127. nedöls
  128. nek
  129. nem
  130. nen
  131. nia
  132. nisulans
  133. nonik
  134. nord
  135. norte
  136. north
  137. nova
  138. nueva
  139. numäd
  140. numädabür
  141. numäds
  142. nüns
  143. old
  144. orientation
  145. pads
  146. pasifeana
  147. patedik
  148. pato
  149. penumädöls
  150. per
  151. period
  152. plad
  153. plotarea
  154. plotdata
  155. plu
  156. png
  157. points
  158. potakot
  159. pädugons
  160. pöfasoliad
  161. pösod
  162. pösods
  163. ragiv
  164. right
  165. rnia
  166. saint
  167. sant
  168. scalemajor
  169. sent
  170. sifal
  171. sifalüp
  172. sin
  173. siyopans
  174. sköt
  175. soldats
  176. stad
  177. start
  178. state
  179. statitabür
  180. statitabüra
  181. studans
  182. sur
  183. svg
  184. sürfat
  185. sürfati
  186. taledavik
  187. tat
  188. teksas
  189. telna
  190. telplänov
  191. the
  192. thumb
  193. till
  194. tim
  195. timeaxis
  196. timeline
  197. timü
  198. tiäd
  199. top
  200. topam
  201. topon
  202. tot
  203. unincorporated
  204. unit
  205. usa
  206. utanas
  207. valasotik
  208. valodik
  209. valodo
  210. value
  211. vat
  212. videtü
  213. vietans
  214. vom
  215. vomas
  216. voms
  217. votik
  218. votikami
  219. wappen
  220. was
  221. width
  222. www
  223. year
  224. yel
  225. yela
  226. yels
  227. york
  228. yyyy
  229. zao
  230. zif
  231. zifs
  232. zänedo
  233. äbinon
  234. äbinons
  235. äbinädon
  236. äbinädons
  237. ädabinoms
  238. ädabinons
  239. äfomons
  240. äkeninükons
  241. äkobolödöl
  242. älaboms
  243. älabon
  244. älabons
  245. älifons
  246. älödons
  247. älödölis
  248. äsoelöl
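
As a rough illustration of how common words could be found with the same kind of statistic, the sketch below ranks terms by inverse document frequency over all revisions and returns those with the lowest values, that is, the terms present in the largest share of revisions. This is an assumption about the approach, not the documented procedure, and the name `common_terms` is hypothetical.

```python
import math
import re
from collections import Counter


def common_terms(revisions, n):
    """Return the n terms with the lowest IDF across all revisions.

    A low IDF means the term occurs in a large share of revisions,
    whether they were reverted or not.
    """
    # Letter-only tokens, lowercased; one set of terms per revision.
    docs = [set(re.findall(r"[^\W\d_]+", text.lower())) for text in revisions]
    doc_freq = Counter(term for doc in docs for term in doc)
    idf = {term: math.log(len(docs) / count) for term, count in doc_freq.items()}
    # Sort by IDF ascending, then alphabetically to keep the order stable.
    return sorted(idf, key=lambda term: (idf[term], term))[:n]
```

On a wiki like this one, where many articles are generated from templates, such a ranking would be expected to surface template and markup vocabulary alongside ordinary function words.
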

Bad words

Bad words are words unwelcome on any page. This would include curse words, spam and other content that would be reverted regardless of where it is inserted.

Needs bad words... Use |list-badwords=

Informal words

Informal words are words that are unwelcome in the article namespace but acceptable on talk pages. This would include words such as 'hello' or 'hahaha', which would be fine in discussions but not in articles.

Needs informal words... Use |list-informal=
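
Once bad-word and informal-word lists exist, one simple way to use them is to count whole-word matches in the text an edit adds. The sketch below is an assumption about how that could look, not a description of the scoring service; `build_matcher` and `count_matches` are hypothetical names, and the sample words are the English examples from the description above, since no Volapük words have been supplied yet.

```python
import re


def build_matcher(words):
    """Compile a case-insensitive regex matching any listed word as a whole word."""
    if not words:
        raise ValueError("word list must not be empty")
    # Longest words first so longer alternatives are tried before shorter ones.
    ordered = sorted(words, key=len, reverse=True)
    pattern = r"\b(?:" + "|".join(re.escape(w) for w in ordered) + r")\b"
    return re.compile(pattern, re.IGNORECASE)


def count_matches(matcher, text):
    """Count non-overlapping whole-word matches in the text."""
    return len(matcher.findall(text))


# Placeholder list using the examples given for informal words.
informal = build_matcher(["hello", "hahaha"])
```

The word boundaries keep a listed word from matching inside a longer word, so 'hello' does not match inside 'othello'.
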