Research:Understanding Wikidata's Value/semi-automated tool edit indicators

From Meta, a Wikimedia project coordination wiki

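Below is a list of Postgres expressions that I believe match the revision comments of Wikidata tool edits. Each one strips periods, commas, parentheses, hyphens, and colons from the comment, lowercases the result, and then applies a LIKE pattern. The syntax is Postgres because I've imported the revision data into a Postgres DB.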

lower(regexp_replace(comment, '\.|,|\(|\)|-|:','','g')) LIKE '%#quickstatements%' 

lower(regexp_replace(comment, '\.|,|\(|\)|-|:','','g')) LIKE '%#petscan%' 

lower(regexp_replace(comment, '\.|,|\(|\)|-|:','','g')) LIKE '%#autolist2%' 

lower(regexp_replace(comment, '\.|,|\(|\)|-|:','','g')) LIKE '%talkgadgetautoeditjs|autoedit]]%' 

lower(regexp_replace(comment, '\.|,|\(|\)|-|:','','g')) LIKE '%labellister%' 

lower(regexp_replace(comment, '\.|,|\(|\)|-|:','','g')) LIKE '%#itemcreator%' 

lower(regexp_replace(comment, '\.|,|\(|\)|-|:','','g')) LIKE '%#dragrefjs%' 

lower(regexp_replace(comment, '\.|,|\(|\)|-|:','','g')) LIKE '%[[useryms/lc|lcjs]]%' 

lower(regexp_replace(comment, '\.|,|\(|\)|-|:','','g')) LIKE '%#wikidatagame%' 

lower(regexp_replace(comment, '\.|,|\(|\)|-|:','','g')) LIKE '%[[wikidataprimary%' 

lower(regexp_replace(comment, '\.|,|\(|\)|-|:','','g')) LIKE '%mix''n''match%' 

lower(regexp_replace(comment, '\.|,|\(|\)|-|:','','g')) LIKE '%#distributedgame%' 

lower(regexp_replace(comment, '\.|,|\(|\)|-|:','','g')) LIKE '%[[userjitrixis/nameguzzlerjs|nameguzzler]]%' 

lower(regexp_replace(comment, '\.|,|\(|\)|-|:','','g')) LIKE '%[[mediawikigadgetmergejs|mergejs]]%'
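
For readers who want to try the same checks outside the database, the expressions above can be approximated in Python. This is only a sketch: the names `normalize`, `matched_marker`, and `TOOL_MARKERS` are mine, the sample comments in the usage note are made up for illustration, and Python's `lower()` may differ from Postgres `lower()` on some non-ASCII characters depending on the database locale. Since none of the patterns contain the LIKE wildcards `_` or `%` internally, `LIKE '%x%'` is treated here as a plain substring test.

```python
import re

# Same characters the Postgres regexp_replace call strips: . , ( ) - :
_STRIP = re.compile(r"[.,()\-:]")

# Substrings copied from the LIKE patterns above
# (the SQL-escaped '' in mix''n''match is written as a single ').
TOOL_MARKERS = [
    "#quickstatements",
    "#petscan",
    "#autolist2",
    "talkgadgetautoeditjs|autoedit]]",
    "labellister",
    "#itemcreator",
    "#dragrefjs",
    "[[useryms/lc|lcjs]]",
    "#wikidatagame",
    "[[wikidataprimary",
    "mix'n'match",
    "#distributedgame",
    "[[userjitrixis/nameguzzlerjs|nameguzzler]]",
    "[[mediawikigadgetmergejs|mergejs]]",
]


def normalize(comment):
    """Strip the punctuation listed above, then lowercase."""
    return _STRIP.sub("", comment).lower()


def matched_marker(comment):
    """Return the first marker found in the normalized comment, else None.

    A missing comment (None) never matches, mirroring how a NULL comment
    would not satisfy the SQL conditions.
    """
    if comment is None:
        return None
    text = normalize(comment)
    for marker in TOOL_MARKERS:
        if marker in text:
            return marker
    return None
```

With these definitions, a comment such as `[[User:Yms/LC|LC.js]]` normalizes to `[[useryms/lc|lcjs]]`, which is why the patterns above look like page links with the punctuation removed.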


Additionally, per the advice of Magnus, revisions made by his agent, "User:QuickStatementsBot", are also flagged as tool edits.

Finally, per the advice of Tilman and Aaron Halfaker, I also check whether a revision's change tags begin with the phrase "OAuth CID:", which indicates that the revision was produced via a tool.
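
The two non-comment checks (the QuickStatementsBot account and the "OAuth CID:" tag prefix) could be sketched in Python as follows. The function name `is_flagged_by_user_or_tag` and its parameters `username` and `tags` are hypothetical. I am assuming the user name may be stored either with or without the "User:" prefix and that change tags are available as a list of strings per revision; the actual table layout may differ.

```python
TOOL_ACCOUNT = "QuickStatementsBot"  # account named in the text above
OAUTH_TAG_PREFIX = "OAuth CID:"      # phrase checked at the start of a change tag


def is_flagged_by_user_or_tag(username, tags):
    """Return True if the revision is a tool edit by account or by change tag.

    username: revision author, with or without a leading "User:" (assumption).
    tags: iterable of change-tag strings for the revision (assumption).
    """
    name = username or ""
    # Drop a leading namespace prefix if the data keeps it.
    if name.startswith("User:"):
        name = name[len("User:"):]
    if name == TOOL_ACCOUNT:
        return True
    # Only the beginning of each tag is checked, as described above.
    return any(tag.startswith(OAUTH_TAG_PREFIX) for tag in (tags or []))
```

A revision would then count as a tool edit if this function returns true or if its comment matches one of the expressions listed earlier.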