Research talk:STiki 1 million reverts review/Work log/2016-10-28

Add topic
From Meta, a Wikimedia project coordination wiki

Friday, October 28, 2016[edit]

Today, I'm working to build a monthly counts dataset of STiki reverts. The following query should get

  • month
  • good_faith (True or False) was the revert marked as good-faith
  • anon_reverted (True or False) was the user who was reverted an anon
  • reverted_edits (int) the total number of reverted edits of this type in month
  • reverts (int) the total number of reverting edits of this type in month


SELECT
  month,
  good_faith,
  anon_reverted,
  SUM(reverted_edits) AS reverted_edits,
  SUM(reverts) AS reverts
FROM (
  SELECT
    LEFT(rev_timestamp, 6) AS month,
    rev_comment LIKE "%WP:AGF%" AS good_faith,
    (
      rev_comment RLIKE ".*\\[\\[Special:Contributions/[0-9\\.]+\\|.*" OR
      rev_comment RLIKE ".*\\[\\[Special:Contributions/([0-9A-F]{1,4}\\:){7}[0-9A-F]{1,4}\\|.*"
    ) AS anon_reverted,
    SUM(CAST(SUBSTRING_INDEX(SUBSTRING_INDEX(rev_comment, " ", 2), " ", -1) AS INT)) AS reverted_edits,
    COUNT(*) AS reverts
  FROM revision
  WHERE
    rev_comment LIKE "Reverted%STiki%"
  GROUP BY 1,2,3
  UNION ALL
  SELECT
    LEFT(ar_timestamp, 6) AS month,
    ar_comment LIKE "%WP:AGF%" AS good_faith,
    (
      ar_comment RLIKE ".*\\[\\[Special:Contributions/[0-9\\.]+\\|.*" OR
      ar_comment RLIKE ".*\\[\\[Special:Contributions/([0-9A-F]{1,4}\\:){7}[0-9A-F]{1,4}\\|.*"
    ) AS anon_reverted,
    SUM(CAST(SUBSTRING_INDEX(SUBSTRING_INDEX(ar_comment, " ", 2), " ", -1) AS INT)) AS reverted_edits,
    COUNT(*) AS reverts
  FROM archive
  WHERE
    ar_comment LIKE "Reverted%STiki%"
  GROUP BY 1,2,3
) AS revert_activity;

I have this running now. I'll report on some basic stats when it's done. I expect it to take at least a few hours to run. --EpochFail (talk) 15:55, 28 October 2016 (UTC)Reply