Research talk:Modeling monthly active editors/Work log/2014-06-19

From Meta, a Wikimedia project coordination wiki

Thursday, June 19th

Today, I need to do two things.

  1. Filter R:Attached users from the New active editor groups.
  2. Compare total monthly active editors under this metric with the counts discussed at R:Refining the definition of monthly active editors.

To filter attached users, I'm using the tables managed by mw:Extension:CentralAuth. The following query builds a table of all editors with their "attachment method", which will help me determine whether their local accounts were created by CentralAuth.

-- One row per local user, with CentralAuth attachment details where available.
SELECT
    DATABASE() AS wiki,
    user_id,
    user_registration,
    gu_id AS globaluser_id,
    lu_attached_timestamp AS user_attached,
    lu_attached_method AS attached_method
FROM user
-- LEFT JOINs keep local users that have no CentralAuth record (NULL columns).
LEFT JOIN centralauth.localuser ON
    lu_wiki = DATABASE() AND
    lu_name = user_name
LEFT JOIN centralauth.globaluser ON
    gu_name = lu_name
GROUP BY user_id;

Now, all I need to do is to modify my MAE script to filter these guys out of New active editor and Surviving new active editor. --Halfak (WMF) (talk) 19:04, 19 June 2014 (UTC)
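
The filtering step could look roughly like the sketch below. It is not the actual MAE script: the function names, the row layout (dicts keyed by the query's column aliases), and especially the set of attached_method values treated as "created by CentralAuth" are assumptions made for illustration.

```python
# Assumption: these attached_method values mark local accounts that
# CentralAuth created automatically. The log does not state which values
# were actually used, so treat this set as a placeholder.
AUTO_CREATED_METHODS = {"new", "login"}


def is_auto_attached(row):
    """True if the row's attachment method is in the assumed auto-created set."""
    return row["attached_method"] in AUTO_CREATED_METHODS


def filter_attached(new_editors, attachments):
    """Drop editors whose local account appears to be auto-created.

    new_editors: list of dicts with "wiki" and "user_id" keys (hypothetical layout).
    attachments: rows shaped like the output of the query above.
    Users with no CentralAuth record (attached_method is None) are kept.
    """
    auto = {(r["wiki"], r["user_id"]) for r in attachments if is_auto_attached(r)}
    return [e for e in new_editors if (e["wiki"], e["user_id"]) not in auto]
```

Keying on (wiki, user_id) rather than user_id alone matters because the same numeric user_id refers to different accounts on different wikis.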


Updates complete. Looks like the difference is minimal.

Updated plots

Active editor rates (enwiki). 
Active editor rates (itwiki). 
Active editor counts stacked (enwiki). 
Active editor counts stacked (itwiki). 
Active editor counts (enwiki). 
Active editor counts (itwiki). 

Comparison with old active editor definitions

  • dump ns=0 -- Old definition based on XML dump processing. Does not include edits to deleted pages and counts only edits to R:Countable pages.
  • archive ns=0 -- All edits to articles count.
  • archive ns=all -- All edits to any page counts.
MAE comparison (enwiki). 
MAE comparison (itwiki). 
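
To make the namespace distinction concrete, here is a minimal sketch of counting monthly active editors with and without a namespace filter. The input layout (tuples of user, month, namespace), the function name, and the five-edit threshold are assumptions, not taken from the log. The sketch also does not model the deleted-page or countable-page differences of the dump-based definition.

```python
from collections import Counter

# Assumption: an editor is "active" in a month with at least this many edits.
MIN_EDITS = 5


def monthly_active_editors(edits, namespaces=None, min_edits=MIN_EDITS):
    """Count active editors per month.

    edits: iterable of (user_id, month, namespace) tuples (hypothetical layout).
    namespaces: set of namespace ids to count, or None to count every namespace.
    """
    per_user_month = Counter()
    for user_id, month, ns in edits:
        if namespaces is None or ns in namespaces:
            per_user_month[(month, user_id)] += 1

    active = Counter()
    for (month, _user_id), n in per_user_month.items():
        if n >= min_edits:
            active[month] += 1
    return dict(active)
```

Calling it with namespaces={0} corresponds to the "ns=0" variants, and with namespaces=None to "ns=all". An editor whose edits are split between articles and talk pages can be active under the second definition but not the first.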

Now for some factor comparisons.

MAE comparison (dump ns=0). 
MAE comparison (archive ns=0). 
MAE comparison (archive ns=all). 

--Halfak (WMF) (talk) 22:37, 19 June 2014 (UTC)