Research talk:Anonymous editor acquisition/Signup CTA experiment/Work log/2014-04-03


Thursday, April 3rd

Today I need to estimate the number of anon editors that we see in a 1-2 week period so that we can figure out how long our experiment needs to run. I'm going to assume boolean metrics and a 2-3% change. Luckily, I've already done this analysis for a previous study; see File:Onboarding.rollout.proportion_test.pvalue_by_observations.svg. It looks like we should have no problem identifying significant effects with 5,000 users per condition, which means I need 15k users to cover the three conditions (control, pre-edit CTA and post-edit CTA).
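As a rough cross-check on the 5,000-per-condition figure, here is a minimal sketch of the standard normal-approximation sample size for comparing two proportions. It is not the analysis behind the linked figure. The baseline rate, the reading of "2-3% change" as absolute percentage points, and the alpha and power defaults are all assumptions picked for illustration, and `users_per_condition` is a hypothetical helper name.

```python
from math import ceil, sqrt
from statistics import NormalDist


def users_per_condition(p_base, delta, alpha=0.05, power=0.8):
    """Approximate users needed per condition for a two-sided
    two-proportion z-test to detect an absolute change of `delta`
    from a baseline proportion `p_base`.

    All inputs are assumptions; none of them come from the work log.
    """
    p_alt = p_base + delta
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)           # quantile for desired power
    p_bar = (p_base + p_alt) / 2                   # pooled proportion under H0
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p_base * (1 - p_base) + p_alt * (1 - p_alt))
    ) ** 2
    return ceil(numerator / delta ** 2)


if __name__ == "__main__":
    # Hypothetical baselines; 0.5 is the worst case for variance.
    for p_base in (0.1, 0.5):
        for delta in (0.02, 0.03):
            n = users_per_condition(p_base, delta)
            print(f"baseline={p_base:.2f} change={delta:.2f} -> {n} per condition")
```

Under these assumptions the required count depends heavily on the baseline rate and on whether the change is nearer 2 or 3 points, so the 5,000 figure is best read as an approximate target rather than a hard threshold.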

Until I can make use of tokens, my best way to estimate the # of anon editors is to count unique combinations of IP address and user agent over time using the cu_changes table (see mw:Extension:CheckUser). --Halfak (WMF) (talk) 15:52, 3 April 2014 (UTC)


Created project repo: https://github.com/halfak/Anonymous-phenomena

I'm hoping to use this for all anon studies -- not just the current one. --Halfak (WMF) (talk) 15:55, 3 April 2014 (UTC)


I made the following figure to make sure that the timeline of the experiment is clear (assuming a two-week observation period):

Timeline. The timeline of experimentation and observation is presented.

I'll have to modify it when I'm done with the power analysis. --Halfak (WMF) (talk) 15:55, 3 April 2014 (UTC)


> SELECT
    ->     COUNT(*) AS ip_agent_count,
    ->     COUNT(DISTINCT cuc_ip) AS ip_count
    -> FROM (
    ->     SELECT
    ->         cuc_ip,
    ->         cuc_agent
    ->     FROM cu_changes
    ->     WHERE
    ->         cuc_user = 0 AND
    ->         cuc_timestamp BETWEEN "20140325" AND "20140401"
    ->     GROUP BY 1,2
    -> ) AS unique_ip_agent;
+----------------+----------+
| ip_agent_count | ip_count |
+----------------+----------+
|          97847 |    92969 |
+----------------+----------+
1 row in set (5.15 sec)

So, if IP or IP/Agent is a good way of estimating the # of users, it looks like we'll have about 92k-97k anons to work with in one week of experimentation. We might even get twice that many attempting to edit. We don't want to run for less than a week because activity cycles between weekdays and weekends. Time to update the experiment duration to reflect a week. --Halfak (WMF) (talk) 21:03, 3 April 2014 (UTC)
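To put the weekly counts next to the 15k target from the power analysis, here is a small sketch of the arithmetic. It assumes the week of 25 March to 1 April is typical and that each unique IP or IP/agent pair corresponds to one user; `sampling_share` is a hypothetical helper, not part of the project repo.

```python
USERS_PER_CONDITION = 5000
CONDITIONS = 3  # control, pre-edit CTA, post-edit CTA
USERS_NEEDED = USERS_PER_CONDITION * CONDITIONS

# Weekly anon estimates taken from the query result above.
WEEKLY_UNIQUE_IPS = 92969
WEEKLY_UNIQUE_IP_AGENTS = 97847


def sampling_share(users_needed, weekly_users):
    """Fraction of one week's anon editors that would have to be
    bucketed into the experiment to reach `users_needed`."""
    return users_needed / weekly_users


if __name__ == "__main__":
    for label, weekly in (
        ("unique IPs", WEEKLY_UNIQUE_IPS),
        ("unique IP/agent pairs", WEEKLY_UNIQUE_IP_AGENTS),
    ):
        share = sampling_share(USERS_NEEDED, weekly)
        print(f"{label}: {share:.1%} of one week's anons covers {USERS_NEEDED} users")
```

On either estimate, roughly one sixth of a week's anon editors would be enough to fill all three conditions, which is why a one-week run looks comfortable.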