Research talk:VisualEditor's effect on newly registered editors/May 2015 study/Work log/2015-06-05
Friday, June 5, 2015
Bucketing just ended. I'm going to take the opportunity to get a head-start on the analysis. Time to gather our bucketed users.
First, let's look at the time bounds:
> SELECT
-> LEFT(user_registration, 10) AS hour,
-> SUM(ve.up_user IS NOT NULL)/COUNT(*) AS ve_prop
-> FROM user
-> LEFT JOIN user_properties AS ve ON
-> up_user = user_id AND
-> up_property = 'visualeditor-enable'
-> WHERE
-> user_registration BETWEEN "20150528" AND "20150605" AND
-> user_id > 25241662
-> GROUP BY 1;
+------------+---------+
| hour | ve_prop |
+------------+---------+
| 2015052800 | 0.0061 |
<... snip ...>
| 2015052822 | 0.0000 |
| 2015052823 | 0.2964 |
| 2015052900 | 0.3030 |
| 2015052901 | 0.3891 |
| 2015052902 | 0.3196 |
| 2015052903 | 0.3969 |
| 2015052904 | 0.3706 |
| 2015052905 | 0.3234 |
| 2015052906 | 0.3603 |
| 2015052907 | 0.3246 |
| 2015052908 | 0.3404 |
| 2015052909 | 0.3237 |
<... snip ...>
| 2015060417 | 0.3683 |
| 2015060418 | 0.3634 |
| 2015060419 | 0.3564 |
| 2015060420 | 0.3366 |
| 2015060421 | 0.3455 |
| 2015060422 | 0.3517 |
| 2015060423 | 0.1958 |
+------------+---------+
192 rows in set (0.99 sec)
It looks like "2015052823" through "2015060423" will work as expected.
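A quick sanity check on those bounds, as a minimal sketch. It assumes the 14-digit timestamps are compared as plain strings (lexicographically), which is an assumption about the column types rather than something verified here. The helper name `in_bounds` is hypothetical.

```python
# Sketch: how 10-character string bounds behave against 14-digit timestamps
# (YYYYMMDDHHMMSS). Assumes plain lexicographic string comparison.

def in_bounds(timestamp: str,
              lower: str = "2015052823",
              upper: str = "2015060423") -> bool:
    """Mimic `timestamp BETWEEN lower AND upper` under string comparison."""
    return lower <= timestamp <= upper

print(in_bounds("20150528225959"))  # last second of hour 2015052822 -> False
print(in_bounds("20150528230028"))  # inside hour 2015052823         -> True
print(in_bounds("20150604225959"))  # last second of hour 2015060422 -> True
print(in_bounds("20150604230000"))  # first second of hour 2015060423 -> False
```

Under that assumption the lower bound includes all of hour 2015052823, while the upper bound stops just before hour 2015060423, because any 14-digit timestamp starting with "2015060423" sorts after the shorter bound string. That would leave out the final hour, which shows a lower proportion (0.1958) in the table.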
Time to get the sample -- same as the pilot with different date bounds:
SELECT
event_userId AS user_id,
IF(event_userId % 2 = 0, "experimental", "control") AS bucket,
timestamp AS registration,
event_displayMobile AS via_mobile,
ve.up_user IS NOT NULL AS ve_enabled
FROM log.ServerSideAccountCreation_5487345
LEFT JOIN enwiki.user_properties ve ON
event_userId = up_user AND
up_property = 'visualeditor-enable'
WHERE
wiki = "enwiki" AND
event_isSelfMade AND
timestamp BETWEEN "2015052823" and "2015060423";
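The bucket assignment in the query is just the parity of the user ID. A minimal Python sketch of the same rule, in case it needs to be re-derived outside of SQL (the function name `bucket` is hypothetical):

```python
def bucket(user_id: int) -> str:
    """Mirror IF(event_userId % 2 = 0, "experimental", "control")."""
    return "experimental" if user_id % 2 == 0 else "control"

# Two consecutive IDs land in different buckets.
for user_id in (25324895, 25324896):
    print(user_id, bucket(user_id))
```

Because consecutive IDs alternate, the two buckets should end up close to the same size over any reasonably long registration window.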
$ head -n 3 experimental_users.tsv; wc experimental_users.tsv
user_id	bucket	registration	via_mobile	ve_enabled
25324895	control	20150528230028	1	0
25324896	experimental	20150528230034	0	0
  26972  134860 1038541 experimental_users.tsv
And there we have it. About 27k users in the experiment across both buckets (26,972 lines in the file, including the header row). --Halfak (WMF) (talk) 13:55, 5 June 2015 (UTC)
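A possible next step is to tally users and VE-enabled users per bucket from the TSV. The sketch below is an illustration rather than the analysis code actually used. It runs on the two data rows shown in the `head` output, and the function name `summarize` is hypothetical.

```python
import csv
import io
from collections import Counter

# Header plus the two data rows from the head output; the real file is ~27k rows.
SAMPLE_TSV = (
    "user_id\tbucket\tregistration\tvia_mobile\tve_enabled\n"
    "25324895\tcontrol\t20150528230028\t1\t0\n"
    "25324896\texperimental\t20150528230034\t0\t0\n"
)

def summarize(tsv_file):
    """Return (users per bucket, VE-enabled users per bucket)."""
    users = Counter()
    ve_enabled = Counter()
    for row in csv.DictReader(tsv_file, delimiter="\t"):
        users[row["bucket"]] += 1
        ve_enabled[row["bucket"]] += int(row["ve_enabled"])
    return users, ve_enabled

users, ve_enabled = summarize(io.StringIO(SAMPLE_TSV))
for name in sorted(users):
    print(name, users[name], ve_enabled[name])
```

To run it on the full sample, pass an open file handle instead of the in-memory string, for example `summarize(open("experimental_users.tsv", newline=""))`.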
I just updated the rest of the queries, but I ran out of time. We should be in good shape for when I actually get to do a prelim analysis. I'm guessing that's going to be Monday morning. For now, I'm off to work on other things. --Halfak (WMF) (talk) 14:54, 5 June 2015 (UTC)