Research talk:Asking anonymous editors to register/Work log/2014-07-14

Monday, July 14th

Today, I'm just making sure that the logging for the second version of the experiment is coming in.

> select left(timestamp, 8), count(*) from SignupExpPageLinkClick_8965014 group by 1;
+--------------------+----------+
| left(timestamp, 8) | count(*) |
+--------------------+----------+
| 20140709           |       21 |
| 20140710           |    69351 |
| 20140711           |    80236 |
| 20140712           |    69883 |
| 20140713           |    72725 |
| 20140714           |    84184 |
+--------------------+----------+

So it looks like events start coming in at volume on 7/10, with only 21 logged on 7/09. At what time on 7/10?

> select left(timestamp, 10), count(*) from SignupExpPageLinkClick_8965014 where left(timestamp, 8) = "20140710" group by 1;
+---------------------+----------+
| left(timestamp, 10) | count(*) |
+---------------------+----------+
| 2014071000          |      996 |
| 2014071001          |     1336 |
| 2014071002          |     1542 |
| 2014071003          |     1697 |
| 2014071004          |     2006 |
| 2014071005          |     2137 |
| 2014071006          |     2376 |
| 2014071007          |     2285 |
| 2014071008          |     2768 |
| 2014071009          |     2697 |
| 2014071010          |     2838 |
| 2014071011          |     3174 |
| 2014071012          |     3611 |
| 2014071013          |     4105 |
| 2014071014          |     4011 |
| 2014071015          |     4293 |
| 2014071016          |     3794 |
| 2014071017          |     3817 |
| 2014071018          |     3759 |
| 2014071019          |     3782 |
| 2014071020          |     3495 |
| 2014071021          |     3243 |
| 2014071022          |     2942 |
| 2014071023          |     2647 |
+---------------------+----------+
24 rows in set (0.18 sec)
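
As a quick sanity check, the hourly counts above should add up to the daily total for 20140710 from the first query. A minimal sketch, with the numbers copied by hand from the two result tables (the variable names are illustrative, not part of the logging schema):

```python
# Hourly event counts for 2014-07-10 (hours 00-23), copied from the query output above.
july10_hourly = [
    996, 1336, 1542, 1697, 2006, 2137, 2376, 2285,
    2768, 2697, 2838, 3174, 3611, 4105, 4011, 4293,
    3794, 3817, 3759, 3782, 3495, 3243, 2942, 2647,
]

# Daily total reported for 20140710 by the first (per-day) query.
july10_daily_total = 69351

hourly_sum = sum(july10_hourly)
totals_match = hourly_sum == july10_daily_total
print(hourly_sum, totals_match)
```

If the two totals disagree, the hourly query is probably grouping or filtering differently from the daily one.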

It looks like we already have a substantial number of events coming in by midnight, but the rate at which they come in seems to ramp up over the day. Maybe that's an artifact of the time of day. Let's compare to the next day. --Halfak (WMF) (talk) 22:31, 14 July 2014 (UTC)


mysql:research@analytics-store.eqiad.wmnet [log]> select hour, july10.events, july11.events, july10.events-july11.events
    -> from (select substr(timestamp, 9, 2) as hour, count(*) as events
    ->       from SignupExpPageLinkClick_8965014
    ->       where left(timestamp, 8) = "20140710" group by 1) as july10
    -> INNER JOIN (select substr(timestamp, 9, 2) as hour, count(*) as events
    ->             from SignupExpPageLinkClick_8965014
    ->             where left(timestamp, 8) = "20140711" group by 1) as july11
    -> USING (hour);
+------+--------+--------+-----------------------------+
| hour | events | events | july10.events-july11.events |
+------+--------+--------+-----------------------------+
| 00   |    996 |   2539 |                       -1543 |
| 01   |   1336 |   2511 |                       -1175 |
| 02   |   1542 |   2652 |                       -1110 |
| 03   |   1697 |   2783 |                       -1086 |
| 04   |   2006 |   2871 |                        -865 |
| 05   |   2137 |   3028 |                        -891 |
| 06   |   2376 |   3246 |                        -870 |
| 07   |   2285 |   3117 |                        -832 |
| 08   |   2768 |   3331 |                        -563 |
| 09   |   2697 |   3778 |                       -1081 |
| 10   |   2838 |   3604 |                        -766 |
| 11   |   3174 |   3929 |                        -755 |
| 12   |   3611 |   3832 |                        -221 |
| 13   |   4105 |   4357 |                        -252 |
| 14   |   4011 |   4412 |                        -401 |
| 15   |   4293 |   4102 |                         191 |
| 16   |   3794 |   3839 |                         -45 |
| 17   |   3817 |   3671 |                         146 |
| 18   |   3759 |   3759 |                           0 |
| 19   |   3782 |   3538 |                         244 |
| 20   |   3495 |   3394 |                         101 |
| 21   |   3243 |   2904 |                         339 |
| 22   |   2942 |   2524 |                         418 |
| 23   |   2647 |   2515 |                         132 |
+------+--------+--------+-----------------------------+
24 rows in set (0.26 sec)
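
The absolute differences are easier to judge relative to the July 11 volume. The sketch below recomputes them as fractions and looks for the first hour where the two days are within a tolerance of each other. The 5% tolerance and the function names are assumptions chosen for illustration, not part of the original analysis.

```python
# Hourly event counts (hours 00-23), copied from the comparison table above.
july10 = [
    996, 1336, 1542, 1697, 2006, 2137, 2376, 2285,
    2768, 2697, 2838, 3174, 3611, 4105, 4011, 4293,
    3794, 3817, 3759, 3782, 3495, 3243, 2942, 2647,
]
july11 = [
    2539, 2511, 2652, 2783, 2871, 3028, 3246, 3117,
    3331, 3778, 3604, 3929, 3832, 4357, 4412, 4102,
    3839, 3671, 3759, 3538, 3394, 2904, 2524, 2515,
]

# Assumption: treat two hourly counts as "comparable" when they differ by at most 5%.
TOLERANCE = 0.05


def relative_difference(day_a, day_b):
    """Difference between two counts as a fraction of the second count."""
    return (day_a - day_b) / day_b


def first_comparable_hour(diffs, tolerance):
    """Return the first hour whose relative difference is within tolerance, else None."""
    for hour, diff in enumerate(diffs):
        if abs(diff) <= tolerance:
            return hour
    return None


rel_diffs = [relative_difference(a, b) for a, b in zip(july10, july11)]

for hour, diff in enumerate(rel_diffs):
    print(f"{hour:02d}  {diff:+.1%}")

# With the 5% tolerance assumed above, this is hour 15.
print("first comparable hour:", first_comparable_hour(rel_diffs, TOLERANCE))
```

A tighter or looser tolerance would move the cutoff, so the result is only as meaningful as that choice.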

It looks like the rates aren't really comparable until the 15th hour of July 10th. I'm not sure that this matters. At worst, a slow JS caching issue should only add noise to the analysis (e.g. users in the experimental condition who will have a control-like experience). As in the last experiment, we can simply filter these users out of the analysis by only examining editors with a corresponding page-click event. --Halfak (WMF) (talk) 18:33, 15 July 2014 (UTC)
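
The filtering step described above could look roughly like the following. This is only a sketch: the `token` key, the sample identifiers, and the function name are hypothetical and are not taken from the SignupExpPageLinkClick schema.

```python
# Hypothetical data: identifiers and the "token" field are made up for illustration.
experimental_users = ["a1", "b2", "c3", "d4"]
page_click_events = [{"token": "a1"}, {"token": "c3"}, {"token": "c3"}]


def users_with_click(users, click_events, key="token"):
    """Keep only users that have at least one matching page-click event."""
    seen = {event[key] for event in click_events}
    return [user for user in users if user in seen]


filtered = users_with_click(experimental_users, page_click_events)
print(filtered)  # ['a1', 'c3']
```

Users without a logged click are dropped rather than reassigned, so anyone who was served stale JS simply falls out of the sample.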