Schema talk:MobileWebWatchlistClickTracking

From Meta, a Wikimedia project coordination wiki
Maintainer: Jon Katz & Sam Smith
Team: Reading - Web
Please specify the project using this schema.
Status: inactive
Purge: Bucketize userEditCount and auto-purge username and destination + eventCapsule PII after 90 days

Definition of "...watch" and "...unwatch" events

I have recently been looking into the data from this schema, assuming that it records how often pages are watched and unwatched on mobile web. However, the watchlist-a-z-watch and watchlist-a-z-unwatch events seem to record something else. In particular, there are many more "unwatch" than "watch" events:

SELECT DATE(timestamp) AS date,
  SUM(IF(event_name = 'watchlist-a-z-watch', 1, 0)) AS watches,
  SUM(IF(event_name = 'watchlist-a-z-unwatch', 1, 0)) AS unwatches
FROM log.MobileWebWatchlistClickTracking_10720361
WHERE MONTH(timestamp) = 3
  AND YEAR(timestamp) = 2016
GROUP BY date
ORDER BY date;

+------------+---------+-----------+
| date       | watches | unwatches |
+------------+---------+-----------+
| 2016-03-01 |       9 |        43 |
| 2016-03-02 |       8 |       115 |
| 2016-03-03 |       5 |        55 |
| 2016-03-04 |       6 |        71 |
| 2016-03-05 |       9 |        76 |
| 2016-03-06 |       6 |        32 |
| 2016-03-07 |       6 |        38 |
| 2016-03-08 |       3 |        20 |
| 2016-03-09 |       5 |        91 |
| 2016-03-10 |       4 |        58 |
| 2016-03-11 |      15 |        62 |
| 2016-03-12 |       8 |        55 |
| 2016-03-13 |       5 |       100 |
| 2016-03-14 |      11 |        29 |
| 2016-03-15 |       3 |        47 |
| 2016-03-16 |       3 |        36 |
| 2016-03-17 |       9 |        47 |
| 2016-03-18 |      11 |       165 |
| 2016-03-19 |       7 |        40 |
| 2016-03-20 |      12 |       121 |
| 2016-03-21 |      22 |        98 |
| 2016-03-22 |      16 |       165 |
| 2016-03-23 |      12 |        86 |
| 2016-03-24 |       4 |        49 |
| 2016-03-25 |      11 |        94 |
| 2016-03-26 |       6 |       110 |
| 2016-03-27 |      22 |       106 |
| 2016-03-28 |       4 |        65 |
| 2016-03-29 |       5 |        42 |
| 2016-03-30 |       7 |        31 |
| 2016-03-31 |       5 |        35 |
+------------+---------+-----------+
31 rows in set (1.10 sec)
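
Summing the daily rows above gives 259 "watch" and 2,182 "unwatch" events for March 2016, or roughly 8.4 unwatches per watch. As a minimal sketch, assuming the same table and column names as in the query above, the monthly totals and the ratio could be obtained in one step. The column alias unwatches_per_watch is only illustrative, and this query has not been run against the log database:

```sql
-- Monthly totals for March 2016, same filters as the per-day query above.
SELECT
  SUM(IF(event_name = 'watchlist-a-z-watch', 1, 0)) AS watches,
  SUM(IF(event_name = 'watchlist-a-z-unwatch', 1, 0)) AS unwatches,
  -- Unwatch events per watch event; MySQL returns NULL if there are no watch events.
  ROUND(
    SUM(IF(event_name = 'watchlist-a-z-unwatch', 1, 0)) /
    SUM(IF(event_name = 'watchlist-a-z-watch', 1, 0)),
    1
  ) AS unwatches_per_watch
FROM log.MobileWebWatchlistClickTracking_10720361
WHERE MONTH(timestamp) = 3
  AND YEAR(timestamp) = 2016;
```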

Just leaving this here for the record, as I won't be using that data for now. But it might be worth improving the documentation on the meaning of these events in case we need to look again into whatever questions the corresponding data was meant to answer. Regards, Tbayer (WMF) (talk) 08:10, 22 May 2017 (UTC)