Schema talk:Popups

From Meta, a Wikimedia project coordination wiki
Maintainer:Phuedx & OVasileva
Team:Reading
Project:Popups
Status:active
Purge:Auto-purge eventCapsule PII, pageIdSource, pageTitleHover, and pageTitleSource after 90 days, keep the rest indefinitely

Creating a basic schema to log analytics for the Popups extension. --Prtksxna (talk) 08:49, 11 February 2014 (UTC)


Sampling (ca. September 2016-August 2017)

Sampling will be derived by passing the session-pegged pseudorandom mw.user.sessionId() value (an ephemeral cookie) and the potential buckets into mw.experiments.getBucket(), then deciding whether to enroll the user for event logging based on the bucket returned. For example, if 10% of sessions should be enrolled for event logging, the groups might be named Control (with a value of 0.9) and B (with a value of 0.1); if mw.experiments.getBucket() returns B, the session is in scope for event logging.
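The enrollment decision described above can be sketched as follows. This is a minimal, self-contained illustration, not the extension's code: pickBucket() is a hypothetical stand-in for mw.experiments.getBucket() (whose actual hashing is not reproduced here), hashToUnitInterval() is an assumed FNV-1a helper, and the experiment name and session token are made up.

```javascript
// Map a string to a number in [0, 1) with 32-bit FNV-1a.
// Assumption: any stable hash would do for this illustration.
function hashToUnitInterval(str) {
  let h = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    h ^= str.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h / 0x100000000;
}

// Hypothetical stand-in for mw.experiments.getBucket().
// `buckets` maps bucket names to sizes that sum to 1.
function pickBucket(experimentName, buckets, token) {
  const x = hashToUnitInterval(experimentName + ':' + token);
  let upperBound = 0;
  let lastName;
  for (const [name, size] of Object.entries(buckets)) {
    upperBound += size;
    lastName = name;
    if (x < upperBound) {
      return name;
    }
  }
  return lastName; // guards against floating-point shortfall in the sum
}

// 10% of sessions enrolled, as in the example above.
const samplingBuckets = { Control: 0.9, B: 0.1 };
const sessionToken = 'example-session-id'; // in the browser: mw.user.sessionId()
const isEnrolled = pickBucket('popups-sampling', samplingBuckets, sessionToken) === 'B';
console.log(isEnrolled ? 'session is in scope for event logging' : 'session is not sampled');
```

Because the token is pegged to the session, the same session always lands in the same bucket, so enrollment stays stable across page views.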

Do note that devices without sendBeacon support will not be eligible for sampling. This is by design because without sendBeacon there is a lower probability of correctly conveyed clickthroughs. This sendBeacon constraint at the moment means the data will be biased toward Firefox and Chrome users.
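The eligibility rule above amounts to a feature check before any sampling happens. A minimal sketch, assuming a hypothetical helper isEligibleForSampling() that receives a navigator-like object (in a browser this would be window.navigator):

```javascript
// Hypothetical helper: a device is eligible only if it exposes sendBeacon.
function isEligibleForSampling(nav) {
  return Boolean(nav) && typeof nav.sendBeacon === 'function';
}

// Plain objects stand in for browser navigators so the sketch runs anywhere.
const navWithBeacon = { sendBeacon: function (url, data) { return true; } };
const navWithoutBeacon = {};

console.log(isEligibleForSampling(navWithBeacon));    // true
console.log(isEligibleForSampling(navWithoutBeacon)); // false
```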

See also the discussion at phab:T136746#2648594 (including subsequent comments) on how the sampling works for logged-in users, interacting with the Hovercards Beta Features preference etc. Regards, Tbayer (WMF) (talk) 17:58, 22 September 2016 (UTC)


Sampling rates vary per A/B test. For the itwiki and ruwiki A/B tests, we have no sampling for logged-in users. For anonymous users, the sampling rate is set at 0.01 for ruwiki and 0.02 for itwiki.
- OVasileva (WMF) 09:37, 23 September 2016 (UTC)
The Popups instrumentation is enabled on all wikis that have the Popups extension loaded, i.e. sewiki and all wikis that have the Beta Features extension loaded. Per 340706, the sampling rate is currently 0.05% of all distinct browser sessions. Phuedx (WMF) (talk) 14:58, 2 March 2017 (UTC)
Correction: 340706 set the sampling rate to 5% of all distinct browser sessions. Phuedx (WMF) (talk) 10:09, 20 April 2017 (UTC)
Per 343675, the sampling rate is currently 0.1% of all distinct browser sessions.
Per 367398 and 366882, the sampling rate defaults to 0.1% of all distinct browser sessions, and is 1% of all distinct browser sessions on hu-, it-, and ruwiki (see task T171325).


Bucketing (August 2017-)

At the time of writing [August 25, 2017], no EventLogging is occurring for any of our projects.

Each anonymous user will be assigned one of the following buckets: on; control; or off. These buckets have the following behaviours:

  • If the user is in the on bucket, then Page Previews will be enabled by default and the EventLogging instrumentation will be enabled.
  • If the user is in the control bucket, then previews will be disabled by default and the instrumentation will be enabled.
  • If the user is in the off bucket, then previews will be disabled by default and the instrumentation will be disabled.

The bucket is calculated by passing the session-pegged pseudorandom mw.user.sessionId() value (an ephemeral cookie) and the buckets into mw.experiments.getBucket(). With X = $wgPopupsAnonsExperimentalGroupSize, the on and control buckets are each 0.5 * X in size, and the off bucket captures the rest of the population (i.e. 1 - X).

For example, if $wgPopupsAnonsExperimentalGroupSize is 0.1, then 5% of user sessions will be in the on bucket, 5% in the control bucket, and the remaining 90% in the off bucket.
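The bucket sizes and behaviours described above can be written out as a small sketch. The names computeBucketSizes() and BUCKET_BEHAVIOUR are hypothetical and chosen for illustration only; the extension's actual code may be organised differently.

```javascript
// Bucket sizes as described above: on and control each get half of the
// experimental group size X; off gets the remaining 1 - X.
function computeBucketSizes(groupSize) {
  if (!(groupSize >= 0 && groupSize <= 1)) {
    throw new RangeError('groupSize must be between 0 and 1');
  }
  return {
    on: 0.5 * groupSize,
    control: 0.5 * groupSize,
    off: 1 - groupSize
  };
}

// Behaviour of each bucket, taken from the list above.
const BUCKET_BEHAVIOUR = {
  on: { previewsEnabledByDefault: true, instrumentationEnabled: true },
  control: { previewsEnabledByDefault: false, instrumentationEnabled: true },
  off: { previewsEnabledByDefault: false, instrumentationEnabled: false }
};

// groupSize 0.1 stands for $wgPopupsAnonsExperimentalGroupSize = 0.1.
const exampleSizes = computeBucketSizes(0.1);
console.log(exampleSizes);
```

The resulting object has the shape that a bucketing function would take as its bucket definition; since on and control are the same size, the two instrumented groups are directly comparable.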

Do note that devices without Beacon API support will not be eligible for the EventLogging instrumentation. This is by design because without the Beacon API there is a lower probability of correctly conveyed clickthroughs. Currently, this constraint means the data will be biased toward Firefox and Chrome users.

enwiki and dewiki A/B test v2 (October 18th - November 15th)

See T176469 and T178500 for context and commentary.

On Wednesday, October 18th, we re-enabled the Page Previews enwiki and dewiki A/B test with the following bucket sizes:

Wiki     Bucket sizes (on:control:off)
enwiki   0.015:0.015:0.97
dewiki   0.04:0.04:0.92

These bucket sizes were chosen to achieve an average rate of circa 200 events/second over the course of two weeks. They're slightly tweaked versions of the bucket sizes derived in T172291#3500535.

The instrumentation for the test was deactivated on November 15 (T178500).

enwiki and dewiki A/B test v3 (December 12-)

The instrumentation was re-enabled on December 12 on enwiki and dewiki with the same parameters, to gather data for a newly added field about the time to the first user link interaction (phab:T180036).

State diagram

A state diagram illustrating the various EventLogging events sent by this schema (revision 15906495)

Blacklisted pages

Per task T170169, Page Previews (Popups) can be configured to not load on a page. We refer to these pages as "blacklisted pages". If a page is blacklisted, it follows that the Page Previews instrumentation won't execute and we won't collect data regardless of whether we've been doing so (and will continue to do so) for the rest of the user's session.
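The rule can be sketched as a guard evaluated on each page view. Everything here is illustrative: shouldRunInstrumentation(), the blacklist entry and the page titles are hypothetical, and the actual blacklist values are not reproduced.

```javascript
// Hypothetical guard: instrumentation runs only for enrolled sessions
// on pages that are not blacklisted.
function shouldRunInstrumentation(pageTitle, blacklist, sessionIsEnrolled) {
  if (blacklist.includes(pageTitle)) {
    return false; // Page Previews is not loaded here, so nothing is logged
  }
  return sessionIsEnrolled;
}

const exampleBlacklist = ['Special:ExamplePage']; // made-up entry
// The same enrolled session logs on one page but not on the blacklisted one.
console.log(shouldRunInstrumentation('Example article', exampleBlacklist, true));
console.log(shouldRunInstrumentation('Special:ExamplePage', exampleBlacklist, true));
```

Note that the guard is per page, not per session: leaving a blacklisted page restores logging for an enrolled session.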

369494 is the initial and current value of the blacklist. Phuedx (WMF) (talk) 09:27, 22 August 2017 (UTC)

Should ReferenceTooltips be logged in a separate schema?

@Thiemo Kreuz (WMDE): - I see the schema has been updated to include ReferenceTooltips. I wonder if it might make sense to use a new schema, given you'll likely want to log reference tooltip previews separately from page previews. I'd imagine reference tooltips will display less often than link previews, so if you are leaning heavily on the existing logging code, the sample size for reference tooltips is likely to be very small in comparison to link previews. Does it make sense to "fork" this schema and update the code so that it doesn't log link previews at all, or is data relating to link previews important for the questions you are trying to answer? Jdlrobson (talk) 18:37, 22 January 2019 (UTC)

Thanks for asking. These are all excellent questions we don't have an answer for, yet. I'm aware my edit does not have an effect until the code is updated to use the new revision ID. I just wanted to leave this here so we don't forget. I created phab:T214493 for us to keep track of this. --Thiemo Kreuz (WMDE) (talk) 17:18, 23 January 2019 (UTC)