
Schema talk:Edit

From Meta, a Wikimedia project coordination wiki
Maintainer: James Forrester
Purge: Auto-purge just eventCapsule PII after 90 days, keep the rest indefinitely
Contains user-input text: Yes

This schema has been renamed

The new schema is at Schema:EditAttemptStep.

Inconsistent behaviour

Currently, saveSuccess events are logged client-side for the visual editor and server-side for the wikitext editor. This has led to at least one inconsistency: for events where a new page is being created, wikitext events log the ID of the newly created page, but VE events do not. --Neil P. Quinn-WMF (talk) 02:35, 17 December 2016 (UTC)

Reference material

  • Sampling rate as of January 20, 2017: 6.25% for all editors
  • Sampling rate changes
    • March 2015: enabled logging for the wikitext editor
    • 2015-04-14, 23:52: switched wikitext editor sampling from 100% to 25%.
    • 2015-06-22, 15:24: switched wikitext editor sampling from 25% to 6.25%.
    • 2016-02-09, 4:00: switched mobile web and visual editor sampling from 100% to 6.25%.
      • after this change, all the past data in the Edit_13457736 table was downsampled to give the table a consistent 6.25% sampling rate.
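
When analysing this data, the sampling rates listed above matter in two ways: sampled counts have to be scaled up to estimate true totals, and older rows collected at a higher rate have to be thinned to match the current rate. The following is a minimal sketch of that arithmetic only; the function names are hypothetical and are not part of any logging code.

```python
from fractions import Fraction

# Current sampling rate from the list above (6.25% = 1/16).
CURRENT_RATE = Fraction(1, 16)


def estimate_total(sampled_count, rate=CURRENT_RATE):
    """Scale a sampled event count up to an estimate of the true count."""
    return sampled_count / rate


def keep_fraction(old_rate, new_rate):
    """Fraction of already-sampled rows to keep when downsampling."""
    if new_rate > old_rate:
        raise ValueError("cannot downsample to a higher rate")
    return new_rate / old_rate


# 500 sampled events at 6.25% correspond to roughly 8000 real events.
print(estimate_total(500))
# Going from 25% to 6.25% means keeping one row in four.
print(keep_fraction(Fraction(1, 4), Fraction(1, 16)))
```

Data collected at 100% would need one row in sixteen kept to reach the same 6.25% rate.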

Tables in use

The following is a list of schema versions that were deployed and the minimum and maximum event timestamps in each corresponding table:

Table name      Min timestamp         Max timestamp
Edit_5563071    2013-06-14 18:34:42   2013-06-18 00:09:18
Edit_5570274    2013-08-14 00:57:35   2015-05-06 07:17:31
Edit_10604157   2014-11-26 20:35:19   2015-03-12 12:36:40
Edit_10676603   2014-12-03 21:11:05   2015-04-30 16:41:14
Edit_11319708   2015-02-19 23:06:35   2015-05-06 15:35:33
Edit_11448630   2015-03-16 15:08:59   2016-09-20 11:57:22
Edit_13457736   2015-09-07 21:42:39   present (as of January 2017)
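
Because the time ranges of these tables overlap, a query for a given period may need to read more than one table. As a rough sketch (the `tables_covering` helper is hypothetical; the table names and timestamps are copied from the list above), the lookup could be done like this:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

# (table name, min timestamp, max timestamp) copied from the list above.
# None marks the table that was still receiving events in January 2017.
TABLES = [
    ("Edit_5563071", "2013-06-14 18:34:42", "2013-06-18 00:09:18"),
    ("Edit_5570274", "2013-08-14 00:57:35", "2015-05-06 07:17:31"),
    ("Edit_10604157", "2014-11-26 20:35:19", "2015-03-12 12:36:40"),
    ("Edit_10676603", "2014-12-03 21:11:05", "2015-04-30 16:41:14"),
    ("Edit_11319708", "2015-02-19 23:06:35", "2015-05-06 15:35:33"),
    ("Edit_11448630", "2015-03-16 15:08:59", "2016-09-20 11:57:22"),
    ("Edit_13457736", "2015-09-07 21:42:39", None),
]


def tables_covering(timestamp):
    """Return the names of all tables whose range contains the timestamp."""
    t = datetime.strptime(timestamp, FMT)
    names = []
    for name, lo, hi in TABLES:
        if t < datetime.strptime(lo, FMT):
            continue
        # An open-ended table matches any timestamp after its minimum.
        if hi is not None and t > datetime.strptime(hi, FMT):
            continue
        names.append(name)
    return names


print(tables_covering("2015-01-01 00:00:00"))
print(tables_covering("2013-07-01 00:00:00"))
```

The first call returns three overlapping tables; the second returns an empty list, since no listed table covers July 2013.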

Notes and outstanding questions (r5664341)

  • I added a UA field, as a result of all the issues we had with browser support. This will allow us to replace speculations with data about browser-specific behavior.
  • are we planning to log anon activity? If we do, we should note that we're not currently logging IP addresses, so we cannot directly measure unique IP addresses
  • are we logging mobile activity? If we do, we should flag it (unless it can be effectively filtered via the UA field). Answer: it's already being logged and can be filtered via webHost
  • we are still not logging (sampled) page impressions as we're not interested in measuring click-through/conversion rates
  • we are not planning to log view source events for pages that are not editable by the current user
  • we need to audit click logging, as we saw anomalies in previously collected data. Partly done: for one, we were not logging clicks on New section.
  • latency needs to be implemented for wikitext and its technical definition for wikitext needs to be documented in the field description
  • any discrepancy in the instrumentation between VE and wikitext for pageViewSessionId needs to be documented. Reviewed with Ori: no issue to be worried about here

Expanding to cover more edit-related events

This looks great! I have a few notes on wikitech: I read through the error cases in WikiPage and got overly exhaustive about mapping them into the schema. Also, I have mixed feelings about adding columns for additional relevant revision IDs, though I am sort of blocked by my ignorance: does it make analysis queries easier if you can get all the information you need out of a single record without having to correlate editBegin to editComplete, for example? Adamw (talk) 23:03, 11 June 2014 (UTC)
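
For context on the question above, this is roughly what correlating separate records looks like when the information is not in a single row. It is only an illustrative sketch: the `session`, `action` and `ts` keys and the sample records are hypothetical and are not taken from the schema.

```python
from collections import defaultdict

# Hypothetical event records; the key names are illustrative only.
events = [
    {"session": "a", "action": "editBegin", "ts": 1},
    {"session": "b", "action": "editBegin", "ts": 2},
    {"session": "a", "action": "editComplete", "ts": 5},
]


def pair_sessions(events):
    """Group events by session, mapping each action to its timestamp."""
    by_session = defaultdict(dict)
    for e in events:
        by_session[e["session"]][e["action"]] = e["ts"]
    return dict(by_session)


paired = pair_sessions(events)
# Sessions that have both a begin and a complete event.
completed = [
    s for s, acts in paired.items()
    if "editBegin" in acts and "editComplete" in acts
]
print(completed)
```

With everything in one record this grouping step disappears, at the cost of wider rows.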

Editing description

@Dario (WMF): I just learned from Dan Andreescu that the schema description is incorrect. All events are logged client side, except that init and saveSuccess events for the wikitext editor are logged on the server. If I edit the description here, will that somehow mess with the schema versioning? --Neil P. Quinn-WMF (talk) 21:30, 22 July 2015 (UTC)

@Neil P. Quinn-WMF: no, editing the schema won't affect data collection unless the specific (schema_id, rev_id) is deployed. --Dario (WMF) (talk) 17:01, 24 July 2015 (UTC)

Device field

As discussed, the device field is going to be misused. The code in https://gerrit.wikimedia.org/r/#/c/236244/15/resources/mobile.loggingSchemas/SchemaEdit.js already treats tablet edits as device = mobile. Desktop users will soon be able to use the Minerva skin too; do we really want to class them as editing from a mobile device? Please consider using browser width and skin instead. Jdlrobson (talk) 16:42, 9 September 2015 (UTC)

A proposal I wrote a while ago

See https://etherpad.wikimedia.org/p/schema_edit

This proposal suggests breaking this schema into:

  • EditingSession (one per page edit session)
  • EditingStage (one per editing stage)
  • EditingAbort (one per aborted edit)
  • EditingSaveFailure (one per save failure)
  • PageContentSaveComplete (note that this schema already exists)

--Halfak (WMF) (talk) 01:53, 12 January 2017 (UTC)