This schema has been renamed
The new schema is at Schema:EditAttemptStep.
Currently, saveSuccess events are logged client-side for the visual editor and server-side for the wikitext editor. This has led to at least one inconsistency: for events where a new page is being created, wikitext events log the ID of the newly created page, but VE events do not. --Neil P. Quinn-WMF (talk) 02:35, 17 December 2016 (UTC)
- Sampling rate as of January 20, 2017: 6.25% for all editors
- Sampling rate changes
- March 2015: enabled logging for the wikitext editor
- 2015-04-14, 23:52: switched wikitext editor sampling from 100% to 25%.
- 2015-06-22, 15:24: switched wikitext editor sampling from 25% to 6.25%.
- 2016-02-09, 4:00: switched mobile web and visual editor sampling from 100% to 6.25%.
- After this change, all the past data in the Edit_13457736 table was downsampled to give the table a consistent 6.25% sampling rate.
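The arithmetic behind these rate changes can be sketched as follows. This is only an illustration: the function names are hypothetical, and row-level random selection is an assumption, not a description of how the table was actually downsampled.

```python
import random

TARGET_RATE = 0.0625  # 6.25%, i.e. 1 event in 16


def keep_fraction(old_rate, new_rate=TARGET_RATE):
    """Fraction of already-sampled rows to keep so the data matches new_rate."""
    if not 0 < new_rate <= old_rate <= 1:
        raise ValueError("rates must satisfy 0 < new_rate <= old_rate <= 1")
    return new_rate / old_rate


def downsample(rows, old_rate, new_rate=TARGET_RATE, seed=0):
    """Randomly keep the fraction of rows implied by the two rates (assumed method)."""
    rng = random.Random(seed)
    fraction = keep_fraction(old_rate, new_rate)
    return [row for row in rows if rng.random() < fraction]


def estimate_total(sampled_count, rate=TARGET_RATE):
    """Scale a sampled event count up to an estimate of the true count."""
    return sampled_count / rate


print(keep_fraction(1.0))    # 0.0625: keep 1 in 16 of data logged at 100%
print(keep_fraction(0.25))   # 0.25: keep 1 in 4 of data logged at 25%
print(estimate_total(1000))  # 16000.0
```

In practice this means that any count taken from a table sampled at 6.25% has to be multiplied by 16 to estimate the real number of events.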
Tables in use
The following is a list of schema versions that were deployed and the minimum and maximum event timestamps in each corresponding table:
|Table name||Min timestamp||Max timestamp|
|Edit_5563071||2013-06-14 18:34:42||2013-06-18 00:09:18|
|Edit_5570274||2013-08-14 00:57:35||2015-05-06 07:17:31|
|Edit_10604157||2014-11-26 20:35:19||2015-03-12 12:36:40|
|Edit_10676603||2014-12-03 21:11:05||2015-04-30 16:41:14|
|Edit_11319708||2015-02-19 23:06:35||2015-05-06 15:35:33|
|Edit_11448630||2015-03-16 15:08:59||2016-09-20 11:57:22|
|Edit_13457736||2015-09-07 21:42:39||[present, currently Jan 2017]|
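Because several of these tables overlap in time, a query for a given period may need more than one of them. The sketch below uses the timestamps listed above to find the tables whose range covers a given moment. The helper name `tables_covering` is hypothetical, and treating the last table as open-ended is an assumption based on the "present" entry.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

# (table name, min timestamp, max timestamp); None means "present" in the list above
TABLES = [
    ("Edit_5563071", "2013-06-14 18:34:42", "2013-06-18 00:09:18"),
    ("Edit_5570274", "2013-08-14 00:57:35", "2015-05-06 07:17:31"),
    ("Edit_10604157", "2014-11-26 20:35:19", "2015-03-12 12:36:40"),
    ("Edit_10676603", "2014-12-03 21:11:05", "2015-04-30 16:41:14"),
    ("Edit_11319708", "2015-02-19 23:06:35", "2015-05-06 15:35:33"),
    ("Edit_11448630", "2015-03-16 15:08:59", "2016-09-20 11:57:22"),
    ("Edit_13457736", "2015-09-07 21:42:39", None),
]


def tables_covering(timestamp):
    """Return the names of all tables whose event range includes the timestamp."""
    when = datetime.strptime(timestamp, FMT)
    matches = []
    for name, start, end in TABLES:
        after_start = datetime.strptime(start, FMT) <= when
        before_end = end is None or when <= datetime.strptime(end, FMT)
        if after_start and before_end:
            matches.append(name)
    return matches


print(tables_covering("2015-03-01 00:00:00"))
# ['Edit_5570274', 'Edit_10604157', 'Edit_10676603', 'Edit_11319708']
```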
Notes and outstanding questions (r)
- I added a UA field, as a result of all the issues we had with browser support. This will allow us to replace speculations with data about browser-specific behavior.
- are we planning to log anon activity? If we do, we should note that we're not currently logging IP addresses, so we cannot directly measure unique IP addresses
- are we logging mobile activity? If we do, we should flag it (unless it can be effectively filtered via the UA field). It's already being logged and can be filtered via webHost.
- we are still not logging (sampled) page impressions as we're not interested in measuring click-through/conversion rates
- we are not planning to log view source events for pages that are not editable by the current user
- we need to audit click logging as we saw anomalies in previously collected data. Partly done: for one, we were not logging clicks on New section.
- latency needs to be implemented for wikitext and its technical definition for wikitext needs to be documented in the field description
- any discrepancy in the instrumentation between VE and wikitext for pageViewSessionId needs to be documented. Reviewed with Ori, no issue to be worried about here.
This looks great! I have a few notes on wikitech; I read through the error cases in WikiPage and got overly exhaustive about mapping them into the schema. Also, I have mixed feelings about adding columns for additional relevant revision IDs. I am sort of blocked by my ignorance: does it make analysis queries easier if you can get all the information you need out of a single record without having to correlate editBegin to editComplete, for example? Adamw (talk) 23:03, 11 June 2014 (UTC)
@Dario (WMF): I just learned from Dan Andreescu that the schema description is incorrect. All events are logged client side, except that init and saveSuccess events for the wikitext editor are logged on the server. If I edit the description here, will that somehow mess with the schema versioning?—Neil P. Quinn-WMF (talk) 21:30, 22 July 2015 (UTC)
- @Neil P. Quinn-WMF: no, editing the schema won't affect data collection unless the specific (schema_id, rev_id) is deployed. --Dario (WMF) (talk) 17:01, 24 July 2015 (UTC)
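The logging split described in this thread can be written down as a small lookup, which may help when deciding whether two events are comparable. This is only a sketch of the statement above: the editor labels `"wikitext"` and `"visualeditor"` and the function name are assumptions for illustration, not taken from the schema.

```python
# (editor, action) pairs that are logged on the server, per the comment above.
# Everything else is assumed to be logged client-side.
SERVER_SIDE = {
    ("wikitext", "init"),
    ("wikitext", "saveSuccess"),
}


def logging_side(editor, action):
    """Return 'server' or 'client' for a given editor and event action."""
    return "server" if (editor, action) in SERVER_SIDE else "client"


print(logging_side("wikitext", "saveSuccess"))      # server
print(logging_side("visualeditor", "saveSuccess"))  # client
```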
As discussed, the device field is going to be misused. The code in https://gerrit.wikimedia.org/r/#/c/236244/15/resources/mobile.loggingSchemas/SchemaEdit.js already treats tablet edits as device = mobile. Desktop users will soon be able to use the Minerva skin too - do we really want to class them as editing from a mobile device? Please consider using browser width and skin instead. Jdlrobson (talk) 16:42, 9 September 2015 (UTC)
A proposal I wrote a while ago
This proposal suggests breaking this schema into:
- EditingSession (one per page edit session)
- EditingStage (one per editing stage)
- EditingAbort (one per aborted edit)
- EditingSaveFailure (one per save failure)
- PageContentSaveComplete (note that this schema already exists)