Research talk:Metrics

State of this page

User:DarTar, User:Renklauf, User:Junkie.dolphin and I are trying to organize and categorize the metrics (present and hypothetical) for measuring wiki activities. We're having a lot of off-wiki conversations about potential structure while we hash things out, but feel free to expand where you'd like. :) --EpochFail (talk) 22:42, 24 July 2012 (UTC)

Notes

Some of these metrics are direct measurements of well-defined events in Wikipedia, while others are somewhat more complex (e.g. content persistence, qualitative assessment). It may be worth writing down, for each of these, how the measurement is carried out, so that it's clearer what the metric actually is. --Renklauf (talk) 01:39:57, 26 July 2012 (UTC)

It could be helpful to include some basic implementation details and data-sourcing dependencies. We could build a simple API with hooks for new metrics and perhaps have this live in Git - do we have an existing repository for E3? The end goal here is to have an easy, centralized, and well-documented way to generate these metrics as we need them, with a sufficient set of parameters. --Renklauf (talk) 01:39:57, 26 July 2012 (UTC)
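
To make the "API with hooks" idea concrete, here is a minimal Python sketch of what a registration-based metrics API could look like. Every name in it (METRICS, metric, compute) is hypothetical, not an existing E3 interface:

    from datetime import datetime

    # Hypothetical registry: new metrics plug in by registering under a name.
    METRICS = {}

    def metric(name):
        """Decorator that registers a metric function under a name (the hook)."""
        def register(fn):
            METRICS[name] = fn
            return fn
        return register

    @metric("edit_count")
    def edit_count(revisions, start, end):
        """Count a user's revisions falling inside [start, end)."""
        return sum(1 for r in revisions if start <= r["timestamp"] < end)

    def compute(name, revisions, **params):
        """Central entry point: look up a registered metric and run it."""
        return METRICS[name](revisions, **params)

    # One user's revision history, as plain dicts for illustration.
    revs = [{"timestamp": datetime(2012, 7, 1)}, {"timestamp": datetime(2012, 7, 20)}]
    print(compute("edit_count", revs,
                  start=datetime(2012, 7, 1), end=datetime(2012, 8, 1)))  # -> 2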

I brought this up in our meeting on the 24th, but it'd be great to start thinking about more sophisticated representations of metrics that leverage ML algorithms. There are several libraries out there that could do a lot of this out of the box for us (e.g. pybrain). Once Kraken is available we'll have a powerful system at our disposal, and it will be tempting to use it to crunch metrics that tie together much more information about editors than simple counts. I think we'll probably need these types of metrics if we ever want to have a hope of making useful high-level inferences about editors/readers (recommender systems) and articles (classification). --Renklauf (talk) 01:39:57, 26 July 2012 (UTC)
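
As a toy illustration of what such libraries buy us (using scikit-learn rather than pybrain, purely for brevity), here is a sketch that clusters editors on a few activity features. The features and data are invented for the example:

    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.preprocessing import StandardScaler

    # Rows are editors; columns are [edits/month, mean bytes added, share of ns0 edits].
    X = np.array([
        [120, 450.0, 0.90],
        [  3,  25.0, 0.10],
        [ 45, 300.0, 0.75],
        [  2,  15.0, 0.05],
    ])

    # Standardize the features, then cluster the editors into two groups.
    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(
        StandardScaler().fit_transform(X))
    print(labels)  # e.g. [0 1 0 1]: heavy article editors vs. low-activity accounts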

A small note on the metrics that are computed over a particular month: I think it makes more sense to use a fixed 30- or 31-day period. That is what I use, at least. 216.38.130.167 23:54, 7 December 2012 (UTC)
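
The difference is easy to see in a sketch: a fixed 30-day window gives every bucket the same length, whereas calendar months vary from 28 to 31 days. Illustrative Python only:

    from datetime import datetime, timedelta

    WINDOW = timedelta(days=30)

    def edits_in_window(timestamps, window_start):
        """Count edit timestamps in the 30 days starting at window_start."""
        return sum(1 for t in timestamps if window_start <= t < window_start + WINDOW)

    ts = [datetime(2012, 2, 5), datetime(2012, 2, 28), datetime(2012, 3, 1)]
    # The 30-day window starting 1 February runs through 2 March, so all
    # three edits count; the calendar month of February would count only two.
    print(edits_in_window(ts, datetime(2012, 2, 1)))  # -> 3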

Additional metrics

This is *awesome*, and I'm really excited to leverage this work for similar experiments being done in global development. There are two things I was wondering why you chose not to include:

  1. Size of contributions (e.g., bytes, characters): it is possible this is what you mean by "scale of change," but I would have anticipated it under "Volume of Contribution"
  2. Device of edits (e.g., mobile, tablet, PC): I could see some arguments for this being out of scope, but it also seems that knowing HOW new users are making their edits is very important. Whether or not this breakdown is changing over time would have a lot of implications for the types of editor engagement experiments we would want to run.

Again - awesome. Really great to have this moving forward. Jwild (talk) 20:32, 17 August 2012 (UTC)

Unlike the other metrics currently listed here, device of edit can't be tracked using the same databases. Otherwise, I agree that size of contributions should probably be added, even if it's a very imperfect measure of quality (reverts of page blankings are probably among the largest "additions" in terms of bytes, for instance). Steven Walling (WMF) • talk 21:50, 17 August 2012 (UTC)
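
For concreteness, here is one way size of contributions could be computed: as the signed byte delta against the parent revision, mirroring the rev_len and rev_parent_id columns of MediaWiki's revision table. This is a sketch over made-up data, and it also shows Steven's caveat in action: a revert of a page blanking registers as a large "addition":

    def byte_delta(rev, revisions_by_id):
        """Signed change in page size introduced by one revision."""
        parent = revisions_by_id.get(rev["rev_parent_id"])
        parent_len = parent["rev_len"] if parent else 0  # no parent: page creation
        return rev["rev_len"] - parent_len

    revs = {
        1: {"rev_id": 1, "rev_parent_id": 0, "rev_len": 5000},  # page created
        2: {"rev_id": 2, "rev_parent_id": 1, "rev_len": 0},     # page blanked
        3: {"rev_id": 3, "rev_parent_id": 2, "rev_len": 5000},  # revert of the blanking
    }
    print([byte_delta(r, revs) for r in revs.values()])  # -> [5000, -5000, 5000]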

Namespace notes

All of the edit-related categories are for the main namespace, except live user and anonymous editor. Thus:

  1. There is no category for someone who clicked Edit on a main-namespace article (similar to live user, but more specific).
    1. all metrics have been implemented as ns-agnostic, so we can tag users whether they clicked on an ns0 edit button or on any arbitrarily specified ns (a minimal sketch follows this list). The definitions in the category list are still to some extent arbitrary (see the disclaimer at the top of the section) and subject to change. In fact, wherever we used the phrase "live account" in analysis or reports, I think we consistently referred to ns0-only edit attempts. I've updated the page to reflect this.
  2. There is no category for someone who has only non-main-namespace edits.
    1. that's correct; I'm not sure whether we want to promote these users to a separate category (unless we deem it worth high-level reporting). There will be a necessary tradeoff between categories and more granular computed tags, and the most useful data will likely come from the latter.
  3. Anonymous editor is not main-namespace-specific, which is a bit inconsistent.
    1. you're right, fixed now.
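
To illustrate the ns-agnostic implementation described above, here is a minimal Python sketch in which the namespace filter is just a parameter. The field name mirrors MediaWiki's page.page_namespace column, but the function and data are hypothetical:

    def edit_count(revisions, namespaces=None):
        """Count revisions, optionally restricted to a set of namespaces."""
        if namespaces is None:
            return len(revisions)
        return sum(1 for r in revisions if r["page_namespace"] in namespaces)

    revs = [{"page_namespace": 0}, {"page_namespace": 3}, {"page_namespace": 0}]
    print(edit_count(revs))       # -> 3: ns-agnostic count
    print(edit_count(revs, {0}))  # -> 2: ns0-only, the "live account" definition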

I think the second is definitely potentially useful, and the first may be. Superm401 | Talk 21:20, 7 January 2013 (UTC)

Thanks for the great input, comments inline --DarTar (talk) 22:55, 7 January 2013 (UTC)
As long as the computed tags are flexible and granular, that should be sufficient. It should be relatively easy to promote a computed tag later, e.g. to say that _editor is now a Category X. However, people should be careful when the category and the computed tag sound the same but aren't (of which _editor/Editor is also an example). Superm401 | Talk 01:15, 8 January 2013 (UTC)

Definition of new registered editors

Progressing excellently! The "Deprecated user classes" section called out to me the gap in defining the middle phase of the conversion funnel: conversion of new registered users into experienced editors. Are you going to attempt any assessment of someone's progression on this front? I guess the question stems from wondering whether there is value in having standard language for the acquisition, conversion, and retention portions of the editor pipeline. Jwild (talk) 21:49, 8 January 2013 (UTC)