Research:Daily unique anonymous editors

From Meta, a Wikimedia project coordination wiki
Daily unique anonymous editors
Specification
A daily unique anonymous editor is an unregistered user who completed at least n edits on a given date via the same IP address.
WMF Standard
  • n = 1 edit
Status
completed
SQL
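-- Day to measure (UTC, YYYYMMDD) and the minimum number of edits per IP.
SET @date = "20140101";
SET @n = 1;

SELECT 
    COUNT(*) 
FROM (
    -- Total edits per IP address across existing and deleted pages.
    SELECT
        rev_user_text,
        SUM(revisions) AS revisions
    FROM (
        -- Anonymous edits (rev_user = 0) on pages that still exist.
        SELECT
            rev_user_text,
            COUNT(*) AS revisions
        FROM revision
        WHERE
            -- Half-open range: midnight of the next day is excluded.
            rev_timestamp >= @date AND
            rev_timestamp < DATE_FORMAT(DATE_ADD(@date, INTERVAL 1 DAY), "%Y%m%d%H%i%S") AND
            rev_user = 0 
        GROUP BY 1
        -- UNION ALL keeps both rows when an IP has the same count in each table.
        UNION ALL
        -- Anonymous edits on pages that have since been deleted.
        SELECT
            ar_user_text AS rev_user_text,
            COUNT(*) AS revisions
        FROM archive
        WHERE
            ar_timestamp >= @date AND
            ar_timestamp < DATE_FORMAT(DATE_ADD(@date, INTERVAL 1 DAY), "%Y%m%d%H%i%S") AND
            ar_user = 0 
        GROUP BY 1
    ) AS user_revisions
    GROUP BY 1
) AS editors
WHERE revisions >= @n;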

Daily unique anonymous editors is a standardized metric used to measure the number of logged-out editors who save edits to a wiki on a given day. It's used as a proxy for editing population size.
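
To make the definition concrete, here is a minimal Python sketch of the same count over an in-memory list of edits. The function name, the tuple layout, and the sample data are assumptions made for illustration; they are not part of the metric's specification, and the SQL above remains the reference query.

```python
from collections import Counter


def daily_unique_anonymous_editors(edits, date, n=1):
    """Count IP addresses with at least n anonymous edits on a UTC date.

    edits: iterable of (user_id, user_text, timestamp) tuples (hypothetical
           layout), where user_id == 0 marks an unregistered editor,
           user_text is the IP address, and timestamp is a 14-character
           "YYYYMMDDHHMMSS" string.
    date:  the day to measure, as a "YYYYMMDD" string.
    """
    per_ip = Counter(
        user_text
        for user_id, user_text, timestamp in edits
        if user_id == 0 and timestamp.startswith(date)
    )
    return sum(1 for count in per_ip.values() if count >= n)


# Illustrative data only (documentation IP ranges, made-up edits).
sample_edits = [
    (0, "192.0.2.1", "20140101000000"),
    (0, "192.0.2.1", "20140101235959"),
    (0, "198.51.100.7", "20140101120000"),
    (42, "RegisteredUser", "20140101130000"),  # registered, so ignored
    (0, "192.0.2.1", "20140102000000"),        # next day, so ignored
]

print(daily_unique_anonymous_editors(sample_edits, "20140101"))       # 2
print(daily_unique_anonymous_editors(sample_edits, "20140101", n=2))  # 1
```

Matching on the date prefix of the timestamp excludes midnight of the following day, so an edit saved at exactly 00:00:00 is attributed to one day only.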

Discussion

Using IP as an identifier

The current metric counts the distinct IP addresses seen within the specified period as a proxy for distinct anonymous editors. A unique IP address doesn't necessarily identify a unique user: IP rotation can spread one editor across several addresses, while shared addresses and proxies can merge several editors into one.

Time lag

As this is a daily metric, a full 24 hours must elapse after the beginning of the date (UTC) before a complete (uncensored) value can be calculated.
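
A small check along these lines could guard against computing the metric too early. This is only a sketch: the helper name `is_day_complete` and its `now` parameter are hypothetical, introduced here for illustration.

```python
from datetime import datetime, timedelta, timezone


def is_day_complete(date, now=None):
    """Return True once the full UTC day `date` ("YYYYMMDD") has elapsed."""
    day_start = datetime.strptime(date, "%Y%m%d").replace(tzinfo=timezone.utc)
    if now is None:
        now = datetime.now(timezone.utc)
    # The value is final only after the start of the following UTC day.
    return now >= day_start + timedelta(days=1)


# One second before the day ends, the count may still grow.
print(is_day_complete("20140101", datetime(2014, 1, 1, 23, 59, 59, tzinfo=timezone.utc)))  # False
# At midnight UTC of the next day, the day is complete.
print(is_day_complete("20140101", datetime(2014, 1, 2, 0, 0, 0, tzinfo=timezone.utc)))     # True
```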

Edits on deleted pages

This metric includes edits on existing pages as well as on pages that have been or will later be deleted. This makes the metric stateless: historical values will not change in the future depending on the status of a page (existing, deleted, or moved) at the time the metric is computed. Deletion-related activity is tracked via a separate set of metrics.
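
The stateless property can be illustrated with a short Python sketch. It assumes per-IP edit counts have already been read from the revision table (existing pages) and the archive table (deleted pages); the function name and the sample counts are hypothetical.

```python
from collections import Counter


def combined_edit_counts(revision_counts, archive_counts):
    """Add per-IP edit counts from existing pages and from deleted pages."""
    total = Counter(revision_counts)
    total.update(archive_counts)  # Counter.update adds counts per key
    return total


# Before any deletion: all three edits sit in the revision table.
before = combined_edit_counts({"192.0.2.1": 3}, {})

# After a page is deleted, two of those edits move to the archive table.
after = combined_edit_counts({"192.0.2.1": 1}, {"192.0.2.1": 2})

print(before == after)  # True
```

Because the two sources are summed rather than deduplicated, moving edits from one table to the other leaves each IP's total, and therefore the metric, unchanged.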

Analysis

Discussion

Notes