WSoR datasets/bot

From Meta, a Wikimedia project coordination wiki

The bot table was constructed from the union of four sources: the bot status page, the list of bots by number of edits, a scan of the user_groups table for the "bot" group, and accidental discovery (often because a bot shows up as an outlier in a plot).

The purpose of this table is to allow simple flagging (and often removal) of bot activities when performing analyses.
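
As a rough illustration of flagging and removal, the sketch below uses an in-memory SQLite database in place of db42. The revision table with rev_id and rev_user columns is assumed to follow the MediaWiki schema of the time, and all rows are invented for the example.

```python
import sqlite3

# In-memory SQLite stands in for the MySQL host; every row below is made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bot_20110711 (user_id INTEGER NOT NULL);
    CREATE TABLE revision (rev_id INTEGER PRIMARY KEY, rev_user INTEGER NOT NULL);
    INSERT INTO bot_20110711 VALUES (285), (6120);
    INSERT INTO revision VALUES (1, 285), (2, 42), (3, 6120), (4, 42), (5, 99);
""")

# Flagging: is_bot is 1 when rev_user appears in the bot table, else 0.
# EXISTS avoids duplicating revisions if the bot table ever repeats a user_id.
flagged = conn.execute("""
    SELECT r.rev_id, r.rev_user,
           EXISTS (SELECT 1 FROM bot_20110711 b
                   WHERE b.user_id = r.rev_user) AS is_bot
    FROM revision r
    ORDER BY r.rev_id
""").fetchall()

# Removal: an anti-join keeps only revisions with no matching bot row.
human_only = conn.execute("""
    SELECT r.rev_id
    FROM revision r
    LEFT JOIN bot_20110711 b ON b.user_id = r.rev_user
    WHERE b.user_id IS NULL
    ORDER BY r.rev_id
""").fetchall()
```

The flagging query uses EXISTS rather than a plain join because the user_id key on the bot table is not declared unique, so a join could repeat a revision if a bot were listed twice.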

Location

db42:halfak.bot_20110711

Fields

halfak@internproxy:~$ mysql -h db42 -e "EXPLAIN bot_20110711;SELECT * FROM bot_20110711 LIMIT 3" halfak
+---------+------------------+------+-----+---------+-------+
| Field   | Type             | Null | Key | Default | Extra |
+---------+------------------+------+-----+---------+-------+
| user_id | int(11) unsigned | NO   | MUL | 0       |       |
+---------+------------------+------+-----+---------+-------+
+---------+
| user_id |
+---------+
|       0 |
|     285 |
|    6120 |
+---------+

Each row represents a user. There should be a row in this table for every user_id used by a bot.

  • user_id: The identifier of a row. This is the bot's user identifier, taken from user.user_id.

Reproduction

  1. Gather usernames from the bot status page and the list of bots by number of edits.
    1. Join the usernames to the user table to acquire user_ids.
  2. Union with SELECT DISTINCT ug_user FROM user_groups WHERE ug_group = "bot";
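
The steps above might be sketched as follows, again with in-memory SQLite standing in for MySQL. The bot_names staging table and the output table name bot are hypothetical, and the user and user_groups rows are invented. Only the user_id, user_name, ug_user and ug_group columns are assumed from the MediaWiki schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user (user_id INTEGER PRIMARY KEY, user_name TEXT NOT NULL);
    CREATE TABLE user_groups (ug_user INTEGER NOT NULL, ug_group TEXT NOT NULL);
    -- Hypothetical staging table for the usernames gathered in step 1.
    CREATE TABLE bot_names (user_name TEXT NOT NULL);

    INSERT INTO user VALUES
        (1, 'ExampleBot'), (2, 'Alice'), (3, 'OtherBot'), (4, 'FlaggedBot');
    INSERT INTO bot_names VALUES ('ExampleBot'), ('OtherBot');
    INSERT INTO user_groups VALUES (3, 'bot'), (4, 'bot'), (2, 'sysop');
""")

# Step 1.1 joins the gathered usernames to user to obtain user_ids.
# Step 2 unions them with every member of the "bot" group.
# UNION (not UNION ALL) drops ids found by both routes.
conn.execute("""
    CREATE TABLE bot AS
    SELECT u.user_id
    FROM bot_names n
    JOIN user u ON u.user_name = n.user_name
    UNION
    SELECT DISTINCT ug_user FROM user_groups WHERE ug_group = 'bot'
""")

bot_ids = [row[0] for row in conn.execute("SELECT user_id FROM bot ORDER BY user_id")]
```

In this toy data, user 3 is found both by name and by group membership but appears once in the result. Bots found by accidental discovery are not covered by this sketch and would still need to be added by hand.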

Notes

This table was last updated on the 11th of July, 2011. New bots that were created since then will need to be added for analyses of data after that date.
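
One possible way to bring the list forward for later analyses is to copy the snapshot and append any user now in the "bot" group that the snapshot lacks. This is a minimal sketch on in-memory SQLite with invented rows, and the table name bot_refreshed is hypothetical. It only picks up bots that carry the group flag, so bots known only from the status page or edit-count list would still have to be gathered separately.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bot_20110711 (user_id INTEGER NOT NULL);
    CREATE TABLE user_groups (ug_user INTEGER NOT NULL, ug_group TEXT NOT NULL);
    INSERT INTO bot_20110711 VALUES (285), (6120);
    INSERT INTO user_groups VALUES
        (285, 'bot'), (7001, 'bot'), (7002, 'bot'), (50, 'sysop');
""")

conn.executescript("""
    -- Start from the existing snapshot (bot_refreshed is a hypothetical name).
    CREATE TABLE bot_refreshed AS
    SELECT DISTINCT user_id FROM bot_20110711;

    -- Append users in the bot group that the snapshot does not contain.
    INSERT INTO bot_refreshed (user_id)
    SELECT DISTINCT ug_user
    FROM user_groups
    WHERE ug_group = 'bot'
      AND ug_user NOT IN (SELECT user_id FROM bot_20110711);
""")

refreshed = [row[0] for row in
             conn.execute("SELECT user_id FROM bot_refreshed ORDER BY user_id")]
```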