WSoR datasets/user activity first msg
From Meta, a Wikimedia project coordination wiki
This builds off of Staeiou's user_first_msg dataset by adding details about editor activity before and after they received their first message.
Contents |
Location [edit]
db42:halfak.user_activity_first_msg
Fields [edit]
halfak@internproxy:~/data$ mysql -h db1047 -e "EXPLAIN user_activity_first_msg;SELECT * FROM user_activity_first_msg LIMIT 3" halfak +---------------------------------+------------------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +---------------------------------+------------------+------+-----+---------+-------+ | user_id | int(10) unsigned | YES | | NULL | | | user_name | varbinary(255) | YES | | NULL | | | namespace | int(11) | YES | | NULL | | | reverting_edits_before | decimal(23,0) | YES | | NULL | | | reverted_reverting_edits_before | decimal(23,0) | YES | | NULL | | | add_edits_before | decimal(23,0) | YES | | NULL | | | len_added_before | decimal(32,0) | YES | | NULL | | | reverted_add_edits_before | decimal(23,0) | YES | | NULL | | | reverted_len_added_before | decimal(32,0) | YES | | NULL | | | remove_edits_before | decimal(23,0) | YES | | NULL | | | len_removed_before | decimal(32,0) | YES | | NULL | | | reverted_remove_edits_before | decimal(23,0) | YES | | NULL | | | reverted_len_remove_before | decimal(32,0) | YES | | NULL | | | noop_edits_before | decimal(23,0) | YES | | NULL | | | reverted_noop_edits_before | decimal(23,0) | YES | | NULL | | | reverting_edits_after | decimal(23,0) | YES | | NULL | | | reverted_reverting_edits_after | decimal(23,0) | YES | | NULL | | | add_edits_after | decimal(23,0) | YES | | NULL | | | len_added_after | decimal(32,0) | YES | | NULL | | | reverted_add_edits_after | decimal(23,0) | YES | | NULL | | | reverted_len_added_after | decimal(32,0) | YES | | NULL | | | remove_edits_after | decimal(23,0) | YES | | NULL | | | len_removed_after | decimal(32,0) | YES | | NULL | | | reverted_remove_edits_after | decimal(23,0) | YES | | NULL | | | reverted_len_remove_after | decimal(32,0) | YES | | NULL | | | noop_edits_after | decimal(23,0) | YES | | NULL | | | reverted_noop_edits_after | decimal(23,0) | YES | | NULL | | +---------------------------------+------------------+------+-----+---------+-------+ +---------+-----------+-----------+------------------------+---------------------------------+------------------+------------------+---------------------------+---------------------------+---------------------+--------------------+------------------------------+----------------------------+-------------------+----------------------------+-----------------------+--------------------------------+-----------------+-----------------+--------------------------+--------------------------+--------------------+-------------------+-----------------------------+---------------------------+------------------+---------------------------+ | user_id | user_name | namespace | reverting_edits_before | reverted_reverting_edits_before | add_edits_before | len_added_before | reverted_add_edits_before | reverted_len_added_before | remove_edits_before | len_removed_before | reverted_remove_edits_before | reverted_len_remove_before | noop_edits_before | reverted_noop_edits_before | reverting_edits_after | reverted_reverting_edits_after | add_edits_after | len_added_after | reverted_add_edits_after | reverted_len_added_after | remove_edits_after | len_removed_after | reverted_remove_edits_after | reverted_len_remove_after | noop_edits_after | reverted_noop_edits_after | +---------+-----------+-----------+------------------------+---------------------------------+------------------+------------------+---------------------------+---------------------------+---------------------+--------------------+------------------------------+----------------------------+-------------------+----------------------------+-----------------------+--------------------------------+-----------------+-----------------+--------------------------+--------------------------+--------------------+-------------------+-----------------------------+---------------------------+------------------+---------------------------+ | 146308 | ! | 0 | 0 | 0 | 17 | 559 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 56 | 1099 | 0 | 0 | 1 | -35 | 0 | 0 | 3 | 0 | | 146308 | ! | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 10 | 0 | 0 | 0 | 1 | 59 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | | 146308 | ! | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 5 | 641 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | +---------+-----------+-----------+------------------------+---------------------------------+------------------+------------------+---------------------------+---------------------------+---------------------+--------------------+------------------------------+----------------------------+-------------------+----------------------------+-----------------------+--------------------------------+-----------------+-----------------+--------------------------+--------------------------+--------------------+-------------------+-----------------------------+---------------------------+------------------+---------------------------+
Each row represents a user who received a message posting on his/her talk page and the activity that user engaged by namespace.
user_id: The identifier of the useruser_name: The username of the user PRIMARY KEY PARTnamespace: The namespace in which activity is being reported. PRIMARY KEY PART- Activity before receiving the message
reverting_edits_before: The number of reverting editsreverted_reverting_edits_before: The number of reverting edits that were themselves reverted.add_edits_before: The number of edits that increase rev_lenlen_added_before: The sum total length added by add editsreverted_add_edits_before: The number of add edits that were reverted.reverted_len_added_before: The sum total length added by add edits that were revertedremove_edits_before: The number of edits that decrease rev_lenlen_removed_before: The sum total length removed by remove editsreverted_remove_edits_before: The number of remove edits that were revertedreverted_len_remove_before: The sum total length removed by remove edits that were revertednoop_edits_before: The number of edits that (net) make no change in rev_lenreverted_noop_edits_before: The number of noop edits that were reverted.
- Activity after receiving the message
reverting_edits_after: he number of reverting editsreverted_reverting_edits_after: The number of reverting edits that were themselves reverted.add_edits_after: The number of edits that increase rev_lenlen_added_after: The sum total length added by add editsreverted_add_edits_after: The number of add edits that were reverted.reverted_len_added_after: The sum total length added by add edits that were revertedremove_edits_after: The number of edits that decrease rev_lenlen_removed_after: The sum total length removed by remove editsreverted_remove_edits_after: The number of remove edits that were revertedreverted_len_remove_after: The sum total length removed by remove edits that were revertednoop_edits_after: The number of edits that (net) make no change in rev_lenreverted_noop_edits_after: The number of noop edits that were reverted.
Reproduction [edit]
Depends on:
To generate this dataset, run the following script:
CREATE TABLE halfak.user_activity_first_msg SELECT uf.user_id, uf.user_name, rlc.namespace, SUM( rlc.rev_timestamp < uf.msg_timestamp AND rvt.revision_id IS NOT NULL ) AS reverting_edits_before, SUM( rlc.rev_timestamp < uf.msg_timestamp AND rvt.revision_id IS NOT NULL AND rvtd.revision_id IS NOT NULL ) AS reverted_reverting_edits_before, SUM( rlc.rev_timestamp < uf.msg_timestamp AND rvt.revision_id IS NULL AND len_change > 0 ) AS add_edits_before, SUM(IF( rlc.rev_timestamp < uf.msg_timestamp AND rvt.revision_id IS NULL AND len_change > 0, len_change, 0 )) AS len_added_before, SUM( rlc.rev_timestamp < uf.msg_timestamp AND rvt.revision_id IS NULL AND rvtd.revision_id IS NOT NULL AND len_change > 0 ) AS reverted_add_edits_before, SUM(IF( rlc.rev_timestamp < uf.msg_timestamp AND rvt.revision_id IS NULL AND rvtd.revision_id IS NOT NULL AND len_change > 0, len_change, 0 )) AS reverted_len_added_before, SUM( rlc.rev_timestamp < uf.msg_timestamp AND rvt.revision_id IS NULL AND len_change < 0 ) AS remove_edits_before, SUM(IF( rlc.rev_timestamp < uf.msg_timestamp AND rvt.revision_id IS NULL AND len_change < 0, len_change, 0 )) AS len_removed_before, SUM( rlc.rev_timestamp < uf.msg_timestamp AND rvt.revision_id IS NULL AND rvtd.revision_id IS NOT NULL AND len_change < 0 ) AS reverted_remove_edits_before, SUM(IF( rlc.rev_timestamp < uf.msg_timestamp AND rvt.revision_id IS NULL AND rvtd.revision_id IS NOT NULL AND len_change < 0, len_change, 0 )) AS reverted_len_remove_before, SUM( rlc.rev_timestamp < uf.msg_timestamp AND rvt.revision_id IS NULL AND len_change = 0 ) AS noop_edits_before, SUM( rlc.rev_timestamp < uf.msg_timestamp AND rvt.revision_id IS NULL AND rvtd.revision_id IS NOT NULL AND len_change = 0 ) AS reverted_noop_edits_before, SUM( rlc.rev_timestamp > uf.msg_timestamp AND rvt.revision_id IS NOT NULL ) AS reverting_edits_after, SUM( rlc.rev_timestamp > uf.msg_timestamp AND rvt.revision_id IS NOT NULL AND rvtd.revision_id IS NOT NULL ) AS reverted_reverting_edits_after, SUM( rlc.rev_timestamp > uf.msg_timestamp AND rvt.revision_id IS NULL AND len_change > 0 ) AS add_edits_after, SUM(IF( rlc.rev_timestamp > uf.msg_timestamp AND rvt.revision_id IS NULL AND len_change > 0, len_change, 0 )) AS len_added_after, SUM( rlc.rev_timestamp > uf.msg_timestamp AND rvt.revision_id IS NULL AND rvtd.revision_id IS NOT NULL AND len_change > 0 ) AS reverted_add_edits_after, SUM(IF( rlc.rev_timestamp > uf.msg_timestamp AND rvt.revision_id IS NULL AND rvtd.revision_id IS NOT NULL AND len_change > 0, len_change, 0 )) AS reverted_len_added_after, SUM( rlc.rev_timestamp > uf.msg_timestamp AND rvt.revision_id IS NULL AND len_change < 0 ) AS remove_edits_after, SUM(IF( rlc.rev_timestamp > uf.msg_timestamp AND rvt.revision_id IS NULL AND len_change < 0, len_change, 0 )) AS len_removed_after, SUM( rlc.rev_timestamp < uf.msg_timestamp AND rvt.revision_id IS NULL AND rvtd.revision_id IS NOT NULL AND len_change < 0 ) AS reverted_remove_edits_after, SUM(IF( rlc.rev_timestamp > uf.msg_timestamp AND rvt.revision_id IS NULL AND rvtd.revision_id IS NOT NULL AND len_change < 0, len_change, 0 )) AS reverted_len_remove_after, SUM( rlc.rev_timestamp > uf.msg_timestamp AND rvt.revision_id IS NULL AND len_change = 0 ) AS noop_edits_after, SUM( rlc.rev_timestamp > uf.msg_timestamp AND rvt.revision_id IS NULL AND rvtd.revision_id IS NOT NULL AND len_change = 0 ) AS reverted_noop_edits_after FROM halfak.user_first_msg uf INNER JOIN halfak.rev_len_changed rlc ON uf.user_name = rlc.user_text LEFT JOIN halfak.revert_20110115 rvt ON rvt.revision_id = rlc.rev_id LEFT JOIN halfak.reverted_20110115 rvtd ON rvtd.revision_id = rlc.rev_id GROUP BY uf.user_name, rlc.namespace;
Notes [edit]
By adding up reverting_edits, add_edits, remove_edits and noop_edits, you'll get the total edits performed by a user.
By subtracting reverted_add_edits from add_edits you'll get the number of edits that were not reverted.