Research:Sockpuppet detection in Wikimedia projects/Formative research

Tracked in Phabricator: Task T171251
Contact: Srijan Kumar, Jure Leskovec
Duration: 2017-September – 2020-
This page documents a completed research project.


Introduction

Sockpuppetry is the use of more than one account on a social platform. The reasons for sockpuppetry may be benign, such as using one account for work and one for personal use, or malicious, such as pushing one's point of view. On English Wikipedia specifically, the benign and malicious uses are defined here. Sockpuppets are frequently used in a large-scale, coordinated way to create undesirable articles or to push a particular point of view into existing articles. A couple of recent cases, for example the OrangeMoody and Morning277 cases, show the sophistication involved in their operation.

In this project, we aim to design strategies to identify potential sockpuppet accounts on English Wikipedia using machine learning algorithms. The goal is to develop high-precision detection models trained on previously identified sockpuppets, which are publicly listed in this category.

The expected outcome of this project is a set of open algorithmic methods, and a report on their performance and limitations, which could later be integrated into tools to support community efforts to identify and flag sockpuppet accounts.

Short literature review

Research on the use of sockpuppets on Wikipedia has been performed using the publicly available data from sockpuppet investigations and blocked accounts. One of the first works on Wikipedia sockpuppetry was by Solorio et al.[1], who used the comments made by sockpuppets on talk pages. Extensive stylistic analysis was performed on this data, achieving an F1 score of 0.72. Solorio et al.[2] also released a dataset of sockpuppet comments on talk pages. As editors can make edits both on Wikipedia articles and on talk pages, combined analysis is needed. Tsikerdekis et al.[3] performed an aggregate analysis of 7,500 sockpuppet accounts from Wikipedia and contrasted their behavior with that of 7,500 legitimate accounts, using only non-textual features. However, aggregate analysis does not reveal insights into account-level behavior of sockpuppets, for example, how sockpuppets are actually used. Yamak et al.[4] studied 5,000 sockpuppet accounts out of 120,000 Wikipedia sockpuppet accounts by creating Wikipedia-specific features, including the number of edits, reversions, and the time between registration and edits. Linguistic analysis, which has been shown to be very indicative of sockpuppetry[5], was not performed. Zheng et al.[6] performed sockpuppet analysis by splitting the activity of a single user into that of two different accounts and trying to tie them back together; this assumes that sockpuppets write similarly across their multiple accounts.

Data

Data used for the purpose of this project includes, but is not limited to, edit and user data. Edit information includes edits on public as well as deleted pages, and edits by active and deleted (e.g. banned) users. Information such as the revision text, timestamp, user ID, username, IP address, and user agent may be used. Additional data sources and fields may be added over the course of this research and will be reported on this page. All nonpublic data (e.g. IP addresses) will be handled in accordance with the Wikimedia Foundation's privacy policy, data retention guidelines, and formal collaboration policy.

Lists of declared multiple accounts on Wikipedia include: en:Category:Alternative_Wikipedia_accounts and en:Category:Wikipedians_with_alternative_accounts.

Methodology

Qualitative data

We'll collect qualitative data from editors with CheckUser access engaging in sockpuppet detection, to identify strategies and workflows they follow, as well as potential sources of signal and feature candidates for our modeling efforts.

Modeling

Here we briefly describe the proposed machine learning algorithm. We use a deep learning model that aims to learn a low-dimensional representation (called an embedding) of each account, based on its edit history, such that sockpuppet accounts of the same person have similar embeddings. The edit history includes information related to the sequence of edits that the account makes, such as the page edited, the exact text added and removed, the time of the edit, the users who edited the page previously, the relation between two consecutively edited pages (e.g., the hyperlink distance between the two), and so on. Instead of creating hand-crafted features to encode these, we will create a recurrent neural network (RNN) model that generates the embeddings, with an optimization function that reduces error during training. The training data includes positive examples, i.e., pairs of accounts that belong to the same user, and negative examples, i.e., pairs of random accounts or pairs of sockpuppet accounts of different users. After successful training, pairs from the positive training examples will be embedded closer together in the embedding space than pairs from the negative training examples.
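As a rough illustration of this pairwise training setup, the sketch below (in PyTorch) encodes each account's per-edit feature sequence with an LSTM and uses a cosine embedding loss so that same-user pairs end up closer than random or different-user pairs. The feature dimensions, layer sizes, and choice of loss are illustrative assumptions, not the project's actual implementation.

```python
# Minimal sketch of the pairwise training idea described above (hypothetical
# feature sizes and names; not the project's actual code). An RNN encodes an
# account's edit-history feature sequence into an embedding, and a margin loss
# pushes embeddings of same-user account pairs together and other pairs apart.
import torch
import torch.nn as nn

class AccountEncoder(nn.Module):
    def __init__(self, edit_feat_dim=150, embed_dim=50):
        super().__init__()
        self.rnn = nn.LSTM(edit_feat_dim, embed_dim, batch_first=True)

    def forward(self, edit_seq):              # edit_seq: (batch, n_edits, edit_feat_dim)
        _, (h, _) = self.rnn(edit_seq)        # final hidden state summarizes the history
        return h[-1]                          # (batch, embed_dim) account embedding

encoder = AccountEncoder()
loss_fn = nn.CosineEmbeddingLoss(margin=0.5)  # y=+1 pulls pairs together, y=-1 pushes apart
optimizer = torch.optim.Adam(encoder.parameters(), lr=1e-3)

# Toy batch: 4 account pairs, 10 edits each, 150 features per edit.
a = torch.randn(4, 10, 150)
b = torch.randn(4, 10, 150)
y = torch.tensor([1.0, 1.0, -1.0, -1.0])      # same-user pairs vs. random/other-user pairs

optimizer.zero_grad()
loss = loss_fn(encoder(a), encoder(b), y)
loss.backward()
optimizer.step()
```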

There are two major steps that need to be done in sequence to identify sockpuppets:

  1. Step 1: the algorithm needs to identify whether an account is malicious or not. This is done primarily to reduce the search space in step 2 (i.e., the pairs of accounts to compare).
  2. Step 2: among the accounts identified as malicious in step 1, find all sets of accounts that are likely to be sockpuppets, i.e., operated by the same editor. This is done by finding pairs and then merging the identified pairs into bigger sets, as sketched below.
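One simple way to implement the pair-merging in step 2 is a union-find over the predicted same-operator pairs. The sketch below shows only this merging step, under the assumption that the set of predicted pairs is already available; it is an illustration, not the project's actual grouping procedure.

```python
# Merge predicted same-operator account pairs into larger groups with union-find.
def group_pairs(pairs):
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path compression
            x = parent[x]
        return x

    def union(a, b):
        parent[find(a)] = find(b)

    for a, b in pairs:
        union(a, b)

    groups = {}
    for account in parent:
        groups.setdefault(find(account), set()).add(account)
    return list(groups.values())

# Example: two detected pairs sharing an account collapse into one set.
print(group_pairs([("A", "B"), ("B", "C"), ("D", "E")]))
# e.g. [{'A', 'B', 'C'}, {'D', 'E'}] (ordering within sets may vary)
```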

Model 1: Predicting whether a user will be blocked

Dataset description: To conduct step 1, we obtain a publicly available list of all accounts that have been blocked (for any reason) from the Wikimedia API. This gives a total of around 1.03 million blocked accounts, out of a total of 43 million accounts. Using the Wikimedia edit history dump from August 2017, we extract all the edits made by these blocked accounts. To train a supervised machine learning model, we require a dataset with both positive and negative training examples. Therefore, we randomly sample an equal number of non-blocked user accounts via a matching process, such that for each blocked editor there is a non-blocked editor with a similar total lifetime (i.e., the number of seconds from first to last edit) and total number of edits. We also extract edit histories for each of these matched non-blocked editors. For all editors in these sets, we only consider information from their first 10 edits, as our goal is to identify potentially malicious editors as early as possible. This way, we have a balanced dataset of malicious and non-malicious accounts with similar properties.
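The sketch below illustrates one plausible way to perform such matching: greedily pair each blocked editor with the closest unmatched non-blocked editor in (log lifetime, log edit count) space. The field names, distance measure, and greedy strategy are assumptions for illustration; the actual matching tolerances are not specified above.

```python
# Greedy nearest-neighbour matching of blocked editors to non-blocked controls
# on (log-lifetime, log-edit-count); field names are hypothetical.
import math

def match_controls(blocked, candidates):
    """blocked, candidates: lists of dicts with 'user', 'lifetime_s', 'n_edits'."""
    def key(e):
        return (math.log1p(e["lifetime_s"]), math.log1p(e["n_edits"]))

    unused = list(candidates)
    matches = {}
    for b in blocked:
        best = min(unused, key=lambda c: math.dist(key(b), key(c)))
        matches[b["user"]] = best["user"]
        unused.remove(best)                  # each control is used at most once
    return matches

blocked = [{"user": "Blocked1", "lifetime_s": 86400, "n_edits": 12}]
candidates = [{"user": "CtrlA", "lifetime_s": 90000, "n_edits": 10},
              {"user": "CtrlB", "lifetime_s": 5, "n_edits": 2}]
print(match_controls(blocked, candidates))   # {'Blocked1': 'CtrlA'}
```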

The next step is to build a classifier that predicts whether an account should be blocked, given its edit history. Our model uses public data from editor and article revision histories, and leverages not only information from the text of the edit, but also the sequence of edits that are made and the relations between pages edited by the same user account.

Method description:

To extract a meaningful relation between the Wikipedia pages edited by any given editor in our set, we use the Wikipedia hyperlink network [7]. In this network, every Wikipedia article page is represented as a node, and edges between pages represent the hyperlinks connecting them. We create an N-dimensional representation of each page in this network via a standard embedding approach, such that the cosine similarity between the N-dimensional representations of two pages represents how close they are in the hyperlink network. Therefore, two pages that are very close have similar representations, while those that are far apart have lower cosine similarity. We set N to 50 and train the model using PyTorch. This gives us a fixed-length vector representation of each Wikipedia page.
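The following is a minimal sketch of how such page embeddings could be trained in PyTorch with a skip-gram-with-negative-sampling style objective over the hyperlink edge list: linked pages are pushed toward high cosine similarity and random page pairs toward low. This is an illustrative setup; the exact embedding method and hyperparameters are not specified above beyond N = 50.

```python
# Illustrative page-embedding training over a toy hyperlink edge list.
import torch
import torch.nn as nn
import torch.nn.functional as F

n_pages, dim = 1000, 50
emb = nn.Embedding(n_pages, dim)
opt = torch.optim.Adam(emb.parameters(), lr=1e-2)

# Toy edge list (page_id -> page_id hyperlinks); the real graph comes from [7].
edges = torch.randint(0, n_pages, (512, 2))

for step in range(100):
    src, dst = edges[:, 0], edges[:, 1]
    neg = torch.randint(0, n_pages, (len(edges),))        # random negative pages
    pos_sim = F.cosine_similarity(emb(src), emb(dst))
    neg_sim = F.cosine_similarity(emb(src), emb(neg))
    # Logistic loss: linked pairs should score high, random pairs low.
    loss = -(F.logsigmoid(pos_sim) + F.logsigmoid(-neg_sim)).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

page_vectors = emb.weight.detach()          # one 50-d vector per page
```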

Next, to leverage the text of the edits an editor makes, we find the words that the editor adds and removes in each edit. First, for each edit, we extract the words added and removed from the 08/2017 edit dump. Then, we use pre-trained GloVe vector representations of these words (size = 50)[8] and generate two vectors for each edit: the average vector representation of the words that were added, and another for the words that were removed. These GloVe embeddings are trained on Wikipedia article data, which has the advantage of also providing representations for Wikipedia-specific words. This gives us a 100-dimensional representation of the text of each edit.
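A minimal sketch of this step is shown below, assuming GloVe vectors are available in the standard whitespace-separated text format; the file name and example word lists are placeholders for illustration.

```python
# Build the 100-d edit text representation: average GloVe vector of added words
# concatenated with the average vector of removed words.
import numpy as np

def load_glove(path="glove.6B.50d.txt", dim=50):
    vectors = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            vectors[parts[0]] = np.asarray(parts[1:], dtype=np.float32)
    return vectors, dim

def avg_vector(words, vectors, dim):
    vecs = [vectors[w] for w in words if w in vectors]
    return np.mean(vecs, axis=0) if vecs else np.zeros(dim, dtype=np.float32)

def edit_text_features(added_words, removed_words, vectors, dim=50):
    # 50-d average of added words + 50-d average of removed words = 100-d.
    return np.concatenate([avg_vector(added_words, vectors, dim),
                           avg_vector(removed_words, vectors, dim)])

# vectors, dim = load_glove()
# feats = edit_text_features(["added", "sentence"], ["removed", "words"], vectors)
# feats.shape  -> (100,)
```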

Finally, we consider basic statistics of each editor's contribution history, such as the total number of edits, the mean, maximum, median, and distribution of the time difference between two consecutive edits, the number of unique pages edited, the maximum number of edits made on a single article, and so on.
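A small sketch of computing such per-editor statistics from a list of (timestamp, page) edit records follows; the record format is an assumption for illustration.

```python
# Per-editor contribution statistics from (unix_timestamp, page_id) records.
import numpy as np
from collections import Counter

def editor_stats(edits):
    """edits: list of (unix_timestamp, page_id), sorted by timestamp."""
    times = np.array([t for t, _ in edits], dtype=np.float64)
    pages = [p for _, p in edits]
    gaps = np.diff(times) if len(times) > 1 else np.array([0.0])
    per_page = Counter(pages)
    return {
        "n_edits": len(edits),
        "gap_mean": gaps.mean(),
        "gap_median": np.median(gaps),
        "gap_max": gaps.max(),
        "n_unique_pages": len(per_page),
        "max_edits_on_one_page": max(per_page.values()),
    }

print(editor_stats([(0, "A"), (60, "A"), (3600, "B")]))
```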

Experimental results:

In all our experiments, we run a standard 10-fold cross-validation approach and measure model performance using the accuracy metric. Note that as the dataset is balanced, random selection would give an accuracy of 0.50 in this prediction task (higher is better).

We create a baseline feature-based model (model A) that takes all of the features described above (page embedding, text embedding, and statistics) to create a feature vector, averages it across all edits that an editor makes, and feeds it into a random forest classifier trained on the balanced dataset. This model achieves only slightly better performance than random, with an accuracy of 0.58.
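The sketch below reproduces the shape of this baseline with scikit-learn: per-edit feature vectors are averaged per editor and a random forest is evaluated with 10-fold cross-validation. The data is random and the feature dimension is a placeholder, so it only illustrates the pipeline, not the reported 0.58 accuracy.

```python
# Feature-averaging baseline (model A) with 10-fold cross-validation on toy data.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_editors, n_edits, feat_dim = 200, 10, 156   # hypothetical dimensions
per_edit_features = rng.normal(size=(n_editors, n_edits, feat_dim))
labels = rng.integers(0, 2, size=n_editors)   # 1 = blocked, 0 = matched control

X = per_edit_features.mean(axis=1)            # average over each editor's edits
clf = RandomForestClassifier(n_estimators=200, random_state=0)
scores = cross_val_score(clf, X, labels, cv=10, scoring="accuracy")
print(scores.mean())
```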

This model does not perform well because averaging over all edit information loses individual, informative features as well as information about the timing of the edits.

We then create an LSTM model (model B), which takes the sequence of edit actions into account. This LSTM model takes the ordered sequence of per-edit feature vectors as input and outputs the probability that the account is malicious. This model performs slightly better, with an accuracy of 0.62.
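A minimal PyTorch sketch of this sequence model is shown below; the hidden size and feature dimension are illustrative assumptions rather than the project's actual configuration.

```python
# Model B sketch: an LSTM over the ordered per-edit feature vectors outputs
# the probability that the account will be blocked.
import torch
import torch.nn as nn

class EditSequenceClassifier(nn.Module):
    def __init__(self, edit_feat_dim=156, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(edit_feat_dim, hidden, batch_first=True)
        self.out = nn.Linear(hidden, 1)

    def forward(self, edit_seq):                # (batch, n_edits, edit_feat_dim)
        _, (h, _) = self.lstm(edit_seq)
        return torch.sigmoid(self.out(h[-1])).squeeze(-1)   # P(blocked)

model = EditSequenceClassifier()
probs = model(torch.randn(8, 10, 156))          # 8 editors, first 10 edits each
print(probs.shape)                              # torch.Size([8])
```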

Finally, we create a novel deep learning model (model C), where we train multiple identical LSTM models in parallel and combine their outputs using a function that depends on the inter-edit time distribution. The function maps the inter-edit time distribution to a learned weight distribution, which multiplies the LSTM outputs and aggregates them together. This way, the different LSTMs learn the properties of editors that have similar inter-edit time distributions [9], the intuition being that malicious users have similar inter-edit time distributions. Even with 2 LSTMs, this model achieves an accuracy of 0.72. Because of this performance, we will use the outputs of this model for identifying sockpuppets.
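The sketch below illustrates one way to realize this idea: parallel LSTMs whose outputs are combined with weights produced from a histogram of inter-edit times. The histogram gating, layer sizes, and head design are assumptions for illustration, not the project's actual architecture.

```python
# Model C sketch: parallel LSTMs combined with weights learned from the
# inter-edit time distribution (here a fixed-size histogram fed to a softmax).
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiLSTMClassifier(nn.Module):
    def __init__(self, edit_feat_dim=156, hidden=64, n_lstms=2, n_time_bins=8):
        super().__init__()
        self.lstms = nn.ModuleList(
            [nn.LSTM(edit_feat_dim, hidden, batch_first=True) for _ in range(n_lstms)])
        self.heads = nn.ModuleList([nn.Linear(hidden, 1) for _ in range(n_lstms)])
        self.gate = nn.Linear(n_time_bins, n_lstms)   # time histogram -> LSTM weights

    def forward(self, edit_seq, time_hist):
        # edit_seq: (batch, n_edits, feat); time_hist: (batch, n_time_bins)
        weights = F.softmax(self.gate(time_hist), dim=-1)          # (batch, n_lstms)
        outs = []
        for lstm, head in zip(self.lstms, self.heads):
            _, (h, _) = lstm(edit_seq)
            outs.append(head(h[-1]))                               # (batch, 1)
        outs = torch.cat(outs, dim=-1)                             # (batch, n_lstms)
        return torch.sigmoid((weights * outs).sum(dim=-1))         # P(blocked)

model = MultiLSTMClassifier()
p = model(torch.randn(8, 10, 156), torch.rand(8, 8))
print(p.shape)    # torch.Size([8])
```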

Model 2: Predicting sockpuppets

We will create efficient deep learning models, following our novel multi-LSTM model, for predicting sockpuppets. This will be done in the coming months.

Results and evaluation

For Model 1 (predicting whether a user will be blocked), the results are reported above.

Feedback on the scope and design of this research project is welcome on its talk page.

References

  1. Solorio, Thamar; Hasan, Ragib; Mizan, Mainul (2013). A Case Study of Sockpuppet Detection in Wikipedia. Workshop on Language Analysis in Social Media (LASM) at NAACL HLT. pp. 59–68. 
  2. Solorio, Thamar; Hasan, Ragib; Mizan, Mainul (2013-10-24). "Sockpuppet Detection in Wikipedia: A Corpus of Real-World Deceptive Writing for Linking Identities". arXiv:1310.6772 [cs]. 
  3. Tsikerdekis, M.; Zeadally, S. (August 2014). "Multiple Account Identity Deception Detection in Social Media Using Nonverbal Behavior". IEEE Transactions on Information Forensics and Security 9 (8): 1311–1321. ISSN 1556-6013. doi:10.1109/tifs.2014.2332820. 
  4. Yamak, Zaher; Saunier, Julien; Vercouter, Laurent (2016). "Detection of Multiple Identity Manipulation in Collaborative Projects". Proceedings of the 25th International Conference Companion on World Wide Web. WWW '16 Companion (Republic and Canton of Geneva, Switzerland: International World Wide Web Conferences Steering Committee): 955–960. ISBN 9781450341448. doi:10.1145/2872518.2890586. 
  5. Kumar, Srijan; Cheng, Justin; Leskovec, Jure; Subrahmanian, V.S. (2017). "An Army of Me: Sockpuppets in Online Discussion Communities". Proceedings of the 26th International Conference on World Wide Web. WWW '17 (Republic and Canton of Geneva, Switzerland: International World Wide Web Conferences Steering Committee): 857–866. ISBN 9781450349130. doi:10.1145/3038912.3052677. 
  6. Zheng, Xueling; Lai, Yiu Ming; Chow, K. P.; Hui, Lucas C.K.; Yiu, S. M. (2011). Detection of sockpuppets in online discussion forums. Dissertation, University of Hong Kong.
  7. https://snap.stanford.edu/data/enwiki-2013.html
  8. Pennington, Jeffrey, Richard Socher, and Christopher Manning. "Glove: Global vectors for word representation." In Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP), pp. 1532-1543. 2014.
  9. Halfaker, Aaron; Keyes, Oliver; Kluver, Daniel; Thebault-Spieker, Jacob; Nguyen, Tien; Shores, Kenneth; Uduwage, Anuradha; Warncke-Wang, Morten (2015-05-18). "User Session Identification Based on Strong Regularities in Inter-activity Time". International World Wide Web Conferences Steering Committee. pp. 410–418. ISBN 9781450334693. doi:10.1145/2736277.2741117.