Research:Modeling undisclosed paid editors

Created

16:36, 22 May 2020 (UTC)

Contact

Francesca Spezzano

Boise State

Collaborators

Nikesh Joshi

Boise State

TonyBallioni

Wikimedia

ST47

Wikimedia

DocJames

Wikimedia

Duration: 2020-05 – ??

Wikimedia supported

DC

Contact: Halfak

Research:Projects

This page documents a proposed research project.
Information may be incomplete and may change before the project starts.

We seek to build an automated system for detecting sock-puppet accounts related to undisclosed paid editing (UPE) on Wikipedia. In past work, we used metadata signals to track UPE activity^[1]. In this study, we will extend that work with linguistic features computed from the deleted content associated with UPE activity.

Methods

We plan to compute a set of features describing how these accounts behave on Wikipedia. Examples include the average size of edits, the average time between edits, and the percentage of edits per Wikipedia namespace (e.g., Talk or User pages). We will also consider linguistic features extracted from the content of editors' contributions, measuring, for instance, the sentiment of the text or the use of pronouns, punctuation, and specific keywords. To compute these features accurately, we need access to the whole edit history of the accounts under consideration, so deleted edits must be included in the computation.
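As a rough illustration of the kinds of features listed above, the following minimal sketch computes three behavioral features and two simple linguistic ones for a single account. It is not the project's actual pipeline: the `Edit` record, its field names, the `PRONOUNS` list, and both function names are assumptions made for this example, and sentiment and keyword features are left out because they would depend on an external lexicon or model.

```python
import re
import string
from collections import Counter
from dataclasses import dataclass
from datetime import datetime
from statistics import mean


@dataclass
class Edit:
    """Hypothetical record for one edit (field names are assumptions)."""
    timestamp: datetime
    namespace: str    # e.g. "Main", "Talk", "User"
    size_delta: int   # bytes added (+) or removed (-)
    added_text: str   # text introduced by the edit, may be empty


# Small illustrative pronoun list; a real study would use a fuller lexicon.
PRONOUNS = {"i", "we", "you", "he", "she", "they", "it",
            "me", "us", "him", "her", "them"}


def behavioral_features(edits):
    """Average edit size, average gap between edits, and namespace shares."""
    edits = sorted(edits, key=lambda e: e.timestamp)
    sizes = [abs(e.size_delta) for e in edits]
    gaps = [(b.timestamp - a.timestamp).total_seconds()
            for a, b in zip(edits, edits[1:])]
    ns_counts = Counter(e.namespace for e in edits)
    return {
        "avg_edit_size": mean(sizes) if sizes else 0.0,
        "avg_seconds_between_edits": mean(gaps) if gaps else 0.0,
        "namespace_pct": {ns: 100.0 * n / len(edits)
                          for ns, n in ns_counts.items()},
    }


def linguistic_features(edits):
    """Pronoun rate per token and punctuation rate per character."""
    text = " ".join(e.added_text for e in edits)
    tokens = re.findall(r"[A-Za-z']+", text.lower())
    n_pronouns = sum(t in PRONOUNS for t in tokens)
    n_punct = sum(c in string.punctuation for c in text)
    return {
        "pronoun_rate": n_pronouns / len(tokens) if tokens else 0.0,
        "punctuation_rate": n_punct / len(text) if text else 0.0,
    }
```

Calling both functions on the full list of an account's edits, deleted edits included, yields one feature dictionary per account, which could then be flattened into a vector for a classifier.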

Timeline

Please provide in this section a short timeline with the main milestones and deliverables (if any) for this project.

Policy, Ethics and Human Subjects Research

It's very important that researchers do not disrupt Wikipedians' work. Please add to this section any consideration relevant to ethical implications of your project or references to Wikimedia policies, if applicable. If your study has been approved by an ethical committee or an institutional review board (IRB), please quote the corresponding reference and date of approval.

Results

Once your study completes, describe the results and their implications here. Don't forget to set status=complete above when you are done.

References

[1] Joshi, N., Spezzano, F., Green, M., & Hill, E. (2020, April). Detecting Undisclosed Paid Editing in Wikipedia. In Proceedings of The Web Conference 2020 (pp. 2899-2905). https://dl.acm.org/doi/abs/10.1145/3366423.3380055