Research:Disinformation Literature Review

From Meta, a Wikimedia project coordination wiki
This page documents a completed research project.

The aim of this study is to identify key areas of research that can help fight disinformation on Wikipedia. To that end, we perform a literature review that addresses three main questions:

  • What is disinformation?
  • What are the most popular mechanisms to spread online disinformation?
  • Which are the mechanisms that are currently being used to fight against disinformation?

For each of these three questions we first take a general approach, considering studies from different areas such as journalism and communications, sociology, philosophy, and information and political sciences. We then compare those studies with the current situation in the Wikipedia ecosystem.

What is disinformation?


We found that disinformation can be defined as non-accidentally misleading information that is likely to create false beliefs. While the exact definition of misinformation varies across authors, they tend to agree that disinformation differs from other types of misinformation because it requires the intention of deceiving the receiver. A more actionable way to scope disinformation is to define it as a problem of information quality. On Wikipedia, the quality of information is mainly governed by the policies of neutral point of view and verifiability.

A taxonomy of (mis)information, based on Zhou and Zafarani's work
Type | Authenticity | Intention
Disinformation | False | Bad
Misinformation | False | Unknown
Mal-Information | True | Bad
Fake News | False | Bad
Satire News | False | Not Bad
Imposter Content | False | Unknown
Fabricated Content | False | Bad
Manipulated Content | Unknown | Bad
Rumor | Unknown | Unknown
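
The taxonomy above can be read as a lookup keyed on two attributes. The following is a minimal sketch, assuming the table is encoded verbatim; the names `Label`, `TAXONOMY`, and `candidate_types` are hypothetical and are not part of the reviewed literature. It shows that authenticity and intention alone do not single out one type: several types share the same pair of values.

```python
from typing import NamedTuple


class Label(NamedTuple):
    authenticity: str  # "False", "True", or "Unknown"
    intention: str     # "Bad", "Not Bad", or "Unknown"


# Rows copied from the taxonomy table; the encoding itself is an assumption.
TAXONOMY = {
    "Disinformation": Label("False", "Bad"),
    "Misinformation": Label("False", "Unknown"),
    "Mal-Information": Label("True", "Bad"),
    "Fake News": Label("False", "Bad"),
    "Satire News": Label("False", "Not Bad"),
    "Imposter Content": Label("False", "Unknown"),
    "Fabricated Content": Label("False", "Bad"),
    "Manipulated Content": Label("Unknown", "Bad"),
    "Rumor": Label("Unknown", "Unknown"),
}


def candidate_types(authenticity: str, intention: str) -> list[str]:
    """Return every type in the table whose two attributes match."""
    wanted = Label(authenticity, intention)
    return [name for name, label in TAXONOMY.items() if label == wanted]


if __name__ == "__main__":
    # False content with bad intention matches three types in the table.
    print(candidate_types("False", "Bad"))
    # True content with bad intention matches only Mal-Information.
    print(candidate_types("True", "Bad"))
```

Because several rows share the same attribute pair, telling those types apart requires information beyond these two columns.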

The mechanisms used to spread online disinformation include the coordinated action of online brigades, the use of bots, and other techniques to create fake content. Underresourced topics and communities are especially vulnerable to such attacks. The use of sock-puppets is one of the most important problems for Wikipedia.

Social attacks classified by the type of weakness they exploit
Weakness exploited | Description | Example
Social System | When reputation systems are hacked to introduce disinformation. | Using bots or sock-puppets to over-represent an opinion or to corroborate false information.
Lack of Information | When the lack of information is used to introduce disinformation. | Spreading disinformation during ongoing events such as natural disasters, or manipulating search engine results on topics without enough information.

Summary of the most popular mechanisms to spread online disinformation
Mechanism | Description | Type | Vulnerability of Wikipedia
Bots | Software used to automate the spread of messages, creating the impression that many people hold a specific opinion or share an interest in a topic. | Technical | Low
Sock-puppets | Multiple online identities used for purposes of deception. | Social | High
Web Brigades | A set of users coordinated to introduce fake content by exploiting the weaknesses of communities and systems. | Social | High
Click farms | Large groups of low-paid workers hired to perform micro-tasks that deceive online systems. | Social | Medium
Deepfake | An AI technique for human image synthesis that can be used to create fake videos of celebrities or notable people. | Technical | Low
Data Voids | Exploiting missing data to manipulate search results. | Social | Medium
Circular reporting | A situation where a piece of information appears to come from multiple independent sources, but in reality comes from only one source. | Social | High
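
The definition of circular reporting above can be illustrated with a small citation graph. This is a minimal sketch, assuming a hypothetical `cites` mapping from each source to the sources it cites; the function names and the example sources are invented for illustration and do not describe any existing Wikimedia tool. Each apparent source is traced upstream to the sources that cite nothing further, and a claim is flagged when several apparent sources lead back to at most one origin.

```python
def trace_origins(source, cites):
    """Follow citations upstream and return the sources that cite nothing."""
    origins, stack, seen = set(), [source], set()
    while stack:
        node = stack.pop()
        if node in seen:
            continue  # already visited; also guards against citation loops
        seen.add(node)
        upstream = cites.get(node, [])
        if not upstream:
            origins.add(node)  # no further citations: treat as an origin
        stack.extend(upstream)
    return origins


def looks_circular(apparent_sources, cites):
    """Flag a claim whose several apparent sources share at most one origin."""
    all_origins = set()
    for source in apparent_sources:
        all_origins |= trace_origins(source, cites)
    return len(apparent_sources) > 1 and len(all_origins) <= 1


if __name__ == "__main__":
    # Hypothetical citation graph: three outlets all lead back to one blog.
    cites = {
        "news_a": ["blog_x"],
        "news_b": ["news_a"],
        "news_c": ["blog_x"],
        "report_d": [],
    }
    print(looks_circular(["news_a", "news_b", "news_c"], cites))
    print(looks_circular(["news_a", "report_d"], cites))
```

A pure citation loop, where two sources cite only each other, yields no origin at all and is also flagged. In practice the citation graph is rarely available in this clean form, so the sketch only shows the shape of the check.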

Which are the mechanisms that are currently being used to fight against disinformation?


The techniques used to fight disinformation on the internet include manual fact-checking done by agencies and communities, as well as automatic techniques to assess the quality and credibility of a given piece of information. Machine learning approaches can be fully automatic or can be used as tools by human fact-checkers. Wikipedia, and especially Wikidata, play a double role here: automatic methods use them as ground truth to determine the credibility of a piece of information, and for that same reason they are the target of many attacks. Currently, the main defense of Wikimedia projects against fake news is the work done by community members, especially patrollers, who use mixed techniques to detect and control disinformation campaigns on Wikipedia.



We conclude that, in order to keep Wikipedia as free as possible from disinformation, it is necessary to help patrollers detect disinformation early and assess the credibility of external sources. More research is needed to develop tools that use state-of-the-art machine learning techniques to detect potentially dangerous content, empowering patrollers to deal with attacks that are becoming more complex and sophisticated.

Full Document


Full Paper on arXiv

See Also