Grants:Programs/Wikimedia Research Fund/Multimodal Mis/Dis-information Detection Using NLP/ML for Wikimedia

status: not funded
Multimodal Mis/Dis-information Detection Using NLP/ML for Wikimedia
start and end dates: July 2022 - June 2023
budget (USD): 40,000-50,000 USD
applicant(s): Tirthankar Ghosal, Cornelia Caragea



Applicant's Wikimedia username. If one is not provided, then the applicant's name will be provided for community review.

Tirthankar Ghosal, Cornelia Caragea

Project title

Multimodal Mis/Dis-information Detection Using NLP/ML for Wikimedia

Entity Receiving Funds

Provide the name of the individual or organization that would receive the funds.

Tirthankar Ghosal

Research proposal


Description of the proposed project, including aims and approach. Be sure to clearly state the problem, why it is important, why previous approaches (if any) have been insufficient, and your methods to address it.

In this project, we propose to use NLP/ML to investigate one particular category of misinformation: one that is straightforward to generate, easy to believe, probably the hardest to detect, and quite prevalent in present-day news and social media. In this category, real multimedia content is shared with coherent synthetic text that places it in a different context, producing fake information. For example, an image or video of a riot in one place is spread in another turbulent area with fake text content to incite the audience. Without background knowledge of the multimedia content, misinformation of this category is very challenging to detect.

We plan to conduct a randomized controlled trial with university students (literate population) and everyday citizens (general population) in India/US. The image-caption pairs presented to the participants will consist of actual misinformation instances (news, social media posts) and synthetic instances in which the multimedia content is real but the textual narrative is fake. The aim is to see how informedness and non-informedness (prior exposure to the content, or the lack of it) affect the ability of different populations to spot such misinformation. The trial will also serve to build a dataset of instances in which a change of textual context portrays a given piece of multimedia content in a different light, a device frequently used to propagate misinformation.

Further, we will develop a pipelined neural reverse image retrieval architecture to retrieve the text associated with a multimedia item. Once the multimedia content is retrieved, we will use a siamese network to determine whether the candidate image and the retrieved ones are the same. If they are, the candidate image/video has appeared before. Next, we investigate whether the associated texts are semantically equivalent. If so, they are consistent and convey true information; if not, the candidate text is inconsistent with the earlier narrative, which may signify manipulated text content. To determine semantic equivalence, we will use a hybrid approach that considers both the entities or keywords shared by the two texts and their semantic similarity at the sentence-embedding level. We will use our recent textual novelty detection algorithms for inconsistency/anomaly detection between the candidate text and the text associated with the retrieved image. To the best of our knowledge, no existing system attempts to address this particular category of misinformation.
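
To make the decision logic above concrete, here is a minimal Python sketch under stated assumptions. It is not the proposed system. The function names, thresholds, and the weight `alpha` are hypothetical; cosine similarity over precomputed image vectors stands in for the siamese network; Jaccard overlap of content words stands in for entity/keyword matching; and cosine similarity over term counts stands in for sentence-embedding similarity. Reverse image retrieval is assumed to have already produced a list of (image vector, text) pairs.

```python
import math
import re
from collections import Counter

# Small illustrative stopword list (an assumption, not part of the proposal).
STOPWORDS = {"a", "an", "the", "in", "on", "at", "of", "and", "to",
             "is", "are", "was", "were", "by", "for", "with"}


def tokens(text):
    """Lowercase alphanumeric tokens with stopwords removed."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower())
            if t not in STOPWORDS]


def cosine(u, v):
    """Cosine similarity of two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def keyword_overlap(a, b):
    """Jaccard overlap of content words (stand-in for entity/keyword matching)."""
    set_a, set_b = set(tokens(a)), set(tokens(b))
    if not set_a or not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def text_similarity(a, b):
    """Cosine over term counts (stand-in for sentence-embedding similarity)."""
    count_a, count_b = Counter(tokens(a)), Counter(tokens(b))
    vocab = sorted(set(count_a) | set(count_b))
    return cosine([count_a[w] for w in vocab], [count_b[w] for w in vocab])


def check_pair(cand_vec, cand_text, retrieved,
               img_threshold=0.9, text_threshold=0.5, alpha=0.5):
    """Classify a candidate (image vector, text) against retrieved pairs.

    Returns "unseen" if no retrieved image matches the candidate,
    "consistent" if some matching image has equivalent text under the
    hybrid score, and "inconsistent" otherwise.
    """
    # Step 1: keep only retrieved items whose image matches the candidate.
    matches = [text for vec, text in retrieved
               if cosine(cand_vec, vec) >= img_threshold]
    if not matches:
        return "unseen"
    # Step 2: hybrid score = weighted keyword overlap + text similarity.
    best = max(alpha * keyword_overlap(cand_text, text)
               + (1 - alpha) * text_similarity(cand_text, text)
               for text in matches)
    return "consistent" if best >= text_threshold else "inconsistent"
```

Calling `check_pair` with a candidate whose image vector matches a retrieved item but whose caption shares no content words with the retrieved text returns `"inconsistent"`, which corresponds to the flagged case described above. In a real system the thresholds would have to be tuned on annotated data rather than fixed by hand.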


Approximate amount requested in USD.


Budget Description

Briefly describe what you expect to spend money on (specific budgets and details are not necessary at this time).

Tentatively the associated costs would be:
  • 1. Two research interns (from India) - 14,000 USD
  • 2. P.I. remuneration - 24,000 USD (1,000 USD per month for each of the two P.I.s, for two hours of daily time)
  • 3. Amazon AWS/Google Colab costs - 6,000 USD
  • 4. Annotator costs - 7,000 USD
  • 5. Survey costs - 5,000 USD

Personnel and annotators would mostly be hired from India.


Address the impact and relevance to the Wikimedia projects, including the degree to which the research will address the 2030 Wikimedia Strategic Direction and/or support the work of Wikimedia user groups, affiliates, and developer communities. If your work relates to knowledge gaps, please directly relate it to the knowledge gaps taxonomy.

Identifying mis/dis-information forms one of the strategic directions of Wikimedia for 2030. As a trustworthy source of information, Wikimedia must employ “early detection and prevention” strategies to counter mis/dis-information. NLP/ML methods are an obvious choice for mining the wealth of text and multimodal data to automatically flag documents (and particular sections) that may contain misinformation. We do not envisage a foolproof system, but one that provides a first layer of pruning/flagging (misinformation detection in NLP is hard because of its subtle nuances). Much Wiki data is multimodal, so developing multimodal models focused on the particular category of misinformation stated above would be helpful for the Wiki ecosystem.


Plans for dissemination.

All research carried out and data built as part of this work would be open-sourced for the community via public GitHub projects. We would document the progress (timelines, milestones, minutes, etc.) in a public Wiki project. We would publish the research output at top-tier open-access AI/NLP/IR conferences, including AAAI, IJCAI, ACL, NAACL, SIGIR, EMNLP, CIKM, and WWW. We would also showcase our research at Wikimedia events.

Past Contributions

Prior contributions to related academic and/or research projects and/or the Wikimedia and free culture communities. If you do not have prior experience, please explain your planned contributions.

Please see the author profiles for a full list of publications. Some of the most relevant ones:
  • 1. Kumari, R., Ashok, N., Ghosal, T., & Ekbal, A. (2022). What the fake? Probing misinformation detection standing on the shoulder of novelty and emotion. Information Processing & Management, 59(1), 102740.
  • 2. Kumari, R., Ashok, N., Ghosal, T., & Ekbal, A. (2021). Misinformation detection using multitask learning with mutual learning for novelty detection and emotion recognition. Information Processing & Management, 58(5), 102631.
  • 3. Kumari, R., Ashok, N., Ghosal, T., & Ekbal, A. (2021, July). A Multitask Learning Approach for Fake News Detection: Novelty, Emotion, and Sentiment Lend a Helping Hand. In 2021 International Joint Conference on Neural Networks (IJCNN) (pp. 1-8). IEEE.

I agree to license the information I entered in this form, excluding the pronouns, countries of residence, and email addresses, under the terms of Creative Commons Attribution-ShareAlike 4.0. I understand that the decision to fund this Research Fund application, and the application itself along with all the information entered by me in this form, excluding the pronouns, countries of residence, and email addresses of the personnel, will be published on Wikimedia Foundation Funds pages on Meta-Wiki and will be made available to the public in perpetuity. To make the results of your research actionable and reusable by the Wikimedia volunteer communities, affiliates and Foundation, I agree that any output of my research will comply with the WMF Open Access Policy. I also confirm that I have read the privacy statement and agree to abide by the WMF Friendly Space Policy and Universal Code of Conduct.