- Animesh Mukherjee
Affiliation or grant type
- IIT Kharagpur
- Measuring cultural biases across multi-lingual biographic articles
Description of the proposed project, including aims and approach. Be sure to clearly state the problem, why it is important, why previous approaches (if any) have been insufficient, and your methods to address it.
Starting with the goal of becoming the primary source of encyclopedic knowledge, Wikipedia and its sister projects have contributed massive content to the pool of free knowledge: almost 56 million articles in 305 languages, of which more than 160 are actively edited by thousands of Wikipedians. Wikimedia’s 2030 strategic direction aims for knowledge equity along several dimensions, broadly in content and in communities that have been left out by structures of power and privilege. An important objective toward that goal is to identify the knowledge gaps created by socio-cultural biases in the framing of Wikipedia biographies across languages. By biographies, we refer to Wikipedia articles on well-known personalities (living or dead). In this project, we plan to study whether geo-socio-cultural biases exist in the portrayal of biographies across language versions of Wikipedia and, if they persist, how they change over time.
In this proposal, we plan to investigate the impact of cultural biases in editorial practices as they manifest in biographies across different languages and at different points in time. With the help of AI and NLP techniques, we aim to answer the following research questions:
1. To what extent is user-generated content contributed by community editors influenced by demographic, social, and cultural biases, which may result in a differential portrayal of a biography in one language compared to its aggregate average representation across languages?
2. Do these biases change over time and could they potentially reflect cultural changes in community editing practices?
3. How do these biases compare for resource-rich versus resource-poor languages?
We hypothesize that such geo-cultural biases, even in the days of the open internet, oppose knowledge equity and promote the inequitable distribution of resources and opportunities, which contradicts Wikimedia’s 2030 strategic direction. Our research will help the community identify these knowledge gaps through a combination of quantitative and qualitative studies measuring cultural biases across languages and over time.
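As a rough illustration of what "differential portrayal compared to the aggregate average representation" in research question 1 could mean quantitatively, the sketch below computes how far each language version of one biography lies from the cross-language mean. It is only a minimal sketch under stated assumptions: the function name `deviation_from_aggregate`, the feature vectors, and the choice of Euclidean distance are all hypothetical and are not the project's confirmed method. The numbers are made up for illustration and are not results.

```python
from math import sqrt
from statistics import mean


def deviation_from_aggregate(features_by_lang):
    """Return each language's Euclidean distance from the mean feature vector.

    features_by_lang maps a language code to a list of numeric features
    describing one biography in that language. What the features are
    (e.g. tone scores, topic shares) is an assumption left to the caller.
    """
    langs = list(features_by_lang)
    dims = len(features_by_lang[langs[0]])
    # Aggregate average representation: per-dimension mean over all languages.
    centroid = [mean(features_by_lang[lang][d] for lang in langs) for d in range(dims)]
    # Deviation of each language version from that aggregate.
    return {
        lang: sqrt(sum((features_by_lang[lang][d] - centroid[d]) ** 2 for d in range(dims)))
        for lang in langs
    }


# Hypothetical, made-up feature vectors for a single biography.
example = {
    "en": [0.2, 0.5],
    "bn": [0.4, 0.5],
    "hi": [0.6, 0.5],
}
scores = deviation_from_aggregate(example)
for lang, score in scores.items():
    print(lang, round(score, 3))
```

Repeating the same computation on article revisions from different dates would give one possible view of how such deviations change over time, which relates to research question 2.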
Approximate amount requested in USD.
- 40,000-50,000 USD
Briefly describe what you expect to spend money on (specific budgets and details are not necessary at this time).
Salary or stipend & benefits: 25,000 USD
Travel: 5,000 USD
Software as a service: 5,000 USD
Contingency: 5,000 USD
Institutional overhead: 8,000 USD
Total: 48,000 USD
Address the impact and relevance to the Wikimedia projects, including the degree to which the research will address the 2030 Wikimedia Strategic Direction and/or support the work of Wikimedia user groups, affiliates, and developer communities. If your work relates to knowledge gaps, please directly relate it to the knowledge gaps taxonomy.
Our research proposal directly addresses a key issue identified by the 2030 Wikimedia strategic direction: the knowledge gap in contributors and content. For contributors, the project aims to measure (a) whether they are influenced by cultural biases and (b) if so, to what extent. For content, the project will deal directly with the socio-cultural biases that manifest in Wikipedia biography articles across languages, both resource-rich and low-resourced. Our research will enable systematic surveys in multilingual settings, which can build community awareness and inform editor training so that editors contribute neutrally toward the goal of bias-free but culturally enriched knowledge.
Plans for dissemination.
Upon successful completion of the project, we will submit our findings for publication at top conference venues such as AAAI and SIGCHI, and in top-tier journals such as IEEE TKDE, ACM TKDD, and ACM TWEB. Beyond publishing our research, we plan to design dashboards for individual wikis that will remind community editors of the biases they might introduce while drafting biographical articles.
Prior contributions to related academic and/or research projects and/or the Wikimedia and free culture communities. If you do not have prior experience, please explain your planned contributions.
Wikipedia, and specifically the quality assessment of English Wikipedia, has been my research topic for the last few years.
The following is a list of my related publications:
Paramita Das, et al.: Quality Change: Norm or Exception? Measurement, Analysis and Detection of Quality Change in Wikipedia. Proc. ACM Hum. Comput. Interact. 6(CSCW1).
Paramita Das, et al.: When Expertise Gone Missing: Uncovering the Loss of Prolific Contributors in Wikipedia. ICADL 2021.
Bhanu Prakash Reddy Guda, et al.: NwQM: A neural quality assessment framework for Wikipedia. EMNLP (1) 2020.
Soumya Sarkar, et al.: StRE: Self Attentive Edit Quality Prediction in Wikipedia. ACL (1) 2019.
I agree to license the information I entered in this form, excluding the pronouns, countries of residence, and email addresses, under the terms of Creative Commons Attribution-ShareAlike 4.0. I understand that the decision to fund this Research Fund application, the application itself, and all the information entered by me in this form, excluding the pronouns, countries of residence, and email addresses of the personnel, will be published on Wikimedia Foundation Funds pages on Meta-Wiki and will be made available to the public in perpetuity. To make the results of your research actionable and reusable by the Wikimedia volunteer communities, affiliates and Foundation, I agree that any output of my research will comply with the WMF Open Access Policy. I also confirm that I have read the privacy statement and agree to abide by the WMF Friendly Space Policy and Universal Code of Conduct.