
Research:Understanding Web Content Monetization in the Ho and Santali Languages

From Meta, a Wikimedia project coordination wiki
13:52, 19 June 2021 (UTC)
Duration:  2021-June – 2022-March
Ho, Santali, Adivasi, Indigenous, India, Media

This page is an incomplete draft of a research project.
Information is incomplete and is likely to change substantially before the project starts.

The broader scholarship in anthropology has provided some context on how the socioeconomic and linguistic hierarchies of the Indian subcontinent have created access and participation barriers for many marginalized communities, especially the indigenous Adivasi peoples, who have further been discriminated against by neighboring dominant-language speakers. In theory, Distributed Ledger Technology (DLT) is often touted as promoting open and distributed user sovereignty, wider access to information and fair content monetization. But the steep growth of Web3, the upcoming iteration of the World Wide Web and the foundational layer over which DLT is placed, also pushes us to ask a range of critical questions: "what kind of foundational infrastructures exist for the indigenous web content ecosystems?", "what do we imagine as the future of web monetization for indigenous languages based on our own study?", and most importantly, "what are the potential repercussions of any such tech rollout on the lives of the indigenous people from a labor, environmental and linguistic standpoint?". Based on observational research through surveys and interviews, we are studying such foundational issue areas in two Adivasi (indigenous) languages from India, Ho and Santali. Through a self-assessment framework and a set of recommendations, we detail here the critical considerations that are required before active deployment of DLT-based transactions for web content in indigenous languages. The impacts on indigenous rights, labor, and the environment are some of the critical areas we have studied, and they help form these considerations. 
We also caution against the profit-driven implementation of Web3 and argue in this paper that the "decentralization" effect of Web3 might not translate well into user sovereignty but might further existing forms of domination and oppression.



To flag the present and real issue around decision-making in design processes, the Design Justice Network Principles underline, "the people who are most adversely affected by design decisions — about visual culture, new technologies, the planning of our communities, or the structure of our political and economic systems — tend to have the least influence on those decisions and how they are made" [1][2]. To put this in the context of technological development in the internet domain, design decisions are often made in isolation and primarily benefit the decision-maker (DM). As internet technology as a sector is heavily driven by private industry and the state, decisions tend to discriminate to a great degree against gender and ethnic minority individuals and people with disabilities, among others. Prominent AI researcher Timnit Gebru elucidates, "the dominance of those who are the most powerful race/ethnicity in their location (e.g. White in the US, ethnic Han in China, etc.), combined with the concentration of power in a few locations around the world, has resulted in a technology that can benefit humanity but also has been shown to (intentionally or unintentionally) systematically discriminate against those who are already marginalized" [3]. Gebru's work and other existing literature collectively suggest that the aggregation of power on the side of the DMs, who are more privileged than the actual users in socioeconomic and other terms, makes them even more privileged. This phenomenon leaves the users on the other side with the least access to decision-making. Most importantly, the broader research also indicates that DMs do not generally take into account the users who will be affected the most by the deployment of such technology, or seek their input while making decisions. 
This is considering the fact that users themselves still do not play the role of an active constituent in the decision-making process behind most things on the internet, and especially the technological innovations related to Web3 (the future iteration of the World Wide Web [4]) such as blockchain or cryptocurrency. While involving real users can advance the philosophical goal of "Nothing About Us Without Us" [5], the ideal of equitable design still lies in imagining user ownership rather than mere user participation. The United Nations-led Multistakeholder Internet Governance model recommends that internet-related policy be "developed by the governments in consultation with all stakeholders" [6]. While this model is far from inclusive, primarily because the state and private industry dominate the policymaking process, the key policies around Web3 are yet to be framed and detailed in most parts of the world. We also identify a huge gap between internet policies and their implementation. Noted Black scholar Patricia Hill Collins provides important insights on societal power dynamics in the context of Black women in the US while conceptualizing the "Matrix of Domination" [7]. She frames race, class, and gender as the three interlinked systems that shape the lives of Black women. In the book "Design Justice", Sasha Costanza-Chock elaborates how Collins' framing is equally relevant to the study of equitable and inclusive design [8]. We see the inequity of everything related to the internet as a clear manifestation of the systemic and societal imbalance of power, in which dominant groups continue to oppress indigenous peoples, women and LGBTQ+ people, oppressed castes, people with disabilities and other marginalized groups, some of which are highlighted by Gebru. Decision-making in the technology industry, especially around the applications of the internet, is often found to be driven merely by market profit trends. 
However, the rollout of new technology is also found to be disconnected from ground reality and from real users. For instance, Distributed Ledger Technology (DLT) is often touted as an open, secure and decentralized technology [9]. However, recent critiques call for a more objective assessment. Signal founder Matthew "Moxie" Rosenfeld argues that decentralization is of neither practical nor critical importance to the majority of users [10]. From an environmental standpoint, cryptocurrency mining requires high energy consumption [11]. Hence, in places where the majority of energy is produced by fossil fuel-based plants, such power consumption leads to significantly more environmental damage than usual. Further study of the impact of DLT-based systems on indigenous peoples and their native lands is needed before any rollout of such emerging technology. In this paper, we have created a framework, based on a range of factors, for any decision-maker (DM) of web content monetization systems that use DLT. In our ongoing study of the web content ecosystems of two indigenous languages primarily spoken in eastern India, Ho and Santali, we contrast the status quo of web content in these two languages with the web content monetization systems that are powered by DLT. Ho and Santali are spoken by the Ho and Santal communities respectively. The Ho and Santal peoples are part of a broad group called Adivasi that comprises over 700 social and ethnolinguistic indigenous groups. The term "Adivasi" was coined as a result of a political movement in the 1930s in India, and the legal and constitutional equivalent term is "Scheduled Tribe" [12]. Based on observational data, we argue here why DMs need to conduct a thorough assessment through labor, linguistic and environmental lenses before any future deployment of platforms that use cryptocurrency to let audiences contribute to content creators. 
We have also shared our interim recommendations to answer the following key questions:

  • What kind of foundational infrastructures exist for the Ho and Santali web content ecosystems, and how do different factors affect them?
  • How might potential DLT-based web monetization systems impact Ho, Santali and other Adivasi languages, and their native speakers?
  • What do the observed issues tell us about the future of web monetization for Adivasi languages?
  • How do we translate the observations into fair and equitable practices for the decision-makers in the DLT sector?
  • What are the potential repercussions of any such tech rollout on the lives of the indigenous people from a labour, environmental and linguistic standpoint?

We take a three-pronged approach in our study: conducting an analysis of the existing content ecosystem in the Ho and Santali languages through a DLT lens, building a framework for self-assessment by the DMs, and creating a set of recommendations that might be relevant for any future work in this area. We discuss in this paper the power dynamics between the physical and the digital societies that affect design decisions in technology, and how those dynamics set the stage for the future of financial transactions related to web content. We frame in this paper a set of questions for the DMs of new DLT-enabled platforms based on available qualitative and observational data and have prepared a set of interim recommendations based on this data. We have also launched two surveys for the web content creators who speak and use either of the two target languages.




Phase 1

Outline of planned deliverables
| Duration | Work details | Status / To be done |
| --- | --- | --- |
| June - Nov 2021 | Project initiation | Done |
| | Hypotheses creation | Done |
| | Recruitment of language interns/research associates | Done |
| | Initiation of workflow and overall project structure | Done |
| Aug - December 2021 | Initiation of demographic study of Adivasi languages (focus languages: Ho and Santali) | Done |
| | Initial research of web content landscape | Done |
| | Identification of indicators | Moved to Jan/Feb '22 |
| | Survey form prototype and localization | Moved to Jan '22 |
| | Demo-day | Moved to Feb '22 |
| August 2021 | Development of frameworks | TBD Feb '22 |
| | Demographic study of Adivasi languages (focus languages: Ho and Santali) | TBD Jan - Feb '22 |
| | Research the web content landscape | TBD Feb '22 |
| September 2021 | Publication of mid-term report along with research outcomes (data, code, prototypes) | Postponed to mid January 2022 |

Outline of planned deliverables
| Duration | Work details | Status |
| --- | --- | --- |
| October and November 2021 | Further research (focus: scope of web monetization in Adivasi languages, particularly, Ho and Santali; consideration: digital accessibility and access to information by illiterate individuals) | TBD Jan - Feb '22 |
| | Pilot study that makes use of Web Monetization using open practices | TBD Jan - Feb '22 |
| December 2021 | Finalization of research report draft | Feb - Mar '22 |
| | Submission of report draft for peer-review | Feb - Mar '22 |
| | Publication of final research outcomes | Post Mar '22 |



The key hypotheses that we started the project with:


  1. The Adivasi peoples in India by and large have the lowest access to financial, governance, institutional, linguistic and technical support for using their language for knowledge exchange.
  2. The long-lasting poverty of Adivasis has resulted in little or no access to education in native languages and significantly restricted access to education in dominant languages.
  3. Development of native writing systems is believed to be a uniting factor for native speaker diaspora that are dispersed geographically.
  4. There is a growing trend for independence—from social oppression by neighbouring dominant communities—among many Adivasi peoples.
  5. Volunteer-led contribution for building technical tools, creating content and building capacity has seen only a handful of contributors which has led to their burnout.


  1. Most Adivasi people have significantly lower and inequitable access to education and economic opportunities. Societal oppression has led to much lower access for Adivasi women, including a lack of elementary education for Adivasi girl children.
  2. Poor public policy and the lack of an adequate number of educators and infrastructure in the education system have resulted in a lower literacy rate among Adivasi people in native languages. Children are made to learn only dominant and official languages.
  3. Even though education programs and state reservations in the public job sector have contributed toward some Adivasi individuals receiving higher education, there is no structured system for individuals in different job sectors to contribute toward supporting the larger economic state of their own communities.


  1. Reservations in India for public jobs are squeezed for Adivasi individuals because of systemic corruption by public authorities and exploitation of reservation provisions by non-Adivasi and caste-dominant groups.
  2. Adivasis whose work rights are violated do not have access to a fair justice system and hence they are often exploited at workplaces.


  1. Disagreements among scholars on linguistic factors—such as the preference of one writing system over the other for Adivasi languages—have been detrimental for content growth. Similar linguistic conflicts and the resulting hate posts on the internet have acted against emerging content creators.
  2. Official publications promoting non-native writing systems have dissuaded the widespread use of native writing systems. Examples include many public sites in India promoting Devanagari over Ol Chiki (native writing system for Santali) and Warang Citi (native writing system for Ho).
  3. Multiple writing systems being used for the same language beyond official use have an adverse impact on the language.


  1. There has been widespread technical exclusion of Ho, Santali and many other Adivasi languages by big tech companies, which has created technical barriers for content creators without any clarity on the rationale.
  2. The Ol Chiki script used for Santali was added to Unicode in version 5.1 in 2008. Even after 12 years, Santali is still not available as a language option on major websites and apps such as Google, Facebook, Twitter and Instagram. For example, Santali was added as an option on Poeditor.com only on request; before that, people were not able to contribute in the Santali language.
  3. Low online content sharing also results in limited availability of critical information in many Adivasi languages.
  4. Various tech platforms, such as Facebook and YouTube, support one primary script for their business: content advertising is allowed in only one script or, where more than one is allowed, the platform still prefers to stick with a single script.
  5. Entertainment-related content being mostly multimedia, searchability and discoverability of such content is lower as opposed to textual content.

Affordability and access

  1. Affordability is also aggravating a demand-supply issue in the context of subscription-based models such as dissemination of news and other content through subscriptions, video on demand and pay-per-click web content.
  2. As entertainment content is extremely popular as opposed to the content providing critical and comprehensive knowledge, content producers also prioritize entertainment content over other areas.
  3. While the creation of multimedia content can be extremely accessible for speakers of many oral languages and/or people with illiteracy and disability, the production cost for such content is extremely high. Content producers are not able to find viable sources of income to afford the know-how and time to create high-quality content.
  4. The overall content depth and diversity are lower because of lower participation.
  5. Lack of access to the internet has a snowball effect, as even free access to content in some of the languages does not help grow the audience. Many potential viewers do not have access to the internet. Such a situation also motivates content producers to promote or plan for paid content. There is low competition among producers, and remuneration for them is much lower.

Policy, Ethics and Human Subjects Research


The overall research will rely on public data including a large amount of open data. A part of the research would require personal interviews and the collection of minimal personal data. While we would use the ethical practices, guidelines and consent/content-release templates listed in OpenSpeaks for all non-public interviews, we would be mindful of not disrupting the volunteer time Wikimedians contribute in their respective wikis.

State of Web Content for Ho and Santali languages


We have studied the web content ecosystem of the Ho and Santali languages, which we categorize as "low-resource languages". Even though the term "low-resource languages" is not strictly defined by a set of criteria, we mostly include in this category all indigenous, endangered, and other minoritized languages. The term "indigenous language" refers to a native/first language spoken by the indigenous people of a region. UNESCO defines an "endangered language" based on nine language vitality factors, ranging from practical areas such as intergenerational language transmission and use of the language in new domains and media to policies and documentation for the use of the language [13]. When a language is either endangered or banned, or if its speakers are persecuted, it is termed a "minoritized language" [14]. The UNESCO Atlas of the World's Languages in Danger lists 197 languages of India in the categories of Vulnerable, Definitely Endangered, Severely Endangered, Critically Endangered, and Extinct, and classifies Ho as a "vulnerable" or threatened language [15]. The Ho language is spoken by the Ho people (an eponym and endonym that literally means "human"), who had a population of 1.42 million as per the 2001 census of India [16]. The Santali language (also Santhali, both exonyms) is spoken by the Santal people, who have a population of nearly 7.6 million [16]. Warang Citi and Ol Chiki are the native writing systems of Ho and Santali respectively. Santali is one of the 22 official languages listed in the Eighth Schedule of the Indian Constitution, whereas Ho is one of the 38 languages that have been under consideration for addition since 2017 [17]. The 1997 Ethnologue notes that Ho has a first-language literacy rate of 1 to 5% [16], and Santali of 10 to 30% [18]. The two ethnic groups are neighboring tribes, and their respective diasporas also live in Bangladesh, Nepal, and Bhutan. 
The Santal people are one of the largest Adivasi communities both in India and in Bangladesh [19].

Considering the low literacy in both Ho and Santali as first languages, dissemination of educational, entertainment, and other types of content through the web and digital mediums is largely limited. In 2014, Schleiter cited 120 popular movies and 350 music video albums that were made in Santali over a decade [20]. Only one film in Ho received commercial screening certification from the Indian government between April 2018 and March 2019. Similarly, five Santali movies received the certification between April 2017 and March 2020 [21]. By January 2022, only seven Santali films were listed on the Internet Movie Database (IMDb) [22], the Amazon-operated web directory that allows contributions from film distributors and individual contributors. IMDb does not yet allow Ho to be recorded in metadata as a primary language of production, even though it is possible to search for Ho-language movie/video titles [23], an inconsistency that makes little practical sense. Such a dearth of platform-level features raises the question of the exclusion of marginalized peoples through the use of technology. Media files with file names typed in the Warang Citi writing system are not yet supported for upload on Wikimedia Commons [24], a sister project of Wikipedia that allows volunteers to upload media files and documents under a set of open licenses. The uploaded files can then be used on Wikipedia and its sister projects. We would note, however, that these examples do not necessarily indicate systemic exclusionary practices. But regardless of intent, the practical implications of the features and bugs of different platforms include discriminatory behaviors. Platforms governed under an array of governance models, from those run by corporations to those run by civil society actors, inadvertently lack features (or contain bugs) when it comes to supporting low-resource languages. These governance models also do not always prioritize addressing such issues.



This publication is based on the interim outcomes of an ongoing ethnographic study spanning a period of eight months (August 2021 to March 2022) in the Ho and Santali languages. Our initial research included literature reviews, primary research on web content (two of the co-authors are web content creators in these two languages), and the unpublished results of an older survey that we had conducted in 2020. We have looked into three kinds of media content through primary research: video (primarily on YouTube and Facebook), audio (mostly on music-streaming apps), and images (mostly on popular Facebook public pages/groups). We also checked most of the textual content sources, such as online magazines, personal blogs, and other text-based web portals hosted on self-hosted websites or blogging platforms such as Blogger, as well as public posts on social media such as tweets or posts in Facebook public groups and pages. During this review, we took into account both textual and multimedia content that does not always contain descriptions and other metadata in the Warang Citi or Ol Chiki writing systems but contains content in either Ho or Santali. We acknowledge that the metadata is not sufficient to confirm whether the content is always hosted, owned, managed or contributed by content creators who are native speakers of either Ho or Santali.
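As a rough illustration of the metadata limitation described above, the sketch below checks whether a title or description contains any characters from the Ol Chiki (U+1C50 to U+1C7F) or Warang Citi (U+118A0 to U+118FF) Unicode blocks. The function name `scripts_in` and the idea of using it as a screening step are assumptions made for illustration, not part of the study's method. A negative result says nothing about the language of the content itself, since Ho and Santali material is often described in other scripts.

```python
# Minimal sketch: detect Ol Chiki / Warang Citi characters in a metadata string.
# The block ranges are the Unicode block boundaries; the helper name is hypothetical.

SCRIPT_BLOCKS = {
    "Ol Chiki": (0x1C50, 0x1C7F),       # native writing system for Santali
    "Warang Citi": (0x118A0, 0x118FF),  # native writing system for Ho
}


def scripts_in(text: str) -> set:
    """Return the names of the tracked scripts that appear in `text`."""
    found = set()
    for ch in text:
        code_point = ord(ch)
        for name, (low, high) in SCRIPT_BLOCKS.items():
            if low <= code_point <= high:
                found.add(name)
    return found


if __name__ == "__main__":
    # A word written with Ol Chiki letters, given as escapes for portability.
    print(scripts_in("\u1c65\u1c5f\u1c71\u1c5b\u1c5f\u1c72\u1c64"))
    # Latin-script metadata yields an empty set even if the media is in Ho.
    print(scripts_in("Ho song 2021"))
```

A check like this can only separate "metadata uses a native script" from "metadata does not"; identifying the language of audio or video content would still need manual review by native speakers.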

While Ho and Santali content creators currently do not use DLT-based web monetization, they have experience in using Google AdSense for monetizing content. Two of the co-authors, who are content creators themselves, shared their inputs in response to a set of questions, and these inputs formed the basis of our observational data. The questions were set primarily around the gaps and opportunities that they see in technology, language and audience behaviour. As physical movement was risky because of the surge of COVID-19, we conducted remote interviews over phone calls and captured these inputs. Once documented, the questions and inputs helped design a structured process for further qualitative data collection through surveys.

We framed a set of questions for conducting surveys keeping in mind four key areas:

  • Social and environmental: These questions looked at both the social and ecological environments that are relevant to, and affect, a content creator's work. In this section we also asked how they operate socially and whether they receive support or face challenges from others around them.
  • Labor: Questions in this section looked further at livelihoods, affordability and access in a broader sense (to information, internet, volunteering or paid work online).
  • Educational and Linguistic: In this section, we mostly focused on more specific areas: a content creator's access to education, their level of interest in promoting their native language (Ho or Santali), and challenges in their audience's access to education that might affect the spread of the content these creators generate.
  • Technological: In this section, we delved deeper into the current technological hindrances and the content creators' know-how of available technology that is relevant and applicable to their everyday content-related work.

The selection of the respondents was done in an organic and iterative manner. We looked up popular web portals, YouTube channels, and bloggers, among others, and reached out to them over email or by phone using publicly listed numbers. We also converted most of the English version of the questionnaire into a Simple English survey form. This was for two primary reasons: a) some of the respondents are not conversant in reading and writing Ho and Santali and hence would prefer an English form despite not having advanced fluency in English, and b) the respondents might be more familiar with technical and other general internet vocabulary in English because of frequent usage. We translated the Simple English version of the survey into both Ho and Santali in the respective native writing systems. For each of the two languages, we have reached out to about 30 content creators (individuals, collectives or organizations). This list was created mostly during the literature review. In addition, one of the authors of this paper, who is a native speaker of Santali, met with a group of content creators during a public event. Similarly, another author conducted a workshop on creating audio recordings of pronunciations in the Ho language for use in future speech synthesis projects. The generative interactions during these two events helped us add more web content creators and expand the hypotheses. We are collecting data through online surveys into a cloud-based spreadsheet. The surveys started in the second week of January 2022 and will end by the end of the same month. We will then conduct both qualitative and quantitative analyses to create comprehensive observations. These observations will yield insights in each of the four aforementioned key areas. 
The co-authors also contributed to formulating a set of interim questions that are meant to be used by any decision-maker who is planning to implement a DLT-based web content monetization system. These questions collectively form a self-assessment framework for DMs, the "Before Web-Monetization Framework", which is detailed in a section of this paper. We aim to share the overall status quo of the current web content ecosystem in both Ho and Santali. It is important to consider that Ho and Santali occupy different positions on an ecosystem-level spectrum that can be established based on the four aforementioned areas. Based on both the advantages and hindrances, we also aim to extrapolate the observations and narrate the broader state of the web content landscape of the indigenous languages of India. However, this broader observation of the current state of Indian indigenous languages would not provide the specific detail available for Ho and Santali, and it should be treated only as anecdotal evidence. The observations specific to Ho and Santali, on the other hand, should be considered primary and secondary qualitative data. Based on the survey insights, we plan to select about 10 individuals (five from each language) for in-depth interviews for further qualitative investigation. The interview questions will be closely related to the survey questions and to the trends and patterns that emerge from our analysis of the surveys. The observations from the surveys and the further nuances captured through the interviews will form the basis of a set of recommendations. As two of the co-authors are themselves content creators in the two target languages, we have used the questionnaire to evaluate the software product development workflow and formulate a set of interim recommendations. 
These recommendations are added as a section in this paper. However, they should not be treated as final, considering the small sample size and the generative nature of the process that was used to collect the data. Certain biases may also have been introduced while creating them. Hence, we will create a final list of recommendations, containing amendments and corrections, at the end of our study in March 2022. The final list of recommendations would indicate the future of DLT-based content in both the Ho and Santali languages and also in many other low-resource or indigenous languages. Both the Framework and the recommendation list will be published as resources for public use under a public and fairly open license such as the Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0; https://creativecommons.org/licenses/by-sa/4.0/), and can be forked and tweaked to fit other low-resource languages from different geographical regions. We have a system to inform both the respondents and interviewees about remuneration for their participation in this study.

Labour and personal data considerations


We have kept the surveys anonymous (with an option to provide personal data for those who would like to receive remuneration), meaning the default fields do not collect personal data such as the name, email and physical location of a respondent. The survey would require an average of 15 to 20 minutes of each respondent's time, or even more depending on factors such as their fluency in written Ho, Santali or English. Some of the optional fields ask about the respondent's gender or the kind of place they live in (rural or urban). We have included a set of additional fields in the survey to offer a small amount of remuneration should the respondent choose to accept. This additional section does collect the personal data that are required for the payment. All the personal data would be permanently deleted 30 days after the payment is made.


Broader observations from the research.

Awareness of sectoral concepts among content creators


Between the literature review and the launch of the two qualitative surveys, we analyzed a survey that we had conducted in 2020, which we plan to publish along with the final outcome of this research. This analysis helped us learn about the level of understanding that multimedia content creators have of concepts such as consent-seeking, content rights including moral rights and copyright, and open licensing. The bilingual (English and Santali) survey was conducted with a small group of 25 individuals. Three-fourths of the respondents were speakers of Santali as their first language and were involved in the development of web content. Our learning from this survey is as follows:

Linguistic distribution
  • 25 participants responded to the survey from five different countries. 17 of them spoke Santali as their first language.
  • One-third of the respondents represented a collective, nonprofit, academic or other civil society organization, while the remaining respondents were part of a different stakeholder group. They all mentioned engaging in digital language activism independently, in their personal capacity.
  • The respondents mentioned that they either create content on their own or promote existing content which includes audio, video and textual content. The mediums they generally would choose for publishing were blogging platforms, social media and websites. They mentioned content building, sharing and commenting as the regular activities.
  • 12 of the respondents mentioned that they document different kinds of content in languages other than Santali on different mediums. These respondents self-identified as native speakers of at least one indigenous language. At least four of them spoke a language which does not have a formally-recognized or standardized or widely used writing system.
  • 13 respondents in the survey shared that they are not fully aware of the process of seeking consent from their interviewees during cultural documentation.
  • Those who were aware of seeking consent shared that they follow three primary ways of consent-seeking: a) discussing verbally with interviewees, b) asking for consent and capturing it in the audio/video recording itself (the most common method among the respondents who are aware of consent-seeking), c) seeking consent in writing through online forms before the recording.
Understanding of content rights, copyright and licensing
  • Nearly half of the respondents confirmed that they are aware of the process of making cultural documentation in their own languages, but are not familiar with the process for licensing the content.
  • Most of the respondents also expressed the need for learning about moral rights, copyright and other forms of rights related to the content, and the Creative Commons licenses for publishing the documented media.

These survey results influenced the creation of learning resources, including a Santali-language chapter that contained newly coined terms around the topics mentioned above. Even though the survey predates our current project, the analysis was relevant to the research detailed in this paper, and we used the above findings for formulating the interim recommendations and the Framework detailed here.

Primary research


From our first set of primary searches, we have listed 18 Ho-language content creators, of whom five actively post audio content, ten post videos, seven post textual content and seven post images. Similarly, of the 21 Santali-language content creators whom we have listed, four primarily post audio content, five post videos, 12 post textual content and seven post images. We also went through the YouTube videos that have more than one million unique views and noticed that 89% (166 out of 187) are music videos. This is consistent with an earlier anecdote from Santali-language Wikipedia editor Ashwani R. Banjan Murmu, who notes, "There is a large audience for Santali movies and music albums whereas video tutorials to teach any subject are a new concept and such videos are very limited in number."

Observational data from user testing


We collated observational inputs by interviewing two of the coauthors of this paper, Ho-language content users during a workshop, and many active content creators during a Santali-language cultural event. Analyzing these inputs suggests that the Warang Citi and Ol Chiki writing systems are not currently supported by Google AdSense [25], which prevents content creators from creating ads in their own language using the native writing systems. As the lack of such an important feature does not allow them to place ads that target the right kind of audience, their overall earnings from advertisements are adversely affected. As a part of our early research, we conceptualized the use of a DLT-based payment system for placing ads. Based on a long list of hypotheses and prompts that were developed through a sprint, we conclude that further investigation of the web content ecosystem in Ho and Santali could help identify critical factors relating to socioeconomic, educational, technological, geopolitical and other important demographics.

Critical factors affecting DLT-based web monetization


By going through the observational analysis of the current state of web content in both the Ho and the Santali languages, and the scholarship around DLT-based systems, we have summarized some of the most relevant and critical factors. While we recognize that the DMs who implement any web content monetization system using DLT would decide primarily based on market trends and the business opportunities those trends signal, we have also tried to elaborate on the user-related areas that would affect such decision-making. These factors connect to, but are not limited to, the broader discourse on labour, socioeconomic discrimination such as the brahminical caste supremacy of India, and environmental impact. We also observe that the current research relating to Web3 mostly discusses user experience, technological innovations and infrastructure, but not so much the impact on the labour sector, systemic and social oppression, and the environment.



Financial transactions using DLT are validated through the "mining" of cryptocurrencies, where a "miner" is rewarded in the form of a cryptocurrency. We apprehend that the current form of the mining process creates three distinct economic divisions: first, a capitalist group of elite investors who gain the most from the mining work; second, highly skilled miners (skilled labourers) who keep the value of cryptocurrencies like Bitcoin extremely high, and whose mining work can be compared with online gambling; and third, content creators and audiences (the working class in the lower economic strata) who participate only in meagre financial transactions. In our view, the extreme focus on anonymity by design not only forces everyone in such a system to treat each other as points of transaction but also leaves no room for humane attributes such as empathy or remorse. We also do not see any scope for individuals from the three aforementioned groups to make decisions based on labour systems, ethics or societal aspects. Instead, the design is very much reminiscent of a first-person shooter video game environment, with a range of practical implications. Looking at the current capitalistic Web3 ecosystem, we also fear that this system might not only drive a dystopian version of many forms of human exploitation, especially of indigenous and many marginalized individuals, but will also further a version of decolonization. The design of DLT-based systems does not take into account labour environments, the lack of universal access to high-speed internet connectivity, affordability, and societal factors such as lower access to the internet for women, indigenous and minoritised groups, and people with disabilities. As elaborated earlier in this paper, the current lack of a range of essential resources for different indigenous groups needs to be examined in detail to check whether DLT-based solutions for web monetization would aggravate many forms of labour disparities.

Environmental impact


All forms of cryptocurrencies that form the core of DLT-based web content monetization involve a public transaction record, or blockchain, that is stored securely in a decentralized peer-to-peer network and cannot be altered. Robust cryptographic algorithms provide this strong security, and crypto mining ensures that no transaction is spent twice or tampered with, given that there is no central authority such as the bank in a conventional bank wire. Some recent examples of the use of cryptocurrency include the sale of digital art using non-fungible tokens (NFTs) [26], a crypto wallet called Brave Wallet built into the Brave browser [27], and the use of the open-standard Interledger Protocol [28] to make payments for web content. However, the extremely high level of security and decentralization of a trustworthy financial transaction system using blockchain comes at a heavy cost to the environment. While Vranken estimates the energy consumption of Bitcoin, one of the earliest cryptocurrencies that surfaced in 2008, to be 100–500 MW [29], further research affirms that many less-documented cryptocurrencies that are actively used together require close to the same level of energy [30]. When the use of cryptocurrencies is considered in the context of India, where 75% of power generation came from thermal power until the end of 2020 [31], it is crucial to study further the direct impact on the environment and indigenous peoples. The majority of the Ho and Santali communities reside primarily in the eastern and central regions of India. These regions, mostly the hilly parts of the Indian states of Chhattisgarh, Jharkhand, Odisha and West Bengal, which have been home to these two communities along with many other Adivasi communities, are also where most of the mineral mining in India takes place [32].
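The tamper-evidence property described at the start of this section can be illustrated with a minimal hash-chain sketch. This is only an illustration under simplifying assumptions: it runs on a single machine and leaves out mining, consensus and the peer-to-peer network, and the names `block_hash`, `build_chain` and `is_valid` as well as the sample transactions are hypothetical and do not come from any particular cryptocurrency.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(index, prev_hash, transactions):
    """Hash a block's contents together with the previous block's hash."""
    payload = json.dumps(
        {"index": index, "prev_hash": prev_hash, "transactions": transactions},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def build_chain(batches):
    """Link batches of transactions so each block commits to the one before it."""
    chain = []
    prev_hash = GENESIS_HASH
    for index, transactions in enumerate(batches):
        current = block_hash(index, prev_hash, transactions)
        chain.append(
            {
                "index": index,
                "prev_hash": prev_hash,
                "transactions": transactions,
                "hash": current,
            }
        )
        prev_hash = current
    return chain


def is_valid(chain):
    """Recompute every hash; any edit to an earlier block breaks the chain."""
    prev_hash = GENESIS_HASH
    for block in chain:
        if block["prev_hash"] != prev_hash:
            return False
        expected = block_hash(block["index"], block["prev_hash"], block["transactions"])
        if block["hash"] != expected:
            return False
        prev_hash = block["hash"]
    return True


# Hypothetical micro-payments from readers to content creators.
chain = build_chain(
    [
        [{"from": "reader_1", "to": "creator_a", "amount": 5}],
        [{"from": "reader_2", "to": "creator_b", "amount": 3}],
    ]
)
print(is_valid(chain))

# Tampering with a recorded amount is detected on re-validation.
chain[0]["transactions"][0]["amount"] = 500
print(is_valid(chain))
```

In an actual proof-of-work cryptocurrency, producing each new block additionally requires a computationally expensive search, which is where the energy cost discussed in this section arises.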
We observe that technologies designed with an inherently high energy demand, and introduced without any real explanation for discontinuing existing low-energy technologies, do not involve DMs consciously considering potential environmental damage. There could be a direct surge in energy demand in India with the rise of the DLT industry, which can potentially increase thermal power production. We apprehend that the entire pipeline of thermal plants, from the mining of coal (where the manual labour workforce consists mostly of Adivasi workers), to the use of depleting natural resources including water and coal and the resulting damage to forest cover, to the resulting air and other kinds of pollution, would grow sharply to meet the high energy demand. The impact can be catastrophic both for the environment and for the multiple indigenous groups living close to the mining sites.
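For a rough sense of scale, the 100–500 MW estimate cited above [29] can be converted into annual energy. The sketch below is a back-of-the-envelope calculation, not a result of this research: it assumes the power draw is continuous over a 365-day year, and it applies the 75% thermal share [31] only as a hypothetical illustration of what a comparable load would mean if it were met by India's 2020 generation mix. The function names are our own.

```python
HOURS_PER_YEAR = 24 * 365  # assumption: continuous draw over a non-leap year
MWH_PER_TWH = 1_000_000


def annual_energy_twh(power_mw):
    """Convert a continuous power draw in MW into energy per year in TWh."""
    return power_mw * HOURS_PER_YEAR / MWH_PER_TWH


def thermal_portion_twh(energy_twh, thermal_share=0.75):
    """Energy that would come from thermal plants at the given share.

    The default share is the 75% figure cited in the text; applying it to
    any additional demand is a simplifying assumption.
    """
    return energy_twh * thermal_share


for power_mw in (100, 500):
    energy = annual_energy_twh(power_mw)
    thermal = thermal_portion_twh(energy)
    print(f"{power_mw} MW -> {energy:.3f} TWh/year, of which {thermal:.3f} TWh thermal")
```

Under these assumptions, the cited range corresponds to roughly 0.9 to 4.4 TWh per year, about three-quarters of which would be thermal in this illustration.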

Systemic and social oppression


While the presence of a high level of natural resources, particularly coal and iron ores, has been instrumental for political systems that benefit the ruling dominant-caste groups, other than Adivasi members, in different political parties, it has not contributed much to improving the economic state of the Adivasi peoples [33]. Prior research also suggests that the lack of basic amenities, such as education, electricity, physical infrastructure for commuting and online infrastructure for communicating, has restricted the Adivasi communities from gaining a decent level of access to information [34].

Self-assessment framework



Conclusion and recommendations






  1. "Read the Principles". Design Justice Network. Retrieved 2022-01-09. 
  2. Costanza-Chock, Sasha (2020). Design justice: community-led practices to build the world we need. Information policy. Cambridge, MA: The MIT Press. ISBN 978-0-262-04345-8. 
  3. Gebru, Timnit (2020). "Race and gender". The Oxford handbook of ethics of AI: 251–269. 
  4. Edelman, Gilad (2021-11-29). "What Is Web3, Anyway?". Wired. ISSN 1059-1028. Retrieved 2022-01-08. 
  5. Charlton, James I. (2000-08-30). Nothing About Us Without Us: Disability Oppression and Empowerment. University of California Press. ISBN 978-0-520-22481-0. 
  6. Malcolm, Jeremy (2008). Multi-stakeholder governance and the Internet Governance Forum (1st paperback ed.). Perth: Terminus Press. ISBN 978-0-9805084-0-6. 
  7. Collins, Patricia Hill (2002-06-01). Black Feminist Thought: Knowledge, Consciousness, and the Politics of Empowerment. Routledge. ISBN 978-1-135-96013-1. 
  8. Costanza-Chock, Sasha (2020). Design justice: community-led practices to build the world we need. Information policy. Cambridge, MA: The MIT Press. ISBN 978-0-262-04345-8. 
  9. Salah, Khaled; Rehman, M. Habib Ur; Nizamuddin, Nishara; Al-Fuqaha, Ala (2019). "Blockchain for AI: Review and Open Research Challenges". IEEE Access 7: 10127–10149. ISSN 2169-3536. doi:10.1109/ACCESS.2018.2890507. 
  10. Marlinspike, Moxie (2022-01-07). "My first impressions of web3". Moxie Marlinspike. Retrieved 2022-01-15. 
  11. Gallersdörfer, Ulrich; Klaaßen, Lena; Stoll, Christian (2020-09-16). "Energy Consumption of Cryptocurrencies Beyond Bitcoin". Joule 4 (9): 1843–1846. ISSN 2542-4351. doi:10.1016/j.joule.2020.07.013. Retrieved 2022-01-20. 
  12. "World Directory of Minorities and Indigenous Peoples: Adivasis". Minority Rights Group. 2015-06-19. Retrieved 2022-01-20. 
  13. Language Vitality and Endangerment. Paris: UNESCO Ad Hoc Expert Group on Endangered Languages. 2003-03-10. p. 27. 
  14. Hornsby, Michael; Agarin, Timofey (2012). "The end of minority languages-Europe's regional languages in perspective". JEMIE 11: 88. 
  15. Moseley, Christopher (2010). Atlas of the World’s Languages in Danger. Paris: UNESCO Publishing. Retrieved 2022-01-19. 
  16. Statement 1: Abstract of speakers' strength of languages and mother tongues - 2011. New Delhi: Office of the Registrar General & Census Commissioner, India. 2011. 
  17. Constitutional provisions relating to Eighth Schedule (PDF), Ministry of Home Affairs, Government of India, 2004 
  18. M. Eberhard, David; F. Simons, Gary; D. Fennig, Charles (2021). "Ethnologue: Languages of the World (Sat)". Ethnologue. Retrieved 2022-01-14. 
  19. Cavallaro, Francesco; Rahman, Tania (2009). "The Santals of Bangladesh". Linguistics Journal 4. 
  20. Schleiter, Markus (2014). "VCD crossovers: Cultural practice, ideas of belonging, and Santali popular movies". Asian Ethnology 73 (1/2): 181. 
  21. "Region /Language-wise Certified Indian Feature Films (Digital) from 1-4-2015 to 31-3-2021" (Text). Open Government Data (OGD) Platform India. 2021-08-17. Retrieved 2022-01-20. 
  22. "IMDb: Santali (Sorted by Popularity Ascending)". Internet Movie Database (Online directory). 
  23. "IMDb: Ho (Sorted by Popularity Ascending)" (Online directory). Internet Movie Database. 2022-01-20. 
  24. Biswajeet3 (2021-12-09). "T297351 Warang Citi (Ho-language writing system) characters not detected on Wikimedia Commons". Wikimedia Phabricator. Retrieved 2022-01-20. 
  25. "Languages Google publisher products support - Google AdSense Help". 2022-01-21. Retrieved 2022-01-21. 
  26. "Non-fungible tokens (NFT)". ethereum.org. Retrieved 2022-01-21. 
  27. Clark, Mitchell (2021-11-16). "Brave built its own crypto wallet into its browser". The Verge. Retrieved 2022-01-21. 
  28. Thomas, Stefan; Schwartz, Evan; Hope-Bailie, Adrian (2017-01-09). "The Interledger Protocol". IETF. Retrieved 2022-01-12. 
  29. Vranken, Harald (2017-10-01). "Sustainability of bitcoin and blockchains". Current Opinion in Environmental Sustainability. Sustainability governance 28: 1–9. ISSN 1877-3435. doi:10.1016/j.cosust.2017.04.011. Retrieved 2022-01-21. 
  30. Gallersdörfer, Ulrich; Klaaßen, Lena; Stoll, Christian (2020-09-16). "Energy Consumption of Cryptocurrencies Beyond Bitcoin". Joule 4 (9): 1843–1846. ISSN 2542-4351. doi:10.1016/j.joule.2020.07.013. Retrieved 2022-01-20. 
  31. "Ministry of Coal, GOI". 2022-01-12. Retrieved 2022-01-12. 
  32. Padel, Felix; Das, Samarendra (2010). Out of this earth: East India Adivasis and the aluminium cartel. New Delhi: Orient Blackswan. 
  33. Adhikari, Anindita; Chhotray, Vasudha (2020). "The Political Construction of Extractive Regimes in Two Newly Created Indian States: A Comparative Analysis of Jharkhand and Chhattisgarh". Development and Change 51 (3): 843–873. ISSN 1467-7660. doi:10.1111/dech.12583. Retrieved 2022-01-20. 
  34. Saha, Anoop (2012). "Cellphones as a Tool for Democracy: The Example of CGNet Swara". Economic and Political Weekly 47 (15): 23–26. ISSN 0012-9976. JSTOR 23214943. Retrieved 2022-01-20.