Research:Content Creation on Eastern Punjabi Wikipedia

From Meta, a Wikimedia project coordination wiki
Duration:  2020-January – 2021-February
This page documents a completed research project.

This is a short study on the nature of content creation related to Punjab on Eastern Punjabi Wikipedia, its challenges and opportunities, and observations and potential strategies to address the same. The report has been authored by Satpal Singh with editorial oversight and support by Puthiya Purayil Sneha, and external review by Sumandro Chattapadhyay. This is part of a series of short-term studies undertaken by the CIS-A2K team in 2019–2020.


The objective of this study is to understand the challenges for content creation related to Punjab that exist on Eastern Punjabi Wikipedia. There are articles about Punjabi language and culture on Punjabi Wikipedia, but there is a need for a better understanding of the nature of this content from the perspective of readers’ interests, coverage of topics, and quality. A large community of interested Punjabi Wikimedians has been actively working over several years to introduce Wikimedia and related projects to people across the world, including those from their own community. An important part of achieving this goal is to contribute to and build more diverse and better-quality content about Punjab on Wikipedia. This short study is therefore an attempt to analyse the nature of existing content, the challenges in content creation, curation, and outreach, and some observations and strategies to address them.

Eastern Punjabi Wikipedia

Wikipedia is a multilingual online encyclopedia, available in more than 290 languages. There are two Punjabi Wikipedia editions: Eastern Punjabi Wikipedia, written in the Gurmukhi script, and Western Punjabi Wikipedia, written in the Shahmukhi script. This study focuses on Eastern Punjabi Wikipedia. The Eastern edition’s domain came into existence on 3 June 2002, but the first three articles were written only in August 2004. Not much was contributed during the next six years, and by July 2012 it had reached 2,400 articles. A group of people, largely students from Punjabi University, Patiala, then started contributing actively to Punjabi Wikipedia, as a result of which it became an active Wikipedia in 2013 and has remained so to date. There are currently 35,351 articles on the Gurmukhi Punjabi Wikipedia, and 36,348 registered users.[1]

One group, the Punjabi Wikimedians User Group, has been proactively involved in Punjabi Wikipedia for a long time. Apart from this group, a number of people from different parts of the world also contribute to Punjabi Wikipedia. Punjabi Wikimedians was recognised as a user group by the Wikimedia Foundation in November 2015. It was the first affiliated user group from India and has been involved in several activities and initiatives towards content creation. Its members organized WikiConference India in 2016 at Chandigarh and have participated in various events and conferences. They have also collaborated with other institutions in order to encourage content creation on Punjabi Wikipedia; one example is the collaboration with Punjabi Sahitya Academy. Apart from Wikipedia, this user group is also active on Punjabi Wikisource, Punjabi Wiktionary, Wikidata and Wikimedia Commons. The first meeting of the Punjabi Wiki community was organized in Patiala on 1 February 2015. After that, the community held monthly meetups in different parts of Punjab. People from the community also joined training programs and events in different parts of India and participated in conferences in other countries.

Research Objectives and Method

This study analyses various aspects of how content related to Punjab is created on Eastern Punjabi Wikipedia. This analysis would help in understanding the gap between the kind of content that presently exists and what is needed, from the perspective of Punjabi language contributors and users. The objective of this study is to understand how much content related to Punjab exists on this Wikipedia at present, what the nature of this content is, what the challenges for content creation are, and what strategies could address them. There is a broader understanding that while content is being created proactively, there is still a need to analyse its quality and any prevalent gaps, which would encourage more contributors and readers to actively engage with Punjabi Wikipedia.

The method for this study consisted of an analysis of the existing content on Eastern Punjabi Wikipedia, and conversations with selected Wikimedians on their assessment of content on topics related to Punjab. The main topics for this study were articles related to the Punjab region, including culture, literature, and politics. To understand if there are specific challenges to the creation of content on these topics, interviews with a few selected long-time contributors and administrators were conducted, with an emphasis on aspects such as sourcing Punjabi language material, finding references, digitization, tagging etc.

The objective of this study was also to understand where these conversations (undertaken as part of the study) may offer strategies to address knowledge gaps in specific areas of work. Questions were prepared and a total of five interviews with Punjabi Wiki community members were conducted. The people interviewed were chosen on the basis of their involvement and experience of working in the community.

Observations and Analysis

There are an estimated 33 million Eastern Punjabi speakers in the world[2], and it is a widely spoken language in India, especially in the state of Punjab. Over 70% of people in Punjab access the internet on their phones[3]. The main objective of this study was to understand the nature of existing content on Punjabi Wikipedia, and the various challenges in content creation, coverage of topics and quality. The following are some of the main observations and learnings from the study.

Challenges with Lack of Existing Content

The conversations with selected Wikimedia contributors and users offered an insight into what Punjabi readers and online contributors think about the content available on the internet in the Punjabi language, and what impact Punjabi Wikipedia has in this scenario, especially in addressing any gaps in this area. The interviewees noted that there are few Punjabi language websites in the fields of language, literature, politics, and general knowledge. Most websites are in English. PunjabiPedia and Punjabi Wikipedia are encyclopedic websites that provide knowledge in the Gurmukhi script. Apart from these, websites like SikhiWiki provide knowledge in the Roman script, and Punjabi newspaper websites publish their news updates in the Gurmukhi script. Punjabi Wikipedia is therefore one of the few available sites that offer information on a variety of topics in the local language. As a result, it may have good viewership, but at the same time it faces the additional problem of not having good or reliable online sources or references.

Another important point mentioned by the people interviewed was that while there are 30,000+ articles on Punjabi Wikipedia, categorised across different topics, there is a lack of content about Punjab itself. It was suggested that this should therefore be a priority area for the community to work on. Even the most viewed articles of Punjabi Wikipedia do not meet the good article criteria of Wikipedia. For example, the article on Harmandir Sahib on Punjabi Wikipedia has not been written according to the good article criteria, as it has no category and no proper sections.[4] Most of the important articles have few references. Another example is the article on the tenth Sikh Guru, Guru Gobind Singh, which has only 5 references.[5] Apart from this, articles about the cities and villages of Punjab are mostly stubs. There are about twelve thousand villages in Punjab, and a good number of them have articles on Punjabi Wikipedia, but these articles are very short and need to be expanded. There are more than 7,000 articles in the stub category. Such articles, therefore, need more work and improvement in terms of quality.

Methods of Creating New Content

Most of the content on Punjabi Wikipedia is about countries or regions other than Punjab or India. One of the reasons for this is that most editors translate existing content from English or other regional language Wikipedias into Punjabi Wikipedia. In order to fill the content gap about Punjab, there should be content creation specifically on topics related to the state. Content translation tools, while helpful, have also contributed to a preference for translation, and people use Google Translate within the content translation tool. The tool itself is accurate and works fine with Punjabi, but the issue is that most people are doing only translation, as it is the easiest way to contribute. Apart from the above, the interviewees also noted that edit-a-thons about other countries or cultures, while useful, are not beneficial in the immediate context. Nitesh Gill[6] observes that there should be more discussion among community members about upcoming events or edit-a-thons. She says:

“Sometimes two or more events are going on in the same time period and the same contributors take part in those activities. It should be better, that with the cooperation we can have one event at one time. We will grow in a better way if we do something about this.”

Articles related to countries or cultures outside India, such as the article on Constantine Peter Cavafy, have lower viewership. But articles that are important from the perspective of region and culture go unedited for long periods; the article about the Punjabi story writer Maninder Kang, for instance, is shorter than the article on Constantine Peter Cavafy. Stalinjeet Brar[7] notes that:

“The article of former chief minister of Punjab, Prakash Singh Badal[8] on Punjabi Wikipedia is a relevant article. He is a remarkable personality of Punjab in the history of politics. He left his position in 2017 but the article still shows that he is the chief minister of Punjab. This is our major mistake. We have to work on this aspect.”

He also added that the statistics of cricketers like Virat Kohli are not updated. In this regard, Wikidata can be integrated with Wikipedia articles, so that a single change on Wikidata automatically updates the data shown in Punjabi articles.

He also added that an assessment of existing events and initiatives, such as Project Tiger, would be useful to understand the challenges and opportunities for content creation on Indian language Wikipedias. The first Project Tiger edit-a-thon ran from 1 March to 31 May 2018, and the second, named “Project Tiger 2.0”, was organised from 10 October 2019 to 11 January 2020. Project Tiger coordinators can assess the number of views of the articles created during the event and thereby arrive at a potential strategy for the target audience. It is, therefore, useful to undertake such an analysis and evaluation at the end of big events.
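The kind of post-event assessment described above could be sketched roughly as follows. This is only a minimal illustration, not a method used in the study: the function name `summarise_event_views`, the article titles and the view counts are all hypothetical placeholders, and the view counts are assumed to have been collected separately (for example from Wikimedia's public pageview statistics).

```python
from typing import Dict, List, Tuple


def summarise_event_views(
    views: Dict[str, int], top_n: int = 3
) -> Tuple[float, List[Tuple[str, int]]]:
    """Return the mean views per article and the top_n most viewed articles.

    `views` maps an article title to its total view count over some period.
    Ties are broken alphabetically so the ranking is deterministic.
    """
    if not views:
        return 0.0, []
    ranked = sorted(views.items(), key=lambda item: (-item[1], item[0]))
    mean_views = sum(views.values()) / len(views)
    return mean_views, ranked[:top_n]


# Hypothetical data: placeholder titles and made-up view counts.
sample_views = {
    "Article A": 120,
    "Article B": 15,
    "Article C": 480,
    "Article D": 15,
}

mean_views, top_articles = summarise_event_views(sample_views, top_n=2)
print(f"Mean views per article: {mean_views}")
for title, count in top_articles:
    print(f"{title}: {count}")
```

A summary of this kind would let coordinators compare the readership of event articles with that of articles on topics related to Punjab, before deciding which topics to prioritise in the next edit-a-thon.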

Strategies for New Content Creation

It was observed that to fill this content gap about Punjab and its culture on Punjabi Wikipedia, the community needs to approach the topic deliberately. Every interviewee suggested organising an edit-a-thon to edit the top viewed articles. To maintain continuity in this approach, community members should cooperate with each other. Charan Gill[9] noted that we should engage students and professors in different subjects to collaborate and work on these areas with editors. This will be helpful in producing good quality articles. The community needs to engage experts from different areas or fields. Manavpreet Kaur[10] suggests that we should focus on good article criteria when we go to a college or institution to teach students how to edit Wikipedia. For example, she notes that there were very few articles about forensic science when she joined Punjabi Wikipedia, and the existing content consisted of two- or three-line articles. She tried to fill this gap and, as a professor of forensic science, engaged her students in editing Wikipedia content related to forensic science. We can take this kind of approach to fill the content gap on Punjabi Wikipedia. We should encourage colleges and teachers in Punjab to participate in this free knowledge movement. Wikimedia projects are platforms for the Punjabi language community to provide knowledge in their own language. Manavpreet and Nitesh also note that there are fewer women participants in this movement. To fill the gender gap we should also focus on engaging women contributors. Manavpreet found the Wikidata Game interesting and noted that, especially for new editors, these types of games and editing techniques are valuable for engaging younger generations of editors with this movement.

Benipal Hardarshan[11] shared his view that to engage the new generation with Punjabi Wikipedia or the broader free knowledge movement, we should also work on basic articles related to the technological world, for example, articles on computers and other devices, mobile games etc. He also notes that while Punjabi Wikipedia has articles on advanced topics, most of them are translated rather than new content, and it does not have basic articles of good quality that offer an appropriate understanding of a topic. There is also the problem of translating technical vocabulary into the Punjabi language, which can be addressed by engaging experts and scholars as part of edit-a-thons and other initiatives for content creation. This way, the energy and interests of volunteers may also be harnessed with the right methods. Key members of the community can assign articles to newcomers, which will help in further content creation. Nitesh shared her observation:

“I was a beginner type volunteer at one time and later on with experience I have organised various events within my community. So, we should encourage our team members or volunteers. They can be good organisers or leaders of this movement.”

To engage students, there should be more syllabus-oriented content on Punjabi Wikipedia. To bring a change in the structure, Stalinjeet suggested that we should not blindly follow the policies of other language editions, such as English and French, or the policies of the Wikimedia Foundation in India. We need to rethink these policies in the context of the needs of the local languages. Prioritization of edit-a-thons is also necessary, including coordination and working collaboratively on when to participate in which event.

In addition to the above, it has been noticed that it is difficult for communities with a small number of members to contribute collaboratively, so it is imperative to slowly increase the number of contributors as well. Volunteers are not limited to Punjabi Wikipedia; they can join any of the various Wikimedia projects according to their interests. A good number of people are also active on Punjabi Wikisource. The need of the hour, therefore, is to engage new people with Wikipedia or with this movement to make changes to the modes of access to knowledge in Indian languages. Experienced Wikimedians should share their learnings, apart from the training that is required for advanced editing.


As illustrated by the observations above, content creation on Eastern Punjabi Wikipedia faces a specific set of challenges. The people interviewed as part of this study have offered various suggestions on what can be done to address these limitations and improve the quality of content. Punjabi Wikipedia is an important source of information and knowledge for Punjabi internet readers due to the lack of websites providing content in the language. Its responsibility to provide reliable content in a sustainable manner is therefore even greater. Work on creating more content on these platforms needs to be undertaken after understanding how readers respond and engage. People today want to read less and learn more. Punjabi Wikipedia articles, therefore, need to be informative and include as many references as possible. A crucial gap here is also the lack of information on how to contribute to Punjabi Wikipedia in a productive and easy way. Well-documented help pages and more frequent training would help in addressing this shortcoming as well.

In conclusion, the main strategy to address these knowledge gaps, as illustrated by the learnings from this study, is to update existing articles on Punjabi Wikipedia as a priority, with a focus on expanding top viewed stub articles. A focus on the quality of content is therefore more important than quantity. In addition to this, knowledge-sharing by experienced Wikipedians, diverse modes of training and engaging new contributors, and working on strategies for sustainable content creation would go a long way in addressing the content gaps on Punjabi Wikipedia.


  1. Statistics as of 04 March 2021, 09:59 PM
  2. “Punjabi, Eastern,” Ethnologue, accessed February 9, 2021.
  3. Roy, C. Vijay. “In Punjab, over 70% people access the internet on the phone,” The Tribune India, May 2, 2019. Accessed February 9, 2021.
  4. As of 04 March 2021 09:59 AM
  5. As of 04 March 2021 09:59 AM
  6. Nitesh Gill is a research scholar from Moga, Punjab, pursuing her Ph.D. in Punjabi literature at the University of Delhi. She has been contributing to Punjabi Wikipedia since 2015. Her notable contribution is completing the 1000WikiDays challenge, which means writing one article every day. See: User:Nitesh_Gill
  7. Stalinjeet Brar is from Faridkot and has been contributing to Wikimedia projects since August 2014. He is doing his Ph.D. in Punjabi language, and his research topic is a comparative study of Punjabi Wikipedia and PunjabiPedia. See: User:Stalinjeet_Brar
  8. As of 04 March 2021, 09:59 AM
  9. Charan Gill is an experienced volunteer, aged 76. He has been contributing to Wikimedia projects since 2008. He is the top contributor from the Punjabi Wiki community, with more than 59,000 edits on Eastern Punjabi Wikipedia. See: User:Charan Gill
  10. Manavpreet Kaur works in the field of forensic science and completed her PhD in the same subject. She joined Punjabi Wikipedia in 2014; as a volunteer she has completed the 100WikiDays challenge and has done various forms of outreach for the community. See: User:Manvapreet Kaur
  11. A secondary school student, Benipal Hardarshan is one of the youngest Wikimedians in the Punjabi Wiki community. He made his first edit in 2014 and has been actively contributing since 2016. He is an active administrator on Punjabi Wikisource. See: User:Benipal hardarshan