
Research:Vital Knowledge Interviews/Telugu

From Meta, a Wikimedia project coordination wiki

This page documents a research project in progress.
Information may be incomplete and change as the project progresses.
Please contact the project lead before formally citing or reusing results from this page.


This page summarizes the participatory interview on Vital Knowledge conducted with board members of the Telugu Wikimedians User Group between July and September 2025.

Definition of Vital Knowledge


The Telugu community views Wikipedia as a crucial platform for providing factual, reference-based knowledge that is often locked away in physical libraries or behind copyright restrictions. Vital knowledge is defined as **factual knowledge that is freely accessible**, serving as a corrective to unverified information found on blogs and social media. Its scope is broad, encompassing everything from Telugu literature and local history to global subjects like medicine, science, world geography, and technology.

What Makes Knowledge Vital?


The community uses a multi-layered, implicit prioritization framework (an "unwritten rule"):

  • Geographical Hierarchy:
  1. Telugu States (Andhra Pradesh & Telangana): Highest priority, including detailed coverage of all ~27,000 villages, local history, culture, and prominent figures.
  2. India: Secondary priority, with focus on Telugu figures within pan-Indian contexts.
  3. The World: Global topics like science, technology, and world history, which are vital but lower in immediate priority.
  • Additional Factors:
  1. Urgency: The need to grow content and recruit more volunteers to handle the large amount of knowledge still to be added.
  2. Popularity & Relevance: Reader demand, as reflected in search trends and cultural events, guides article creation, balanced against culturally significant but less popular topics.

Usefulness for Shared Goals


Defining vital knowledge helps the community prioritize content creation in a structured way and balance popular demand with culturally significant topics. However, its effectiveness depends on solving two major challenges:

  • Volunteer Shortage: There are not enough active editors to create and maintain content at the desired volume and quality.
  • Lack of Advanced Tools: Efforts are hard to scale because specialized tools for editing, translation, and content gap analysis are insufficient.

Tools and Missing Support


The community has a clear vision for needed tools:

Translation Enhancement

  • Improved machine translation for Telugu that automatically corrects recurring errors and produces natural-sounding text.
    • MINT translation was shared with the team as an option, since it supports Telugu. It would be useful to understand how it performs compared to Google Translate, and to assess how it handles automatic correction of known or recurring errors, similar to the automatic post-correction that OCR algorithms perform.
  • A notification system that alerts an editor when a source article they translated has been updated.
    • The Language and Product Localization team has explored this feature in T287093: Notify about new content available for articles recently translated by the user.
      • This proposed system would automatically alert users when new substantial content sections are added to source articles they previously translated, helping them keep translations up-to-date.
    • Current workaround: Editors can manually add source articles to their watchlist for notifications.
  • It should also be easier to translate or adapt user scripts from other wikis (such as English Wikipedia) into other languages without needing to understand the underlying code.
    • I've been investigating how to make popular English Wikipedia user scripts more accessible to non-technical editors in other language communities.
    • The technical foundation exists in Message Bundles, which can handle UI translation without code changes, although initial developer setup is required.
    • Analysis of widely-used gadgets shows strong candidates for adaptation, particularly default-enabled tools like Navigation Popups and Reference Tooltips (7M+ users) and popular opt-in tools like Twinkle and HotCat (10k-50k users).
    • Next steps include analyzing 1-2 high-impact gadgets to confirm translatability and creating a "translation cookbook" with video tutorials for the complete adaptation process.
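
The notification idea in this list can be illustrated with a minimal sketch. It is not how T287093 is implemented. It assumes the section headings and text of the source article are already available for two points in time (when the translation was made, and now), and it uses a hypothetical `min_chars` threshold to decide what counts as "substantial" new content.

```python
def new_sections_since_translation(sections_then, sections_now, min_chars=500):
    """Return headings of substantial sections added since translation.

    sections_then / sections_now: dict mapping section heading -> section text
    min_chars: hypothetical threshold for "substantial" content (an assumption)
    """
    return [
        heading
        for heading, text in sections_now.items()
        if heading not in sections_then and len(text) >= min_chars
    ]


if __name__ == "__main__":
    # Illustrative data only; real input would come from article revisions.
    at_translation = {"History": "x" * 800}
    today = {"History": "x" * 900, "Economy": "y" * 600, "See also": "z" * 40}
    for heading in new_sections_since_translation(at_translation, today):
        print(f"New section in source article: {heading}")
```

The sketch only detects sections that are entirely new. Growth inside an existing section, or a renamed heading, would need a more careful comparison.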

Content Creation & Improvement

  • A tool that suggests related articles to create after publishing a new article (e.g., to fill red links).
    • SuggestBot used to do this for all the articles it suggested to subscribers. When Morten Warncke-Wang experimented with those signals, one of the key takeaways was that there will always be cases where the suggestions don't fit. For example, an editor pointed out that it was pointless to ask them to add more content to an article about a local politician, because there was simply nothing more to write. High-quality articles tend to be of a certain length, but not every topic can be written about at that length.
    • Being able to report that a signal is not helpful for a given article or revision could help, so that the system does not keep repeating the same suggestions. I have followed up with one of the creators to see what would be needed to implement it in other languages, such as Telugu.
  • Article Quality Assessment: A tool to evaluate article quality against known standards and a local list of common typos.
  • Individual Contribution Evaluation: A tool for editors to self-assess their work and track improvement over time.
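
As a rough illustration of the red-link suggestion idea in this list, the sketch below extracts link targets from a newly published article's wikitext and keeps those that have no article yet. It also accepts a set of dismissed titles, reflecting the point that editors should be able to mark a suggestion as unhelpful. The function name, the `dismissed` parameter, and the sample titles are assumptions for illustration. The link parsing is deliberately simplified: it does not normalize titles the way MediaWiki does, and it skips any target containing a colon.

```python
import re

# Matches [[Target]], [[Target|label]] and [[Target#Section]]; captures the target.
WIKILINK = re.compile(r"\[\[([^\[\]|#]+)(?:#[^\[\]|]*)?(?:\|[^\[\]]*)?\]\]")


def suggest_missing_articles(wikitext, existing_titles, dismissed=frozenset()):
    """Return linked titles that have no article yet, in order of appearance."""
    suggestions = []
    seen = set()
    for match in WIKILINK.finditer(wikitext):
        title = match.group(1).strip()
        if ":" in title:
            # Simplification: skips File:, Category: and interwiki links,
            # but also any article title that happens to contain a colon.
            continue
        if title in existing_titles or title in dismissed or title in seen:
            continue
        seen.add(title)
        suggestions.append(title)
    return suggestions


if __name__ == "__main__":
    text = "[[Guntur]] is near [[Tenali|the town]]; see [[Amaravati#History]]."
    print(suggest_missing_articles(text, existing_titles={"Guntur"}))
```

In practice the set of existing titles would come from the wiki itself rather than being passed in by hand.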

Gap Analysis & Prioritization


A unified tool that consolidates various pre-existing lists (e.g., "Vital Articles" from English Wikipedia, lists of articles missing in Telugu but present in other Indian languages, and lists of prominent Indian scientists, award winners, or other notable people who do not have an article in Telugu) to provide a single, actionable list of content gaps.

  • The team has already created a query to get lists of articles missing in Telugu but present in other Indian languages, and I will try to make a list of existing workflows that could bring us closer to this wish.
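
The consolidation step described above could look something like the following sketch. It assumes each source list has already been reduced to a set of shared item identifiers (for example, Wikidata IDs) and that the set of items with an existing Telugu article is known. The ranking rule, which puts items named by more lists first, is an assumption for illustration and not a method the community has agreed on. The list names and identifiers are placeholders.

```python
from collections import defaultdict


def consolidate_gaps(source_lists, existing_in_telugu):
    """Merge several topic lists into one ranked list of missing articles.

    source_lists: dict mapping a list name to an iterable of item IDs
    existing_in_telugu: set of item IDs that already have a Telugu article
    Returns (item ID, sorted list names) pairs, most-listed items first.
    """
    sources_by_item = defaultdict(set)
    for list_name, items in source_lists.items():
        for item in items:
            if item not in existing_in_telugu:
                sources_by_item[item].add(list_name)
    # Items named by more lists rank first; ties are broken by ID for stable output.
    ranked = sorted(sources_by_item.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    return [(item, sorted(names)) for item, names in ranked]


if __name__ == "__main__":
    lists = {
        "vital_articles": ["A", "B", "C"],
        "missing_vs_indic": ["B", "D"],
        "award_winners": ["B", "C"],
    }
    for item, names in consolidate_gaps(lists, existing_in_telugu={"A"}):
        print(item, names)
```

Keeping the names of the contributing lists alongside each item lets editors see why an article was suggested, which may matter when balancing popular demand against culturally significant topics.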