
Research:Vital Knowledge Operationalization

From Meta, a Wikimedia project coordination wiki
This page documents a completed research project.


This project operationalized prior Vital Knowledge research with the Punjabi, Telugu, Ugandan, and Singaporean Wikimedia communities. Through co-creation with these four communities, we developed reusable workflows for generating vital knowledge lists.

Background


Previous research revealed that communities define vital knowledge differently based on local contexts, but all face challenges in systematically identifying and prioritizing content gaps. This project tested whether communities could co-create operational workflows using their preferred data sources (pageviews, education curricula, external encyclopedias) and validation processes (community consultations) to generate prioritized article lists.

Methods


We worked collaboratively with each community from October to December 2025 to:

  1. Co-design workflows tailored to community definitions of what is vital
  2. Implement data gathering using community-preferred sources
  3. Generate draft prioritized lists through hybrid automated/manual approaches
  4. Validate through community endorsement processes

Results


We successfully co-created vital knowledge workflows with four communities, resulting in draft lists and reusable methods. We demonstrated that while each community has a different definition of what is vital, there are consistent data points that are useful across all groups:

  • Pageviews and reader traffic patterns
  • Identifying gaps through comparisons with other online encyclopedias (for example, Infopedia and Punjabipedia)
  • External reader interest signals, such as news trends and Google searches

These commonalities confirm that a reusable, adaptable model for creating vital knowledge lists is possible, even though no single workflow is fully automated or scalable yet. The process remains community-dependent and requires significant manual effort, especially in mapping unwritten knowledge and adapting to regional variations within colonial languages. Even so, we have identified clear pathways to better support communities moving forward.

Key Insights

  • Vital ≠ Popular: The Telugu case showed zero overlap between the community's vital knowledge list and the missing top results in Telugu Google searches, highlighting the tension between reader demand (contemporary pop culture) and encyclopedic importance (foundational, academic topics)
  • Hybrid approaches are necessary: No single data signal (pageviews, searches, curricula) proved sufficient on its own
  • Language ≠ Culture: Single-language wikis often span multiple cultural contexts requiring regional adaptation
  • Consistently useful data points emerged: Despite different definitions of vital, all communities found pageviews, external encyclopedia comparisons, and reader interest signals valuable (see the Identified Data & Tooling Needs section)
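
The "Vital ≠ Popular" comparison can be checked mechanically once both lists exist. The following is a minimal sketch, assuming each list is simply a collection of article titles. The function name `overlap_report` and the sample titles are hypothetical and are not taken from the project data.

```python
def overlap_report(vital_titles, demand_titles):
    """Compare a community-curated vital list with a reader-demand list."""
    # Normalize titles so trivial differences (case, stray spaces) do not hide a match.
    vital = {t.strip().casefold() for t in vital_titles}
    demand = {t.strip().casefold() for t in demand_titles}
    shared = vital & demand
    union = vital | demand
    return {
        "shared": sorted(shared),
        # Jaccard index: 0.0 means no overlap, 1.0 means identical lists.
        "jaccard": len(shared) / len(union) if union else 0.0,
    }


# Hypothetical sample titles, for illustration only.
vital = ["Photosynthesis", "Periodic table", "Water cycle"]
demand = ["Film release A", "Cricket league B"]
report = overlap_report(vital, demand)
```

An empty `shared` list corresponds to the zero-overlap situation described above. Real title matching would likely need language-specific normalization that this sketch does not attempt.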

Operational Workflow Model


Based on our co-creation with four communities, we identified a reusable framework for vital knowledge list creation:

Phase 1: Foundation & Partnership

  • Community engagement and definition of what is vital for them
  • Identification of preferred data sources and validation processes

Phase 2: Data Gathering & Processing

  • Automated data collection (pageviews, cross-wiki comparisons)
  • Manual enhancement (community consultations, mapping unwritten knowledge)

Phase 3: List Creation & Prioritization

  • Gap analysis and triangulation across multiple data sources
  • Priority setting balancing community definitions of importance
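
Gap analysis and triangulation can be sketched as a weighted merge of several signals, restricted to topics that do not yet exist on the target wiki. This is an illustrative sketch rather than the workflow any of the four communities used. The source names, scores, and weights are all assumptions, and the weights stand in for priorities a community would set itself.

```python
from collections import defaultdict


def triangulate(signals, existing_titles, weights=None):
    """Rank missing articles by the weighted signals that point to them.

    signals: dict mapping a source name to a dict of {title: score in 0..1}.
    existing_titles: titles that already exist on the target wiki.
    weights: optional dict of per-source weights (default 1.0 each).
    """
    weights = weights or {}
    existing = set(existing_titles)
    totals = defaultdict(float)
    sources = defaultdict(list)
    for source, scores in signals.items():
        for title, score in scores.items():
            if title in existing:
                continue  # gap analysis: keep only missing topics
            totals[title] += weights.get(source, 1.0) * score
            sources[title].append(source)
    # Highest total first; ties broken alphabetically for stable output.
    ranked = sorted(totals, key=lambda t: (-totals[t], t))
    return [(t, round(totals[t], 3), sorted(sources[t])) for t in ranked]


# Hypothetical inputs, for illustration only.
signals = {
    "pageviews_other_wiki": {"Topic A": 0.9, "Topic B": 0.2},
    "curriculum": {"Topic B": 1.0, "Topic C": 1.0},
    "external_encyclopedia": {"Topic B": 1.0, "Topic D": 1.0},
}
ranked = triangulate(signals, existing_titles=["Topic D"], weights={"curriculum": 2.0})
```

Keeping the list of contributing sources next to each score lets reviewers see why a topic ranked where it did, which matters because the final prioritization remains a community decision.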

Phase 4: Community Endorsement

  • Structured community feedback and consultation
  • Formal sign-off and ratification of final lists

Phase 5: Activation & Scaling

  • Conversion of lists into actionable tasks
  • Model documentation and adaptation for other communities

Challenges & Opportunities


Current Limitations

  • Tool usability: Pageview data is not easily accessible for non-technical community members
  • Colonial language complexity: Defining what is vital in widely spoken colonial languages (e.g., Spanish, English) is highly dependent on region
  • Mapping unwritten knowledge: Still requires significant manual effort as automated signals (Google searches) surface popular rather than vital topics
  • Workflows are not yet fully automated or scalable: Each requires manual work by communities

Identified Data & Tooling Needs


Gap Discovery & Analysis

  • Easier methods to scan and identify gaps in Wikipedia content
  • Access to structured data from external platforms (e.g., search gap data or ways to compare other online encyclopedias against Wikipedia)
  • Better ways to identify missing interlanguage links
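
One possible way to approach missing interlanguage links is through Wikidata sitelinks. The sketch below assumes the caller has already fetched and parsed the `entities` object of a Wikidata `wbgetentities` response requested with `props=sitelinks`. The function name and the sample item ids are hypothetical. An item without a sitelink for the target wiki may mean either that the article is missing or that it exists but is not yet connected, so the output is a list for manual review, not a verdict.

```python
def missing_sitelinks(entities, source_site, target_site):
    """Return (item_id, source_title) pairs present on source_site but not target_site."""
    gaps = []
    for item_id, entity in entities.items():
        sitelinks = entity.get("sitelinks", {})
        if source_site in sitelinks and target_site not in sitelinks:
            gaps.append((item_id, sitelinks[source_site]["title"]))
    return sorted(gaps)


# Placeholder ids and titles shaped like a parsed response; not real Wikidata items.
entities = {
    "Q-example-1": {"sitelinks": {
        "enwiki": {"site": "enwiki", "title": "Example topic one"},
        "tewiki": {"site": "tewiki", "title": "Example topic one (te)"},
    }},
    "Q-example-2": {"sitelinks": {
        "enwiki": {"site": "enwiki", "title": "Example topic two"},
    }},
}
gaps = missing_sitelinks(entities, source_site="enwiki", target_site="tewiki")
```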

Reader Demand & Interest Signals

  • Easier access to pageview data (to better assess immediate reader demand)
  • Multilingual data on current events, breaking news, or other ways to monitor reader interest
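
For readers comfortable with a little scripting, pageview counts are available from the Wikimedia Pageviews REST API. The following sketch only builds a per-article request URL and sums the views in an already parsed response. It performs no network call, and the helper names, the sample page title, and the sample response are assumptions for illustration.

```python
from urllib.parse import quote

API_ROOT = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article"


def pageviews_url(project, title, start, end, granularity="monthly"):
    """Build a per-article pageviews request URL (dates as YYYYMMDD)."""
    # Titles use underscores instead of spaces and must be percent-encoded.
    article = quote(title.replace(" ", "_"), safe="")
    return f"{API_ROOT}/{project}/all-access/user/{article}/{granularity}/{start}/{end}"


def total_views(response):
    """Sum the 'views' field over the 'items' list of a parsed JSON response."""
    return sum(item["views"] for item in response.get("items", []))


# Hypothetical page title and response body, for illustration only.
url = pageviews_url("te.wikipedia.org", "Example page", "20251001", "20251231")
sample_response = {"items": [{"views": 10}, {"views": 5}]}
views = total_views(sample_response)
```

When sending the request, set a descriptive `User-Agent` header, as Wikimedia's API usage guidelines ask clients to identify themselves.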

Task Creation & Contribution Support

  • Improved identification and generation of micro-tasks from gap lists
  • Better tooling to turn prioritized lists into actionable newcomer-friendly tasks

Impact Measurement & Feedback

  • Better ways to measure success (e.g., closed gaps, improved article quality over time)
  • Clear indicators that show whether work on vital knowledge lists is reducing content gaps
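
A "closed gaps" indicator could be as simple as the share of originally missing titles that now exist. This is a minimal sketch under that assumption. The function name and sample titles are hypothetical, and the measure says nothing about article quality.

```python
def closure_rate(gap_titles, existing_now):
    """Return the share of originally missing titles that now exist, plus the remainder."""
    gaps = set(gap_titles)
    closed = gaps & set(existing_now)
    rate = len(closed) / len(gaps) if gaps else 0.0
    return rate, sorted(gaps - closed)


# Hypothetical titles, for illustration only.
rate, remaining = closure_rate(["A", "B", "C", "D"], ["A", "C", "Z"])
```

Tracking `remaining` alongside the rate shows which topics still need attention, not only how far the list has progressed.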

Tool Awareness & Ecosystem Navigation

  • Guidance on how to easily raise missing tool needs (e.g., for identifying missing interlanguage links)
  • Better visibility into current tool development and ongoing work (e.g., tools for correcting typos or improving content quality)

Conclusion


We successfully co-created vital knowledge workflows with four communities, generating draft lists and reusable methods. While each community has a different definition of what is vital for them, we identified consistent data points useful across all groups, confirming that a reusable, adaptable model for creating vital knowledge lists is possible.

The groundwork is now laid for activation. The next phase of this collaborative process could focus on:

  • Activating communities to create content based on these validated lists
  • Experimenting with interventions to support quality contributions
  • Understanding whether vital knowledge lists can make content work less overwhelming and more engaging for small communities
  • Improving access to the data points that were useful across communities

See also


Diff Blog Posts
