Research:AI-BRIDGES

From Meta, a Wikimedia project coordination wiki
Created: 01 September 2025 (UTC)
Contact: Dr. Shani Evenstein Sigalov
Collaborators: Core partners: WMDE, WMUK, WMBR, PleiAs. Many other organizations, institutions, affiliates and individuals. For more details see: ai-bridges.org
Duration: 2025-09 – 2027-08

This page documents a research project in progress.
Information may be incomplete and change as the project progresses.
Please contact the project lead before formally citing or reusing results from this page.


About

AI-BRIDGES (AI-Driven Bridging of Resources and Integration of Data Governance in Educational & Cultural Heritage Systems) is a research-driven, collaborative project exploring how institutional data, open knowledge infrastructures, and Generative AI (GenAI) can be better connected in ways that are sustainable, responsible, and globally inclusive.

The project is hosted at the Digital Humanities Research Hub, School of Advanced Study, University of London. It is funded by the European Commission through a Marie Skłodowska-Curie Postdoctoral Research Fellowship, spanning 2 years, from September 2025 to August 2027.

AI-BRIDGES is designed as a time-bound research project with long-term ambitions. Its research, collaborations and outputs are developed so they can continue to serve institutions, communities and the commons beyond the lifetime of the grant.

The Challenge

Across the knowledge ecosystem, institutions, open knowledge infrastructures, and AI systems still operate largely in silos. This makes it difficult to share high-quality data openly, reuse it at scale, and ensure it remains visible and meaningful in an AI-driven world.

  • Institutions and their data: Institutions steward vast amounts of high-quality data, from cultural heritage to research and public records. Yet differences in formats, standards, and capacity make this data difficult to share openly and sustainably.
  • Linked Open Data platforms: Platforms like Wikidata and Wikibase support open, structured, and multilingual knowledge at scale. However, contributing institutional data remains labor-intensive and hard to sustain, especially for smaller institutions.
  • Generative AI systems: Generative AI increasingly shapes how people access information, yet it relies on opaque data and makes limited use of structured open knowledge. This constrains transparency, accuracy, and inclusion.

Our Approach

AI-BRIDGES builds practical bridges between institutions, open knowledge infrastructures, and GenAI. Through research and experimentation, the project explores sustainable ways of connecting data, technology and participation. The work is organized into three interconnected strands:

  • Bridge 1: From institutions to open knowledge - We co-design low-code and no-code workflows that help institutions contribute data to open knowledge platforms more easily and sustainably.
  • Bridge 2: From open knowledge to AI systems - We explore ways to connect structured open data to large language models, enabling more transparent, contextual, and data-grounded AI interactions.
  • Bridge 3: From data to participation - We develop lightweight, participatory approaches that invite students and the public to contribute to data curation without requiring technical expertise.

Why this matters

GenAI is reshaping how knowledge is accessed and used across institutions, communities, and research. AI-BRIDGES works to ensure that institutional and open knowledge can meaningfully inform AI systems, in ways that support public-interest knowledge work for those who steward, build, study, and fund it.

  • For institutions: Lower the practical and technical barriers to sharing data openly and responsibly, while keeping it meaningful and usable in an AI-driven landscape.
  • For open knowledge communities: Address long-standing challenges of scale, sustainability, and inclusion, while staying grounded in shared values of openness and transparency.
  • For technologists: Work with large-scale, structured, multilingual open knowledge to explore how it can meaningfully inform AI systems in real-world contexts.
  • For researchers: Engage with rich empirical material across institutions, communities, and technologies, and contribute to interdisciplinary research that connects theory and practice.
  • For funders: Gain a grounded view into an emerging ecosystem of public-interest work at the intersection of data, AI, and the commons, and identify where support can make a lasting difference.

From Bridges to Practice: What we are actually doing

This section outlines how we approach this work, what we are actively building and testing, and the principles that guide how we work with different stakeholders. Central to this approach is the idea of “bridges”, which are not treated as metaphors, but as design challenges. Each bridge is translated into concrete workflows, tools, and practices, developed in ways that can be tested, adapted, and reused across multiple settings.

Institutions steward vast amounts of carefully curated data, yet sharing and reusing this data openly and sustainably remains challenging in practice. Open knowledge infrastructures have demonstrated what is possible at scale, but contributing institutional data often requires expertise, coordination, and long-term capacity that many institutions do not have.

At the same time, GenAI systems increasingly shape how knowledge is accessed, used, and produced, while making limited use of structured, community-governed data. As a result, valuable institutional knowledge remains underutilized, and AI-mediated access to information often lacks transparency, provenance and contextual grounding.

AI-BRIDGES starts from the premise that these challenges are interconnected and cannot be addressed in isolation. Rather than treating institutions, open knowledge infrastructures, and AI systems as separate domains, the project focuses on connecting them through practical workflows and shared practices. These connections are developed through multiple, interconnected strands of work (or buckets of work) that run in parallel and inform one another through iteration and collaboration.

Bucket 1: Making Institutional Data Easier to Share and Reuse

We work with institutions to understand how their data is prepared, managed, and constrained in practice, including technical, organizational and policy-related factors. Based on this understanding, we design and test low-code and no-code workflows that support institutions in contributing data to platforms such as Wikidata and Wikibase in more feasible and sustainable ways. This work focuses on reducing friction and lowering barriers to participation, while respecting institutional contexts and existing expertise. GenAI is explored as a support mechanism, for example in assisting data preparation or mapping tasks, rather than as a replacement for institutional knowledge or judgment.
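
As an illustration only, the kind of mapping step such a workflow might automate can be sketched in Python. The sketch assumes a simple spreadsheet export and emits commands in the QuickStatements V1 text format for creating Wikidata items. The column names, the column-to-property mapping, and the sample rows are hypothetical and are not taken from the project; P31 ("instance of") and P17 ("country") are existing Wikidata properties, and Q33506 (museum) and Q145 (United Kingdom) are existing items used here as sample values.

```python
import csv
import io

# Hypothetical mapping from spreadsheet columns to Wikidata properties.
# The column names are invented for this sketch.
COLUMN_TO_PROPERTY = {"instance_of": "P31", "country": "P17"}


def row_to_quickstatements(row: dict) -> list[str]:
    """Turn one spreadsheet row into QuickStatements V1 commands for a new item."""
    # CREATE starts a new item; LAST refers to the item just created.
    # Labels containing double quotes are not handled in this sketch.
    commands = ["CREATE", f'LAST\tLen\t"{row["label_en"]}"']
    for column, prop in COLUMN_TO_PROPERTY.items():
        value = (row.get(column) or "").strip()
        if value:  # skip empty cells rather than emitting a broken statement
            commands.append(f"LAST\t{prop}\t{value}")
    return commands


def csv_to_quickstatements(csv_text: str) -> str:
    """Convert a whole CSV export into one batch of commands."""
    reader = csv.DictReader(io.StringIO(csv_text))
    lines: list[str] = []
    for row in reader:
        lines.extend(row_to_quickstatements(row))
    return "\n".join(lines)


# Invented sample data: two records, the second with no country value.
SAMPLE = (
    "label_en,instance_of,country\n"
    "Example Museum,Q33506,Q145\n"
    "Example Collection,Q33506,\n"
)

if __name__ == "__main__":
    print(csv_to_quickstatements(SAMPLE))
```

In a workflow like this, the generated batch would still be reviewed by institutional staff before upload, in keeping with the paragraph above: the tooling supports preparation and mapping, while judgment stays with the institution.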

Bucket 2: Connecting Open Knowledge and Generative AI

We explore how structured, open, and community-governed data can be meaningfully integrated into AI-driven interfaces and workflows. This includes experimenting with open-source language models that enable institutions, researchers and the public to query, explore and analyze data using natural language. Rather than treating AI systems as standalone solutions, this strand focuses on grounding AI outputs in structured data, supporting transparency and reuse, and examining how different design choices affect trust, interpretation, and accountability in AI-mediated access to knowledge.
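
One possible reading of "grounding AI outputs in structured data" is sketched below in Python. It is a minimal illustration under stated assumptions, not the project's implementation: it builds a SPARQL query for the public Wikidata Query Service, then turns results in the standard SPARQL JSON format into source-attributed lines that could be placed in a language model prompt. The function names, the prompt wording, and the mock result are hypothetical; no network request or model call is made.

```python
# Public Wikidata Query Service endpoint (shown for reference; not called here).
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def build_query(type_qid: str, limit: int = 5) -> str:
    """Build a SPARQL query listing items that are instances of type_qid."""
    return (
        "SELECT ?item ?itemLabel WHERE {\n"
        f"  ?item wdt:P31 wd:{type_qid} .\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }\n'
        "}\n"
        f"LIMIT {limit}"
    )


def bindings_to_context(results: dict) -> str:
    """Format SPARQL JSON results as lines that keep the source URI of each fact."""
    lines = []
    for binding in results["results"]["bindings"]:
        uri = binding["item"]["value"]
        # Fall back to the URI when no label is available.
        label = binding.get("itemLabel", {}).get("value", uri)
        lines.append(f"- {label} (source: {uri})")
    return "\n".join(lines)


def grounded_prompt(question: str, results: dict) -> str:
    """Assemble a prompt that restricts the model to the retrieved facts."""
    return (
        "Answer using only the facts below and cite the source URI for each claim.\n"
        f"Facts:\n{bindings_to_context(results)}\n"
        f"Question: {question}"
    )


# Mock response in SPARQL JSON results format; the entry is placeholder data.
MOCK_RESULTS = {
    "results": {
        "bindings": [
            {
                "item": {"type": "uri", "value": "http://www.wikidata.org/entity/Q4115189"},
                "itemLabel": {"type": "literal", "value": "Wikidata Sandbox"},
            }
        ]
    }
}

if __name__ == "__main__":
    print(build_query("Q33506", limit=2))
    print(grounded_prompt("Which items were found?", MOCK_RESULTS))
```

Keeping the source URI next to each fact is one simple design choice that makes provenance visible in the model's answer; other designs, such as retrieval over embeddings, would trade off transparency and coverage differently.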

Bucket 3: Enabling Participation Through Lightweight and Inclusive Approaches

We develop and test participatory models that enable students and members of the public to contribute to data curation and enrichment in accessible ways. Building on prior experience with micro-contributions and gamified engagement models, we examine how AI-assisted tools can support participation, while maintaining care, accountability and respect for institutional contexts. This strand pays particular attention to how participation is structured, supported and recognized, and to how learning, contribution and responsibility are distributed across different roles.

Guiding Principles

AI-BRIDGES is guided by four commitments (pillars), expressed through 12 core principles, which describe how knowledge is produced, how data work is approached, how collaborations are structured, and how responsibility is carried forward over time.

Pillar 1: Research through practice

How knowledge is generated, tested, and learned in real-world contexts

  • Research grounded in practice, with real-world consequences and impact: AI-BRIDGES treats practice as a site of inquiry and responsibility. Research questions, methods and outcomes are shaped by real institutional and community contexts, with attention to how the work affects people, infrastructures, and decisions beyond academia.
  • Building on prior work and collective experience: The project builds on existing research, tools, infrastructures, and community knowledge, learning from past successes and failures before proposing new approaches. Innovation is pursued through extension, adaptation and alignment, rather than reinvention.
  • Learning through iterative practice, in real contexts: Learning emerges through building, testing and iterating on workflows, tools and practices in real-world settings, and is supported by ongoing reflection and adjustment, rather than abstract design alone.

Pillar 2: Holistic and inclusive data practices

How data work is framed, scoped and designed across contexts

  • Holistic approaches to data work: AI-BRIDGES approaches data work as a full lifecycle, from preparation and contribution to querying, analysis, reuse and impact. Individual workflows or tools cannot stand alone; sustainable practice requires attention to how data moves across systems and roles over time.
  • Global and multilingual by design: The project works across languages, regions, and institutional contexts, including those often overlooked, and treats multilingualism and contextual diversity as foundational design considerations rather than add-ons.
  • Openness with care: Open practices are pursued with attention to ethical, cultural, and institutional constraints. Decisions about participation, data sharing, and reuse recognize that not all data or knowledge can or should be fully open.

Pillar 3: Collaboration via partnerships and participation

How institutions, communities and contributors work together

  • Working across stakeholders, together: AI-BRIDGES engages multiple stakeholder groups simultaneously, including institutions, open knowledge communities, technologists, researchers, and funders. Progress depends on coordination and shared sense-making across these perspectives.
  • Institutions as partners, not data sources: Institutions are engaged as active partners in shaping the work, with attention to sustained collaboration, shared decision-making, and ongoing contribution rather than one-off data extraction.
  • Flexible and mindful participation: Participation is intentionally flexible, allowing contributors to engage in different ways and at different levels, depending on their interests, capacities, and contexts, while maintaining transparency, care, and mutual respect.

Pillar 4: Knowledge sustainability and stewardship

How responsibility for knowledge extends beyond individual projects

  • Contributing back to the commons: AI-BRIDGES treats contribution back to Open Knowledge and Open Data infrastructures as an integral part of the work, sharing outputs, documentation, and insights in communal spaces to support reuse and collective benefit.
  • Designing for sustainability beyond the project: The project is designed with the understanding that funding is time-bound, while the infrastructures, practices, and communities it engages with are not. Outputs are developed for durability and reuse beyond the project’s funded period.
  • Open Knowledge governance as an ethical public interest choice: In a landscape increasingly shaped by proprietary AI systems, AI-BRIDGES deliberately centers community-governed Open Knowledge and Open Data as foundations for public-interest AI.

Partners and collaborators

AI-BRIDGES is built through collaboration. The project brings together individuals, institutions, and communities that share a commitment to improving how data and knowledge are stewarded, connected, and reused in the age of generative AI. Collaborations with AI-BRIDGES span open knowledge infrastructures, cultural, academic and governmental institutions, research communities, AI developers and funders, reflecting the project’s focus on bridging domains that are often disconnected in practice. Contributors engage in different ways and at different depths, depending on their roles, capacities, interests and strategic goals. What connects them is a shared recognition that the challenges this project addresses cannot be tackled by any single organization, discipline, or region alone; they must be tackled together. If you wish to be added as a collaborator, please contact us.

Core Partners

  • WMDE: Core partner supporting the development and alignment of work related to Wikidata and Wikibase infrastructures, drawing on long-standing experience in open knowledge, data modelling, and community-led technical innovation.
  • WMUK: Core partner contributing extensive experience in institutional partnerships, education, and public engagement, and supporting collaboration with cultural, academic, and research institutions in the UK and beyond.
  • WMBR: Core partner bringing deep expertise in Wikidata innovation, technical capacity building, and community development, with a strong focus on multilingual contexts, inclusion, and collaboration across diverse institutional and regional settings.
  • PleiAs: Core partner contributing expertise in the development of open-source language models and AI systems for public-interest and institutional use, supporting exploration of how open knowledge infrastructures can meaningfully inform generative AI.

Additional Contributors and Collaborators

In addition to the core partners, AI-BRIDGES collaborates with a growing network of individuals, communities, organizations and institutions that contribute their time, expertise and perspectives to different parts of the project. Contributions take many forms, including sharing insights through interviews, providing metadata samples, engaging in design or technical exploration, collaborating on research, participating in the AI-BRIDGES Open Forum and related events, helping connect partners and initiatives working on related challenges, or giving valuable advice. Many collaborators engage with the project across multiple roles and contexts. Rather than categorizing participation strictly by affiliation or sector, AI-BRIDGES recognizes this plurality and approaches collaboration as a flexible, evolving process shaped by shared interests and capacities. The partners' page acknowledges and thanks the individuals, communities, organizations and institutions whose contributions help advance the project and shape its outcomes.

If you'd like to partner or collaborate in any way, please contact us.

Get Involved

AI-BRIDGES is a collaborative research project, and participation is central to how the work unfolds. People and institutions engage with the project in different ways and at different levels, depending on their interests, capacities, and contexts.

There is no single or required mode of participation: contributions of different kinds and scales are valued, and as an evolving project, participation is welcome at every stage. What connects all forms of involvement is a shared interest in improving how institutional data, open knowledge infrastructures, and generative AI can work together in responsible, sustainable, and publicly beneficial ways.

You can get involved in several ways: share insights in an interview, share sample datasets, become a thought partner and participate in the Open Forum, join forces on relevant research and additional academic efforts, or simply follow.

Interview: Share Experience and Insight

One way to engage with AI-BRIDGES is to take part in an interview and share your experience. The project is gathering perspectives from people who have worked with institutions, open knowledge infrastructures, and data-sharing initiatives, including those who have led, supported, or participated in such efforts. These conversations help build a clearer understanding of existing practices, challenges, and opportunities. Participation typically takes the form of a conversation or interview and does not require long-term commitment.

Contribute: Sample Datasets

AI-BRIDGES invites institutions and practitioners to share sample datasets. Contributed samples can support the research in three main ways:

1) Designing and testing the AI-BRIDGES pipeline, including workflows that help institutions prepare and contribute metadata to Wikidata, Wikibase, and related open knowledge infrastructures.

2) Training open-source language models that support responsible connections between GenAI systems and Linked Open Data platforms.

3) Curating and sharing sample datasets as open research resources in a dedicated open repository for the wider community, where license-compatible.

These samples help the project understand the diversity of institutional data formats, languages and practices, which then inform the design of workflows and tools that support data sharing and reuse. They make visible the constraints, edge cases and contextual factors that shape how data can be prepared and contributed in practice.

To share a dataset, please fill out the Sample Dataset Contribution Form, and then send your dataset via email.

If you know of existing open datasets that may be used, please share a link via email or this form.

Join: The AI-BRIDGES Open Forum

For those interested in ongoing engagement, AI-BRIDGES hosts an Open Forum, a recurring, open meeting space for shared learning, discussion, and collaboration. The Open Forum brings together institutions, researchers, technologists, and open knowledge practitioners working on related challenges. Meetings are semi-structured, typically featuring short presentations or examples, followed by discussion and exchange. The Open Forum is a key mechanism for collaboration across the project. Participation is open and flexible, and meetings are recorded and shared publicly to support transparency and asynchronous engagement. The AI-BRIDGES Open Forum is supported by a dedicated Google Group, where information is shared and curated. To take part and follow along, join the group: https://groups.google.com/g/ai-bridges.

Collaborate: Joint Research and Experimentation

AI-BRIDGES welcomes collaboration with researchers, technologists, and practitioners interested in joint exploration, experimentation, or research aligned with the project’s aims. This may include:

  • collaborative research questions
  • shared use of datasets or tools
  • co-design or testing of workflows
  • or exploratory technical work

Rather than prescribing collaboration formats in advance, the project approaches such work through dialogue and mutual interest.

Connect: Follow and Stay Informed

Not everyone has the time or capacity to participate directly. Staying informed about the project’s work is also a meaningful form of engagement. Updates, recordings and announcements are shared through the project website and communication channels, allowing people to follow the work, share it with others, or join later as the project evolves. For more details, see the News & Events tab in the top menu of the project's website.

If you have relevant experience and are interested in contributing in any way, please get in touch via email: contact@ai-bridges.org.

News & Events

AI-BRIDGES at the AI Impact Summit, Delhi, 2026

At the upcoming AI Impact Summit, hosted in Delhi, India, in 2026, AI-BRIDGES will host an expert panel in collaboration with AI Commons and Public AI.

More details coming soon.

AI-BRIDGES Open Forum: Monthly meetings beginning Feb 27th, 2026

AI-BRIDGES is hosting monthly meetings on the last Friday of the month at 15:00 UK time. The meetings are meant to bring together individuals, communities, organizations and institutions that are interested in bridging institutional data, Linked Open Data, and GenAI, with the main goal of aligning efforts and working collaboratively toward practical solutions. Recognizing how precious participants’ time is, the AI-BRIDGES Open Forum is designed to bring value to participants by adopting a semi-structured approach: each meeting will host 2-3 presenters, who will briefly share an aspect of their work. This could be an experiment they are involved in, or a project, tool or workflow they have designed, which can then inform the conversation. Meetings will be recorded and shared in a dedicated playlist, so that participants joining later can always follow and catch up. For more information, details and links to the meetings, please join the dedicated Google Group here: https://groups.google.com/g/ai-bridges.

The first meeting is scheduled for Feb 27th, 2026, at 15:00 UK time. To get a link, please join the Google Group.

For any further inquiries, please contact us at: contact@ai-bridges.org.

AI-BRIDGES Symposium: May 28-29, 2026

AI-BRIDGES will be hosting a Symposium at the Digital Humanities Research Hub, University of London. This event will bring together experts from various institutions, open knowledge and FAIR data advocates, technologists (especially focusing on GenAI), researchers and funders.

The symposium will be held on two consecutive days:

The main Symposium day (day 1, May 28th) will feature presentations, an expert panel, and small-group work toward solutions to critical problems identified by participants. Presentations and discussions will revolve around the role of GenAI in knowledge preservation and accessibility. The day will focus on finding real, practical solutions to the daily challenges that institutions and open knowledge advocates have been experiencing for years, as well as more recent challenges brought about by GenAI.

The training day (day 2, May 29th) will focus on capacity building and basic training for institutions and individuals interested in expanding their knowledge of Wikidata, Wikibase and their relationship to GenAI. This will include: an intro to Wikidata; setting up a Wikibase; and GenAI-related solutions, such as the embedding project, the Wikidata MCP and vibe coding.

The full program will be available in March 2026. More details here.

Results

TBD

Resources

TBD

References
