Wikimedia CH/Grant apply/Documentation for 2025 Wikidata Graph Split
Infodata
- Name of the project: Documentation for 2025 Wikidata Graph Split
- Amount requested: 5000 CHF
- Type of grantee: Group
- Name of the contact: Lane Rasberry, user:bluerasberry
- Contact: lanerasberry (at) gmail.com
In case of questions, please write to grant (at) wikimedia.ch
The problem and the context
What is the problem you're trying to solve?
The Wikimedia Foundation split the Wikidata SPARQL endpoint in April 2025. Users of Wikidata:WikiCite tools and processes are especially affected, because Wikidata queries which rely on WikiCite will not function until they are updated for the new Wikidata infrastructure. During a transitional period, users may query a "legacy graph" which will remain functional through December 2025, but the split is to be completed by January 2026, and all users of WikiCite content must update their tools before then. This project seeks to document the tools and processes which must be updated to remain usable. The WikiCite metrics indicate that there are tens of thousands of unique annual users, hundreds of Wikimedia editors, and dozens of institutional partners who use and develop affected WikiCite content, and who would use documentation from this project to understand and react to the changes.
As background, Wikidata is undergoing major changes as the Wikimedia Foundation is splitting its data into two graphs. This is documented in d:Wikidata:SPARQL query service/WDQS graph split and in a May 2024 article in The Signpost. One of the split pieces will essentially be the data of the d:WikiCite project, and the other will be everything else in Wikidata (including some WikiCite data too). The general effect is that all queries which seek to match Wikidata items for scholarly publications with any other Wikidata content will break unless rewritten, and may fail even then. WikiCite is one of the most developed Wikidata projects, and as such, this split is certain to affect thousands of users. A pilot split began in 2024, the transitional period ends after December 2025, and after that point everyone will need to use the two separate Wikidata graphs to access this content.
None of this is easy to explain to casual users. Additionally, the Wikimedia Foundation is planning major technical upgrades to Wikidata in 2027, and if Wikidata and WikiCite stakeholders are to give community feedback to guide those upgrades, then it is necessary to better describe the problems, possible solutions, and the various user communities which develop WikiCite content.
What is your solution to this problem (please explain the context and the solution)?
We will produce documentation in English which will support Wikidata users in understanding the graph split and in identifying activities related to the graph split where they can contribute. We will invite community feedback throughout the process, including on issues related to translations.
Project goals
- Establish and maintain an FAQ on the graph split
- Note what Wikidata activities change, and what tools, workflows, and users are affected
- We will pay special attention to questions posed by those who work on translations
- Convert existing meeting documentation on the graph split into a narrative on Meta-Wiki
- Report the results of the Scholia user survey 2024 to give insight into the WikiCite user community, and to guide a future survey
- Convert Scholia hackathon outcomes from October 2024, November 2024, and April 2025 into documentation associated with the overall WikiCite project
- Do storytelling for the proposals of planned Wikidata developments, and why they matter
- Benchmarking of potential Blazegraph replacements (QLever, Virtuoso, any others)
- Loading data dumps into a graph engine, then running federated queries with any number of SPARQL endpoints
- Re-assessing what content is desirable and reasonable for the Wikidata community to curate in Wikidata
- Rewriting Wikidata queries for federation
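As an illustration of the benchmarking goal above, the following is a minimal sketch of a timing harness that runs one query against several SPARQL endpoints and reports the median wall-clock time for each. It is not a tool produced by this project. The endpoint names and URLs are hypothetical, and `fake_run_query` is a stand-in that performs no network access; a real benchmark would pass in a function that sends the query with an HTTP or SPARQL client.

```python
import statistics
import time


def benchmark(run_query, endpoints, query, repeats=3):
    """Return the median time in seconds that `query` takes on each endpoint.

    `run_query(endpoint_url, query)` is supplied by the caller, so the same
    harness can wrap any SPARQL client or a test double.
    """
    results = {}
    for name, url in endpoints.items():
        timings = []
        for _ in range(repeats):
            start = time.perf_counter()
            run_query(url, query)
            timings.append(time.perf_counter() - start)
        results[name] = statistics.median(timings)
    return results


def fake_run_query(url, query):
    """Stand-in for a real SPARQL request; returns an empty result set."""
    return []


# Hypothetical local endpoints, e.g. two graph engines loaded from the same dump.
endpoints = {
    "engine-a": "http://localhost:7001/sparql",
    "engine-b": "http://localhost:7002/sparql",
}

timings = benchmark(fake_run_query, endpoints, "SELECT * WHERE { ?s ?p ?o } LIMIT 1")
for name, seconds in sorted(timings.items(), key=lambda item: item[1]):
    print(f"{name}: {seconds:.6f} s")
```

Passing the query runner in as a parameter keeps the timing logic independent of any particular engine, so the same harness could in principle be pointed at Blazegraph, QLever, Virtuoso, or another candidate.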
Project impact
How will you know if you have met your goals?
The above listed goals are for documentation. When that documentation exists, then this project will have met its goals.
The documentation will include the following:
- A WikiCite FAQ published to Meta-Wiki
- A document presenting the results of the Scholia survey
- A narrative of the development of Wikimedia community responses to the graph split, published to Meta-Wiki
- A citable summary with a DOI, published for the benefit of Wikimedia researchers so that they can cite a publication when they need to explain the basics of Wikidata's scholarly content, WikiCite, Scholia, and other scholarly data curation processes in Wikidata
- Basic instructions for re-writing Wikidata queries to comply with Wikidata's new infrastructure, where scholarly content is in one graph and other content is in the Wikidata main graph
- An article in English Wikipedia's Signpost newspaper
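To illustrate the kind of query rewriting that the planned instructions would cover, here is a minimal sketch that wraps the scholarly part of a query in a SPARQL `SERVICE` clause, so that a query sent to the main graph can still reach scholarly articles. It is not the project's documentation. The service IRI is an assumption and should be checked against the WDQS graph split documentation. The function name `build_federated_query` and the item `wd:Q_EMPLOYER` are hypothetical placeholders. The `wd:` and `wdt:` prefixes are assumed to be predefined by the query service.

```python
from textwrap import indent

# Assumption: federation IRI for the scholarly graph. Verify it against the
# WDQS graph split documentation before relying on it.
SCHOLARLY_SERVICE = "https://query-scholarly.wikidata.org/sparql"


def build_federated_query(variables, main_pattern, scholarly_pattern,
                          service_iri=SCHOLARLY_SERVICE, limit=10):
    """Build a SPARQL query intended to run on the main graph endpoint.

    Triple patterns about scholarly articles go inside a SERVICE clause so
    that they are evaluated against the scholarly graph instead.
    """
    select = " ".join(f"?{name}" for name in variables)
    service_block = (
        f"SERVICE <{service_iri}> {{\n"
        f"{indent(scholarly_pattern.strip(), '  ')}\n"
        f"}}"
    )
    body = main_pattern.strip() + "\n" + service_block
    return (
        f"SELECT {select} WHERE {{\n"
        f"{indent(body, '  ')}\n"
        f"}}\n"
        f"LIMIT {int(limit)}"
    )


# Hypothetical example: scholarly articles (Q13442814) whose author (P50)
# has a given employer (P108). wd:Q_EMPLOYER is a placeholder, not a real item.
query = build_federated_query(
    variables=["author", "article"],
    main_pattern="?author wdt:P108 wd:Q_EMPLOYER .",
    scholarly_pattern="?article wdt:P31 wd:Q13442814 ;\n         wdt:P50 ?author .",
)
print(query)
```

Before the split, both patterns could sit side by side in a single query. After it, the person and employer data stay in the main graph while the article data must be fetched through federation, which is why the two patterns are passed in separately here.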
Do you have any goals or metrics around participation or content?
This project's goals for user participation include identifying, acknowledging, and counting active contributors to WikiCite governance and policy development. We estimate that about 30 people are active in this capacity as of 2025. This does not include WikiCite contributors, who number in the hundreds, or WikiCite users, who number in the thousands.
The metrics for content are simply the publication of documents as listed in the goals.
Project plan
Activities
- Hiring contractors to assist in documentation writing.
- Drafting documentation; seeking WikiCite community response, criticism, and comment on those drafts.
- Finalizing presentation of the drafts in time for the WikiCite 2025 conference in August.
- Organizing an online event in September aimed at assisting community members who translate the documentation.
Budget
5,000 CHF
This money will be used to hire contractors to produce the content and to cover the administrative costs of transferring the money to them. Project coordination and management of contractors will come from Lane Rasberry, user:bluerasberry, who is Wikimedian in Residence in the School of Data Science at the University of Virginia in the United States. Lane is the grant writer here and is paid in kind by his institution. He will not take this money, but will hire the contractors and pay them with it.
Community engagement
Community engagement in support of this effort has already included multiple community events, including the April 2025 Scholia Hackathon and dozens of documented meetings regarding the graph split among Wikimedia community members and Wikimedia Foundation developers, and will continue with participation in the upcoming WikiCite 2025 conference. Talk pages related to WikiCite remain open for the usual Wikimedia community engagement, participation, and reactions.
Decision
I confirm the grant. As it's a rapid grant, you don't need a fiscal sponsor. --Ilario (talk) 21:41, 19 April 2025 (UTC)