Grants talk:Project/MHz Curationist/Building a sustainable system that unlocks museum metadata for Wikidata use
Question from Joalpe
Hey all (pinging Dominic as grant proponent). Thank you for the proposal. I am writing here as a member of the project grants committee to have a better assessment of the proposal being made here.
- Knowledge equity is mentioned in the proposal as being part of the vision of this development, which is awesome. However, I would like a better understanding of the institutions you are considering here. Is this specifically geared towards smaller, less well-resourced institutions in the Global North? Or if you are considering a more universal approach: what steps are being taken to bring institutions from the Global South into this vision? I was surprised no mention of translation was made, which I would assume suggests an English-only system.
- Could you please specify what role each participant will take in the project?
- Could you please provide more details on how the budget was estimated/calculated?
- As this proposal is being led by a non-Wikimedia organization, is it possible to have a better account of previous activities being done by the MHz Foundation with the Wikimedia community? Has this organization applied for previous grants? How does this external funding relate --if it relates-- to a sustained process within the community?
I hope these questions make sense. Thank you! --Joalpe (talk) 15:54, 31 March 2021 (UTC)
- @Joalpe: Thanks for these questions. We definitely appreciate the opportunity to expand on some of these issues. Sorry for the delay in replying, I just wanted to make sure you know I've been working with the team over the last few days in getting a response! Dominic (talk) 01:57, 7 April 2021 (UTC)
- @Joalpe: Please see the answers below. These are written in first-person by myself for the sake of presenting a single coherent response, but were written/vetted collaboratively with the rest of the Curationist team's input.
- Thanks for this question, I think it is definitely worth fleshing out more, because we do not want to give the impression that the reference to equity was only rhetorical. In fact, MHz is an international team with a strong track record when it comes to telling underrepresented narratives, and sees this as one of the central purposes of the platform—with its editorial and curated content, Curationist can take museum content and use it to proactively focus on those untold stories. As they state in big words on their home page: "Geographic diversity, anti-colonial, anti-racist, feminist, and queer practices, are among those that guide how we build our teams, how we deliver content, and how we cultivate our community." Also, here are a couple of examples from Curationist's editorial content that show how they use open access museum collections to tell these stories: "Westward the Course of Empire Takes Its Way—how a monument to Manifest Destiny became enshrined in the US Capitol Building" & "Golestan Palace: a Landmark Reflecting 150 Years of a Changing Tehran". MHz Foundation and its curatorial advisors are partners of knowledge equity organizations such as Whose Knowledge? and Local Contexts.
In terms of your specific question about translation, multilingualism is a platform goal for Curationist, which is currently still in beta. Recently, Curationist's metadata consultant, Sharon Mizota, wrote about just this, describing the challenge of "creating a taxonomy for categorizing and describing the assets on the site, which may come from anywhere in the world, in any language, and range from paintings and sculptures to eBooks and 3D models." Within the schema are fields for translation. These fields will be accessible through our Community Metadata Generator (CMG) tools (which this project is a foundational dependency of). The desire for multilingualism is actually a key reason Wikidata has been chosen as the source of Curationist's controlled descriptive vocabulary.
Finally, in terms of incorporating Global South institutions, we fully expect that this project will enable that vision. We want to reiterate that this project is intended not just to provide data from those initial institutions we listed in Goal 2, but to develop a system by which any GLAM data set could be imported and continuously synchronized with Wikidata. The first institutions were chosen solely because they have been identified as requiring the least initial effort to pull off, because of factors such as ease of use and because these were institutions with which members of the team already had familiarity. Working with these existing available open access APIs (from Global North orgs who had the funding and support to make these APIs available) we will learn from the process of building the system for the explicit purpose of making that capacity available to underfunded, marginalized, and/or Global South GLAMs. I think that is actually one of the proposal's main selling points: to set up a tool that would continue to be available to institutions long after the grant period ended.
- In terms of participant roles, Virginia is the product owner for Curationist, while Thomas is the technical project manager across the whole platform. They are the principals on the grant for MHz Foundation, and would also be responsible for maintaining the project post-grant period.
The software development work is split between tasks that are backend and Curationist platform-facing and those that are Wikimedia-facing. The backend work (i.e., enabling the Curationist workflows and aggregation data model to support the needs of Wikidata integration) would be performed by Datacrafted, who are the firm MHz has already contracted for all the platform's backend development and database management beyond this project.
The Wikimedia-facing tasks, which include building a bot script, would need to be undertaken by an experienced Wikimedian who understands tools potentially including Pywikibot, the Wikidata API, and OpenRefine, and who is able to communicate with the Wikidata community. I am attached to this project myself as an advisor, and have done similar work that this project idea evolves from, namely developing the Cleveland Museum of Art's Wikidata bot (which you can learn more about at the current bot request discussion), as well as many thousands of other automated edits (both to Wikidata and Wikimedia Commons) for my past positions at the US National Archives and Digital Public Library of America. I would likely be the developer carrying out all or most of this Wikimedia-facing work, or if I were not able (or we decided to split the work), I would definitely be the team member responsible for finding someone suitable for that role, using my deep connections to the Wikimedia community. Additionally, the project is being advised by User:LoriLee, past Wikimedia Foundation US Cultural Partnerships Coordinator, and her digital agency 1909 Digital (which I work through) would also be able to assist in ensuring that the work of creating a Wikimedia bot, and related activities, would only be carried out by a community member with the skills and credibility to accomplish it.
- We worked with our existing teams of developers, designers and technical project managers to scope out the level of effort required to build what we are proposing and map that to the costs associated with their time and skill-sets. Specifically, we broke the project down into the discrete components in #Activities, and different components would be accomplished by the team members with the applicable skill sets, who have also provided the cost estimates. The first tasks in the "Prep Work on Curationist Environment" will be done by the software development firm DataCrafted.io, which is familiar with the site’s technology, as it is already the main backend developer for the platform. They have provided those cost estimates.
For the components that require specialized Wikimedia-related knowledge, such as the bot development, I have provided this cost estimate myself based on my experience having already done similar work before. I previously developed a bot that synchronizes Cleveland Museum of Art metadata to Wikidata. This envisioned bot would have similar functionalities, but starting from a different data structure, and with some additional features (synching data in both directions). This also includes time for the intellectual labor of the initial data mapping itself, which is necessary for the project. Aside from the software development, we have also estimated additional cost for the community outreach component, because we are certainly aware there will be time involved in engaging the community in raising awareness, seeking input on design, applying for bot approval, and continually monitoring for error reports or other feedback once the system is in operation.
- We are certainly fully aware that the MHz Foundation has not yet undertaken a large-scale Wikimedia project, and is a first-time applicant for grants this year. There are a couple of other points important to consider, though: (1) in its mission and outlook, MHz is a peer organization in the open knowledge and knowledge equity space, so it is a natural partner for Wikimedia, and (2) having already launched the beta version of Curationist.org in 2019, the MHz Foundation has a track record in undertaking complex technical projects. The mission of the MHz Foundation is "to connect people with global cultural resources and perspectives through open knowledge". I encourage you also to read their 2020 Annual Report to learn more about the organization. So it is important to understand that MHz is an aligned organization that understands working with open communities.
What is also important is that the MHz Foundation is not itself a single cultural institution, but a knowledge organization whose goal is, like the Wikimedia community, to take available open access cultural collections of the world and add meaning and value to them through aggregation, curation, and context. That is why we developed this specific proposal, as it is one project that meets a need of the Wikimedia community, but also happens to be work that is very much in their wheelhouse and achievable for them, having already laid much of the technical groundwork. My feeling, as a long-time Wikimedian myself, is that while MHz may seem new, they actually are one of the best-positioned organizations to undertake a project such as we are proposing, in particular because the aggregation work that makes this possible in the first place is already part of their platform, and it will happen without itself requiring Wikimedia funding. Another organization that was not already engaged in aggregation and metadata mapping from open access institutions would have a harder time and require more budget to accomplish the outcomes promised. The purpose of this Wikimedia funding is that it will primarily cover contractor (non-staff) costs necessary to set up the envisioned system, but we see it as sustainable because the idea is that once developed, it will continue to be used and maintained by the Curationist staff who are already funded. The MHz Foundation submitted grants in the initial community organizing round of this year. You can see that neither of them was as strong as this one, because they were not led by the Wikimedia community, and I actually advised them, while helping with this proposal, that it would be better to withdraw those and focus efforts here where there is the clearest benefit to the Wikimedia community.
In addition to my involvement, MHz Foundation has also been advised in this project by User:LoriLee and Neal Stimler, both long-time Wikimedians active in GLAM-Wiki, who worked with MHz in establishing their Wikimedia engagement initiative, and helping to connect them with the Wikimedia community. Dominic (talk) 07:39, 7 April 2021 (UTC)
Eligibility provisionally confirmed, Round 2 2021 - Research and Software proposal
We've provisionally confirmed your proposal is eligible for review in Round 2 2021 for Research and Software projects, contingent upon:
- confirmation that the project will not depend on staff from the Wikimedia Foundation for code review, integration or other technical support during or after the project, unless those staff are part of the Project Team.
- compliance with our COVID-19 guidelines.
Schedule delay
Please note that due to unexpected delays in the review process, committee scoring will take place from April 17 through May 2, instead of April 9-24, as originally planned.
- Please watch your talkpage, which will be the primary method of communication about your proposal. We appreciate your timely response to questions and comments posted there.
- Please refrain from making changes to your proposal during the scoring period, so that all committee members score the same version of your proposal.
- After the scoring period ends, you are welcome to make further changes to your proposal in response to committee comments.
COVID-19 planning for travel and/or offline events
Proposals that include travel and/or offline events must ensure that all of the following are true:
- You must review and can comply with the guidelines linked above.
- If necessary because of COVID-19 safety risks, you must be able to complete the core components of your proposed work plan _without_ offline events or travel.
- You must be able to postpone any planned offline events or travel until the Wikimedia Foundation’s guidelines allow for them, without significant harm to the goals of your project.
- You must include a COVID-19 planning section in your activities plan. In this section, you should provide a brief summary of how your project plan will meet COVID-19 guidelines, and how it would impact your project if travel and offline events prove unfeasible throughout the entire life of your project.
Community engagement
We encourage you to make sure that stakeholders, volunteers, and/or communities impacted by your proposed project are aware of your proposal and invite them to give feedback on your talkpage. This is a great way to make sure that you are meeting the needs of the people you plan to work with and it can help you improve your project.
- If you are applying for funds in a region where there is a Wikimedia Affiliate working, we encourage you to let them know about your project, too.
- If you are a Wikimedia Affiliate applying for a Project Grant: A special reminder that our guidelines and criteria require you to announce your Project Grant requests on your official user group page on Meta and a local language forum that is recognized by your group, to allow adequate space for objections and support to be voiced.
We look forward to engaging with you in this Round!
Questions? Contact us at projectgrants wikimedia · org. Marti (WMF) (talk) 05:39, 17 April 2021 (UTC)
Aggregated feedback from the committee for Building a sustainable system that unlocks museum metadata for Wikidata use
Scoring rubric | Score
(A) Impact potential | 5.6
(B) Community engagement | 5.2
(C) Ability to execute | 5.0
(D) Measures of success | 4.6
Additional comments from the Committee:
Opportunity to respond to committee comments in the next week
The Project Grants Committee has conducted a preliminary assessment of your proposal. Based on their initial review, a majority of committee reviewers have not recommended your proposal for funding. You can read more about their reasons for this decision in their comments above. Before the committee finalizes this decision, they would like to provide you with an opportunity to respond to their comments.
Next steps:
- Aggregated comments from the committee are posted above. Note that these comments may vary, or even contradict each other, since they reflect the conclusions of multiple individual committee members who independently reviewed this proposal. We recommend that you review all the feedback carefully and post any responses, clarifications or questions on this talk page by 5pm UTC on Tuesday, May 11, 2021. If you make any revisions to your proposal based on committee feedback, we recommend that you also summarize the changes on your talkpage.
- The committee will review any additional feedback you post on your talkpage before making a final funding decision. A decision will be announced Thursday, May 27, 2021.
Marti (WMF) (talk) 04:27, 5 May 2021 (UTC)
Responses
Thank you to all committee members for reviewing this proposal! There were some common elements in the feedback, so in an effort to streamline our responses, we have arranged them into the table below. The table includes only comments where we saw a question or critique. We have highlighted each theme in a different color, and the link after each highlight leads to the corresponding response below. Also, since one comment noted that communication has been done so far only by me, I want to make particular note of the fact that these comments are drafted by the core MHz team, with only advice from me. The words you see below are about 90% written by Christian Dawson, the Executive Director of the MHz Foundation. I have volunteered to post all of the grant communications under my account as the most experienced Wikimedian, but you are hearing directly from the team. Dominic (talk) 16:48, 11 May 2021 (UTC)
- Concern about "Global North" focus
We've mentioned working with the Rijksmuseum and Smithsonian, so it's natural to assume a Global North focus. Our focus is Global South / Global Majority, and our prime goal is the development of baseline infrastructure that will allow small GLAM institutions to do what the Rijksmuseum and Smithsonian have done, to contribute to the Creative Commons easily through a set of technical tools designed for them. To do that effectively, we need to make sure that the tools we are building are compatible with ways that GLAM institutions in the Global North operate, because we are focused on data normalization, and taking stock of their methods to ensure interoperability is therefore required. The Smithsonian and the Rijksmuseum don't need help telling the stories of their cultural objects, and indeed this is not an attempt to fund a project that is primarily for their benefit. Instead, we are leveraging Smithsonian's large and diverse collection mostly as a proving ground to help us prepare for the future work with subsequent institutions. Our core focus is to raise under-represented voices, which means focusing our attention on cultural objects in the Global South / Global Majority and empowering the curators and the digital storytellers there to be able to have as effective a platform for their voices as the larger Global North GLAM institutions do.
We are building partnerships with Global South institutions on our own, and we are also highly receptive to the suggestion that was made: if WMF knows of receptive Global South GLAM institutions that it could connect with this project, we would like to work with them.
- "Vagueness" concerns
Perhaps to clarify our request, we should give a bit of broader context. Curationist started as a platform for curators to produce museum-quality online exhibitions leveraging the Creative Commons, focused on CC0 and CC-BY content. We had a dream of opening up the platform to anybody who wanted to use it to curate an editorial exhibit. In doing so, we wanted to pull from a number of global GLAM sources with a focus on smaller institutions so that we could help raise underrepresented voices. We had some challenges in doing so, because we found that each GLAM institution used a different taxonomy and metadata schema, and different API calls with different access requirements, making it really difficult for curators to run an effective search because results looked different depending on where they originated. We started dreaming about what a data store of CC0 and CC-BY content would look like with normalized metadata and taxonomy, then we started building it. We've now got a schema, built on VRA Core 4 with elements from Dublin Core and Schema.org, and we're using it to try to normalize the GLAM content in the Creative Commons, while maximizing the power of metadata, after importing metadata from various sources into a single, consistent data structure. We are exploring the pedagogical potential of metadata, and as we pull all this Creative Commons GLAM content into a single source, we'll be looking at adding folksonomy tags, traditional knowledge notices, translations and transcriptions, alt-text (as poetry), verification/correction notices, Indigenous or Diasporic place names, cultural re-contextualizations, and more!
So that we're clear, that's the project as it stands without the Wikimedia Project Grant funding.
What happens with the Wikimedia funding is that we then have the funds to tie our work to yours—to make sure that all this pedagogical work that we are doing benefits the Wikimedia community too, and so that it's easy to contribute all that metadata work back and forth. We're funding our work, you are helping fund the connection of our work to yours.
- Prior community involvement
It's true we've had our heads down focused on our own platform and our own GLAM data store, metadata schema and taxonomies. We've spent time over the past couple of years trying to get to know the Wikimedian community, but we are not insiders. We have therefore called on Wikimedian insider friends as collaborators to help put this proposal forward: people we know can help us to perform the work and ensure it gets done in a manner that maximizes the benefit to the Wikimedia community, and does so in ways that advance Wikimedian ideologies. Our team lead on this project, Dominic Byrd-McDevitt, is a longtime Wikimedian in Residence and bot developer/operator, and has been an active member of this community for many years. Through him and the other Wikimedian resources he is helping to bring to the team, we are making sure that the team can be an effective conduit for the needs of your community as we forge the connections between our important work and your own. All of the community-facing aspects of this project, such as engaging the Wikidata community for input and navigating the bot approval process on Wikidata, would be undertaken by knowledgeable Wikimedians. We also note the endorsement of users like Susanna Ånäs, which indicates that the core community members we'd be working with are not concerned about our ability to work with the Wikimedia community after they have met with us.
- Hesitancy related to funding institution
The way we see it, this is the kind of project that has an obvious multiplier effect, with the potential of providing quick, high-quality benefit to the community far greater than the costs associated with the project. This is because it builds on work that is already grant funded, even though the scope of that grant doesn't include Wikipedia linkages. By focusing only on funding those linkages, you get access to a project that would otherwise be an island apart from the Wikimedian community. Simultaneously, the Curationist project would benefit from the linkages by being able to more easily collaborate with Wikimedians on GLAM object metadata. In the process, it is our hope that our project would bring new volunteers to your community, and as our focus is on the Global South / Global Majority, when we target those communities for metadata work, they would immediately become familiar with the interoperability with Wikimedia data, bringing more diversity to your own community. The added benefit of funding an institution is that the product of this project will be carried forward into the future by an organization whose professional staff will continue to maintain and extend the platform. There is also a benefit to working with a professional organization that has a larger focus, potentially giving the project more longevity than if an ad hoc group of individuals were coming together for this one tool. We understand that there is some hesitancy to funding an external institution, but in this case the funding request is not for general operations but for a very narrow project to develop connective tissue between our important project and your own, which is work not covered by the scope of our existing grant.
- Specific technical question
One reviewer was uncertain about "how Curationist will be able to cope with different data formats and will solve all existing troubles with mapping various data". Since this is a very important technical question, we would gladly go into more detail on this specifically, because this has definitely been considered. In fact, it's one of the main technical problems Curationist is solving. We'd like to note, for clarity's sake, that the scope of the grant does not specifically cover this work. This is a benefit, because one of the reasons the Curationist team is well-positioned to take on the proposed project is that its existing funding already enables all the development for the metadata aggregation that is the prerequisite for the work proposed here. Since the Curationist platform itself already depends on this data, the aggregation will be developed regardless of whether the project is funded, and is already underway. New data sources will continue to be brought into Curationist as they are mapped and their data becomes accessible via API or some other means. It could even be possible for the Wikimedia community to communally develop mappings for new sources to be incorporated.
What exactly does this precursor work include? Curationist has already designed a common metadata content standard which can accommodate all the data from the initial data sources. These data sources' APIs have been surveyed, and mappings have been created from their data models to Curationist's. Our backend developer has already been contracted with outside funding to develop the system which will harvest the metadata records from these sources and ingest them into the Curationist dataset. Each time a user of the site conducts a search via the site's UI, Curationist's internal search API queries across all of the data sources, and any results that are not yet already in Curationist's database are collected, so the database is continuously updated. Building on all this prior work, it is from this database that a single mapping and pipeline to Wikidata will be developed to allow any Curationist source to be imported to Wikidata.
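To make the shape of that pipeline more concrete, here is a minimal Python sketch. It is an illustration only, not Curationist's actual code: the source names, field names, and function names are all hypothetical, and the real schema (built on VRA Core 4) is far richer. The only real identifiers are the Wikidata properties P1476 (title) and P217 (inventory number). The sketch shows per-source mappings into one common record, ingestion that only adds records not already in the database, and a single common-to-Wikidata mapping that serves every source.

```python
from typing import Dict, List, Tuple

Record = Dict[str, str]

# Hypothetical per-source mappings: source field name -> common field name.
SOURCE_MAPPINGS: Dict[str, Dict[str, str]] = {
    "source_a": {"title": "title", "maker": "creator", "accession": "inventory_number"},
    "source_b": {"objectTitle": "title", "artistName": "creator", "objectNumber": "inventory_number"},
}

# One mapping from the common schema to Wikidata properties, shared by all sources.
# Creator is left out on purpose: on Wikidata it is item-valued, so a name string
# would first need to be reconciled to an item (assumed to happen elsewhere).
WIKIDATA_PROPERTIES: Dict[str, str] = {
    "title": "P1476",             # title
    "inventory_number": "P217",   # inventory number
}


def normalize(source: str, raw: Dict[str, str]) -> Record:
    """Translate one raw source record into the common structure."""
    mapping = SOURCE_MAPPINGS[source]
    record = {common: raw[field] for field, common in mapping.items() if field in raw}
    record["source"] = source
    return record


def ingest(database: Dict[Tuple[str, str], Record], source: str,
           results: List[Dict[str, str]]) -> int:
    """Add search results that are not yet in the database; return how many were new."""
    added = 0
    for raw in results:
        record = normalize(source, raw)
        key = (source, record.get("inventory_number", ""))
        if key not in database:
            database[key] = record
            added += 1
    return added


def to_wikidata_statements(record: Record) -> Dict[str, str]:
    """Apply the single common-to-Wikidata mapping to a normalized record."""
    return {prop: record[field] for field, prop in WIKIDATA_PROPERTIES.items() if field in record}


if __name__ == "__main__":
    database: Dict[Tuple[str, str], Record] = {}
    ingest(database, "source_a", [{"title": "Vase", "maker": "Unknown", "accession": "1990.1"}])
    ingest(database, "source_b", [{"objectTitle": "Bowl", "objectNumber": "B-7"}])
    for record in database.values():
        print(record["source"], to_wikidata_statements(record))
```

The design point is that adding a new institution only requires a new entry in the per-source mapping; the Wikidata-facing mapping is written once against the common structure.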
Round 2 2021 decision
This project has not been selected for a Project Grant at this time.
We love that you took the chance to creatively improve the Wikimedia movement. The committee has reviewed this proposal and not recommended it for funding. This was a very competitive round with many good ideas, not all of which could be funded in spite of many merits. We appreciate your participation, and we hope you'll continue to stay engaged in the Wikimedia context.
Comments regarding this decision:
While the project has many worthwhile components that were appealing to the committee, and the project team includes excellent advisors, the MHz Foundation is new enough to experimentation with Wikimedia platforms, and working with Wikimedia communities, that we would like to see you start out with something smaller first.
Next steps: Applicants whose proposals are declined are welcome to consider resubmitting their applications in the future. You are welcome to request a consultation with staff to review any concerns with your proposal that contributed to a decline decision, and help you determine whether resubmission makes sense for your proposal.
Over the last year, the Wikimedia Foundation has been undergoing a community consultation process to launch a new grants strategy. Our proposed programs are posted on Meta here: Grants Strategy Relaunch 2020-2021. If you have suggestions about how we can improve our programs in the future, you can find information about how to give feedback here: Get involved. We are also currently seeking candidates to serve on regional grants committees and we'd appreciate it if you could help us spread the word to strong candidates--you can find out more here. We will launch our new programs in July 2021. If you are interested in submitting future proposals for funding, stay tuned to learn more about our future programs. Marti (WMF) (talk) 22:02, 27 May 2021 (UTC)