Grants:Project/John Cummings/Wikimedian in Residence at UNESCO 2017-2018/Midpoint
Welcome to this project's midpoint report! This report shares progress and learning from the grantee's first 3 months.
- 1 Summary
- 2 Methods and activities
- 2.1 Non public activities
- 2.2 New Dataset Imports space
- 2.3 Data Imported into Wikidata
- 2.4 Collaboration with Structured Data on Commons team
- 2.5 WikidataCon 2017, attendance and presentation
- 2.6 Wikimedia Conference
- 2.7 Wikipedia training for Swedish Government Delegation at UNESCO
- 2.8 Working with UNESCO staff to share knowledge
- 2.9 Documentation:
- 2.9.1 Wikidata Open Data Publishing
- 2.9.2 Requests for comment/Mapping and improving the data import process
- 2.9.3 Wikidata in Wikimedia Projects
- 2.9.4 An overview of Wikimedia and outlines areas of potential collaboration for the UN
- 2.9.5 Page design and formatting
- 2.9.6 Mix n' Match matching tips
- 2.9.7 Extracting Graphics from Open License Publications
- 2.10 Wiki4Women, International Women's Day event
- 3 Midpoint outcomes
- 4 Finances
- 5 Learning
- 6 Next steps and opportunities
- 7 Grantee reflection
Summary
We have spent the first part of the grant laying the groundwork for the second part. Activities include:
- Lots of documentation
- Creating a working space on Wikidata to collate and import datasets
- Importing data from UNESCO into Wikidata
- A large gender gap event
- Non public activities that will be shared with WMF staff
Methods and activities
Non public activities
Some outcomes cannot currently be made public; we have shared these with WMF grant staff.
New Dataset Imports space
Many organisations are interested in sharing their data with Wikidata but there was no clear path to do so. Additionally there was no way to understand what datasets exist on a topic, which of those have been imported into Wikidata, or how up to date previously imported datasets are.
Wikidata Dataset Imports is a first attempt at centralising the record of dataset imports from external sources into Wikidata. It’s hoped that this will address the problems listed above and help us move towards a more consistent approach to data import and synchronisation with external sources.
Wikidata Dataset Imports is the main starting point when a new dataset is being imported (or proposed for importing). A FormWizard form was set up to make the process as easy as possible, particularly for people who are unfamiliar with wikitext syntax.
A considerable amount of work went into the FormWizard component of this page. After the draft was created, Chris “Jethro” Schilling from the Wikimedia Foundation kindly donated a lot of his time as one of the few people in the community with knowledge of FormWizard. He completed all of the initial setup of the FormWizard configuration files and built a new template that could interface with FormWizard to automatically categorise the resulting pages. Chris also taught us how to use FormWizard ourselves over a series of Skype sessions. Following this, Nav was able to take over the ongoing improvement of the FormWizard template for the dataset import pages.
We’ve now completed all of the initial testing and the new pages are ready for use. They will of course need continual improvement based on user feedback.
Data Imported into Wikidata
- Matched countries and geographic regions from the data to Wikidata items, as needed for importing most subsequent UNESCO Institute for Statistics (UIS) data.
- 'Number of out of school children' for countries, continents and geographic regions, including historical values dating back to 1999 (2,033 statements in total; tweet with showcase queries).
- Total Fertility Rate from the UNESCO Institute for Statistics, covering most countries in the world from 1999 to 2014 (3,200 statements in total; tweet with showcase queries).
- Matched 11,000 publications from the Directory of Open Access Journals (DOAJ) dataset (including 7,700 manually matched by John with some community help in Mix'n'Match).
- Matched 5,000 publishers listed in the DOAJ to Wikidata items (3,600 manually matched in Mix'n'Match).
- Imported ISSN and/or EISSN identifier statements for all 11,000 journals listed in the DOAJ data (around 15,000 edits in total).
- Imported 10,700 licence statements for journal items (bubble chart showing CC licences used by journals listed in the DOAJ).
- Imported 5,000 country statements for the matched publishers listed in the DOAJ.
- Full list of imports and queries shown on the dataset summary sheet
Note: The vast majority of the data imports for this project will occur in the second half of the grant period. Initially much more effort has been put into the data import process, so that it can be tested and improved during the planned imports from the UNESCO Institute for Statistics and other UN agencies.
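The matching and import steps listed above can be sketched in a few lines of Python. This is a minimal illustration only, not the tooling used in the project: the lookup table, the `normalise` and `build_statements` helpers, the column names, the sample values and the `P0000` property placeholder are all assumptions made for the example.

```python
import csv
import io

# Hypothetical lookup built during the matching step: source name -> Wikidata item ID.
NAME_TO_ITEM = {"france": "Q142", "germany": "Q183"}

# Placeholder property ID for illustration, not a real Wikidata property.
PROPERTY_ID = "P0000"


def normalise(name):
    """Lower-case a name and collapse whitespace so near-identical spellings match."""
    return " ".join(name.lower().split())


def build_statements(rows, name_to_item, property_id):
    """Turn source rows into (item, property, value, year) tuples.

    Rows whose country name has no match are returned separately so they
    can be reviewed by hand instead of being silently dropped.
    """
    statements, unmatched = [], []
    for row in rows:
        item = name_to_item.get(normalise(row["country"]))
        if item is None:
            unmatched.append(row["country"])
            continue
        statements.append((item, property_id, float(row["value"]), int(row["year"])))
    return statements, unmatched


# Made-up sample data standing in for a source CSV file.
SAMPLE_CSV = """country,year,value
France,2014,1.5
 GERMANY ,2014,2.5
Atlantis,2014,3.0
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
statements, unmatched = build_statements(rows, NAME_TO_ITEM, PROPERTY_ID)
print(statements)
print(unmatched)
```

Keeping the unmatched names in a separate list mirrors the manual matching work described above: anything the lookup cannot resolve is set aside for a person to check.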
Collaboration with Structured Data on Commons team
We held initial discussions with Sandra about overlapping requirements between the data import process and the upcoming Structured Data on Commons (meeting notes).
WikidataCon 2017, attendance and presentation
Attended Wikimedia Conference and gave a presentation on Wikidata Dataset Imports. The conference provided many ideas and clarifications on the approach to, and process of, working with Wikidata. This led to the creation of the RFC on the data import process.
Wikipedia training for Swedish Government Delegation at UNESCO
I ran an afternoon workshop with the Swedish delegation to UNESCO after the Wiki4Women event to help them better understand Wikimedia projects.
Working with UNESCO staff to share knowledge
We are continuing to work with UNESCO staff to share their knowledge on Wikipedia by reusing existing UNESCO text from publications and the website. Having created the process, documentation and instructions to share open license content on English Wikipedia, we have worked with Wikimedia Argentina and Wikimedia France to translate them into Spanish and French.
We now have 255 pages which reuse open license text from UNESCO; these pages receive 4.3 million page views per month.
Documentation:
Wikidata Open Data Publishing
Many organisations are interested in sharing data on Wikidata and more widely with the public. Whilst there are many high quality individual resources for open data, there were no guides taking organisations through the whole user journey of understanding and publishing open data.
Requests for comment/Mapping and improving the data import process
Whilst we have achieved significant improvements with the documentation on Wikipedia, there is so much more to do that is outside the scope of this grant and the time available. Currently the Wikidata import process is a bit of a dark art, with many steps poorly understood outside of the group of people already doing it. To help collate existing resources and plan the new resources needed for different kinds of contributor, we started an RFC.
Wikidata in Wikimedia Projects
Some Wikimedia projects, especially English Wikipedia, are hostile towards the use of Wikidata on their projects. This is at least in part due to a lack of understanding of Wikidata. The page provides information to help avoid, improve or correct any issues and to address common misunderstandings or concerns about Wikidata. It will also provide an overview of what is and isn’t currently possible when reusing data from Wikidata on other Wikimedia projects, and of improvements planned for the future.
An overview of Wikimedia and outlines areas of potential collaboration for the UN
Several UN agencies are starting to release content under an open license and are showing interest in collaborating with the Wikimedia movement. There are several people from different chapters and user groups independently working with the UN. The page provides an overview of Wikimedia and outlines areas of potential collaboration between UN agencies and Wikimedia, with examples.
Page design and formatting
Over the past few years I’ve created many documentation pages which have used many existing styles from other resources; I’ve also created a few myself. There was no easy way to reuse the styles people had created. This page provides a structure to share formats, including where they are used and ‘blanks’ that people can reuse more easily. The page also has many of the resources needed to get started with the basics of creating more attractive pages, without needing to learn everything first.
Mix n' Match matching tips
Having worked on many dataset imports, it became clear that different people were matching inconsistently, in part because some instructions are ambiguous. These tips try to minimise these variations and errors.
Extracting Graphics from Open License Publications
Thousands of organisations produce publications under an open license but do not release the graphics contained within them as individual assets, meaning the end user must extract the graphics themselves and upload them to Commons. There was no documentation of this process.
Wiki4Women, International Women's Day event
On International Women's Day, UNESCO, Wikimedia France, Les sans pagEs and the Wikimedia Foundation collaborated on an event at UNESCO HQ in Paris.
- Over 200 participants, including ambassadors and the Director General of UNESCO, learned how to write Wikipedia articles and the importance of bridging the gender gap on Wikipedia.
- The Director General of UNESCO attended, wrote her first Wikipedia article (Yuhyun Park) and spoke on French TV about the event, the importance of Wikipedia and addressing the gender gap.
I created pages for online participation and worked with Women in Red to create some basic guidance for new editors on how to write articles about women (9 simple rules). The pages created for the event were designed to minimise the learning curve for new contributors and are available in the 6 UN languages.
The event was funded by the governments of Canada, Iceland and Sweden, as well as the European Union, Fondation CHANEL, Institut national de l'audiovisuel and France Médias Monde. This demonstrates that these organisations understand the value of addressing the gender gap and could possibly be approached for further funding in the future. The page design and guidance have been reused by Wikimedia UK in their 14-workshop collaboration with Amnesty International.
Midpoint outcomes
- Dataset Imports pages and documentation pages: People can understand and import data more easily and collaborate with others.
- Page design and formatting: Wikimedia contributors are able to build attractive pages more easily.
- Overview of Wikimedia for the UN: UN agencies have a resource to understand how to collaborate with Wikimedia.
- RFC: An outline of the direction for improving the process and documentation for Wikidata, a central place to plan improvements.
- Talking to the SDC team: An understanding of what the Structured Data on Commons team is doing and how this interacts with our goals.
- Data imports: A framework for importing data about Open Access by mapping all the OA journals and journal publishers.
- Conferences: More people know what we have been doing and we understand the wider context of our work better.
- UNESCO staff sharing knowledge: Building a body of work to show the benefit of using open license text on Wikipedia, especially for organisations considering adopting Open Access. Wikipedia has more information.
- Wiki4Women: 200+ people trained in how to contribute to Wikimedia projects. A clear indication from delegations that they are interested in working with Wikimedia projects.
Finances
We have spent our funds according to plan, apart from a delay to the schedule.
Learning
The best thing about trying something new is that you learn from it. We want to follow in your footsteps and learn along with you, and we want to know that you are taking enough risks to learn something really interesting! Please use the below sections to describe what is working and what you plan to change for the second half of your project.
What are the challenges
Many of the challenges of the project are not specific to our project, but more structural, affecting all contributors to Wikimedia projects, especially new contributors:
- Harassment of and rudeness towards new editors, with little recourse for those affected and little to discourage it.
- Poor quality instructions for the basic functions of Wikipedia and Wikidata.
- Misunderstanding of Conflict of Interest guidelines by experienced editors.
- Data added to Wikidata is often made incorrect by other contributors. There are no tools to monitor, maintain and correct errors, so the data added cannot be trusted.
- Most data imported into Wikidata is imported by individuals who are not communicating with the community about their efforts. There is no central place to track or discuss what has or hasn’t been done yet.
- The main tools used in the data import process have many problems that need fixing and lack functionality that needs adding. Certain types of data are not fully supported yet, so you need to resort to editing via the API.
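To make the monitoring gap in the list above more concrete, the kind of check that is missing can be sketched as a comparison between a source dataset and a snapshot of what is currently stored for the same items. This is a hypothetical sketch, not an existing tool: the `find_discrepancies` function, the `(item, year)` keys and all of the values are assumptions made for the example, and it assumes both snapshots have already been retrieved.

```python
def find_discrepancies(source, imported, tolerance=0.0):
    """Compare two {(item, year): value} mappings and report the differences.

    missing: present in the source but not in the imported snapshot
    changed: present in both, but the values differ by more than `tolerance`
    extra:   present in the imported snapshot but not in the source
    """
    missing = sorted(key for key in source if key not in imported)
    changed = sorted(
        key
        for key, value in source.items()
        if key in imported and abs(imported[key] - value) > tolerance
    )
    extra = sorted(key for key in imported if key not in source)
    return {"missing": missing, "changed": changed, "extra": extra}


# Made-up snapshots: what the data partner publishes vs. what is currently stored.
source = {("Q142", 2013): 10.0, ("Q142", 2014): 12.0, ("Q183", 2014): 7.0}
imported = {("Q142", 2013): 10.0, ("Q142", 2014): 21.0, ("Q17", 2014): 5.0}

report = find_discrepancies(source, imported)
for kind, keys in report.items():
    print(kind, keys)
```

A report like this would not decide which side is correct; it only narrows down where a person needs to look.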
What is working well
- UNESCO is very willing to run projects and promote open licensing.
- Access to many other UN agencies interested in open licensing and sharing content on Wikimedia.
- Help from many different chapters and user groups to work on projects.
- Appreciation for documentation produced.
- Gathering community feedback about issues with the data import process that need solving.
- New dataset imports area has made it significantly easier to add new import projects and monitor the status of existing data imports.
- Use of FormWizard forms and Visual Editor to reduce the need for advanced knowledge of wikitext when interacting with the data imports area. This process can be used as a proof of concept for other places on Wikimedia projects. (learning pattern coming soon)
Next steps and opportunities
- Presentation at Wikimania.
- Continue to work on documentation.
- Import more data from UN agencies.
- Continue to work with IGOs on adopting open licensing.
- Work on technical aspects of sharing content on Wikimedia projects.
- Develop WikiProject United Nations further to make it easier for people to find content from UN agencies.
- Continue to develop best practices for sharing content at UNESCO which can be adopted more widely.
- Write blog posts about the work done.
- Promote the Wiki Loves competitions.
Grantee reflection
There are many structural issues within Wikimedia that cause problems for many projects, not just ours. I am only able to mitigate some of these issues by using my social network within Wikimedia, something that isn’t available to many people, especially new people.
Documentation is poor in many areas which leads to many poor outcomes:
- New contributors do not understand what the rules are.
- New contributors lack direction.
- Potential new editors are dissuaded from contributing.
- Wikipedia editors have trouble understanding Wikidata, which significantly contributes to the general resistance to Wikidata-driven content in Wikipedia.
There is also a high potential for damage to partnerships through poor behaviour by community members, which is part of wider harassment, community health and kindness issues on some Wikimedia projects.
It will be difficult to form data partnerships until we have better systems in place for tracking completeness and/or ‘protecting’ data imported from a data partner. Many organisations won’t bother because of the difficulty in maintaining an up to date version. The proposed ‘signed statements’ feature on Wikidata will greatly help in this area once developed.
As all of the tools used by editors are created and maintained by volunteers, there is always a greater need for fixes and new features than can currently be supported by the developers. Help in maintaining these tools from Wikimedia Foundation developers or encouragement from WMF for wider community involvement would have a big impact on the effectiveness of these tools, and hence the number of contributions.
The Wikimedia Foundation was understanding and flexible about issues that arose in the logistics of setting up the transfer of the grant funds.