Grants:Project/EveryPolitician

status: under review
EveryPolitician
summary: EveryPolitician.org is a two year old initiative from mySociety with the aim of gathering together freely open, well structured, and consistent data on every national politician in the world - a toolset and platform that gathers and maintains over 3.6 million data points on almost 73,000 politicians in 233 countries and territories. We are proposing to transition this project over to Wikidata and in the process establish Wikidata as the authority control for coherent, consistent and linked national level political data.
target: This project will target political data on Wikidata as a whole with a particular focus on 30 to 40 target countries selected from our working shortlist at https://www.wikidata.org/wiki/Wikidata:EveryPolitician
type of grant: tools and software
amount: £40,000 GBP
type of applicant: organization
contact: mark.cridge@mysociety.org
created on: 10:59, 10 March 2017 (UTC)

Project idea

mySociety is the UK-based charity behind TheyWorkForYou.com, WriteToThem.com, WhatDoTheyKnow.com and FixMyStreet.com. We have been working with political data and running parliamentary services for almost 15 years and our services are used by over 10 million people each year in at least 44 countries. Currently our EveryPolitician project aims to gather free, structured and consistent data on every national politician in the world.
Our intent is to take this deep understanding of the structures and interrelationships of parliaments and politicians around the world and ensure that it is reflected to a much greater extent on Wikidata, working as part of the Wikidata community.
We'll then further help populate this data into Wikidata and the new structures we've created there, establishing Wikidata as the definitive source of political data globally.

What is the problem you're trying to solve?

What problem are you trying to solve by doing this project? This problem should be small enough that you expect it to be completely or mostly resolved by the end of this project. Remember to review the tutorial for tips on how to answer this question.

Data on politicians and governments in Wikidata, whilst extensive, is often incomplete and variable in quality. Without consistent and linked data it’s almost impossible to make practical use of the existing data for democratic and accountability initiatives. With its unparalleled reach and supportive community, Wikidata should in principle be the definitive source of political data globally, but from our analysis only about half the current national-level legislators in the world have Wikidata entries.

Coverage of individual constituencies/districts is substantially less than 50%, and coverage is very low for other core concepts of parliamentary data which don't lend themselves so neatly to Wikipedia pages. For example, we estimate there is only 10-20% coverage of legislative terms (such as start and end dates).

We have spent the past two years mapping, gathering and remapping just this political data for national level politicians in 233 countries around the world (we’re missing just 12 at the moment). With the dataset we've built up and the experience we have developed in this area we can work together to ensure Wikidata becomes the definitive source for consistent and comprehensive political data by integrating our EveryPolitician.org project with Wikidata.

What is your solution?

For the problem you identified in the previous section, briefly describe how you would like to address this problem. We recognize that there are many ways to solve a problem. We’d like to understand why you chose this particular solution, and why you think it is worth pursuing. Remember to review the tutorial for tips on how to answer this question.

Our Solution: Combine the huge amount of data gathered by EveryPolitician's catalogue of scrapers with the editing power of the Wikidata community.

We'll use the data and experience we've gathered through EveryPolitician to establish a complete set of hierarchies and relationships between politicians, legislatures, positions, etc in Wikidata in 30 to 40 priority countries. We already synchronise EveryPolitician data with Wikidata where it exists – currently around 37,000 politicians, 1,800 parties, 4,000 elections, 1,400 constituencies and 150 parliamentary terms are pulled into EveryPolitician from Wikidata, but we could do so much better than that by switching our effort to ensuring Wikidata is more up to date and accurate.


To help document and evolve a consistent data format that provides comprehensive coverage of politicians in Wikidata, we’ve already begun to map what exists and what needs to be updated here: https://www.wikidata.org/wiki/Wikidata:EveryPolitician

Combining EveryPolitician and Wikidata
Our current services comprise the EveryPolitician.org website; a network of over 1,000 scrapers gathering all of the data from a myriad of official sources and parliamentary sites; tools for using the data including library code in Ruby, Python and PHP; and a GitHub repository where we gather and categorise the multiple sources of data into a coherent and consistent whole. Significantly, EveryPolitician also provides the data in easy-to-use formats, JSON and CSV, to allow anyone who can use a spreadsheet to make use of the data.

Most of our scrapers revisit their sources once every 24 hours, resulting in daily updates. Changes are automatically highlighted and manual efforts can be focused on maintaining scrapers when elections occur (there’s an average of one national election every week around the world) and deepening the dataset to include more biographical, demographic and historical information.

Whilst we can't just import all of the EveryPolitician data into Wikidata, our experience of gathering and structuring this data over the last few years can help us work with the Wikidata community to establish better models for how political data can best be structured for both ease of entering and ease of querying, build reports that can show errors and omissions in the models and data, and use our scraper network to signpost and highlight changes that have not yet been reflected in Wikidata.

Project goals

What are your goals for this project? Your goals should describe the top two or three benefits that will come out of your project. These should be benefits to the Wikimedia projects or Wikimedia communities. They should not be benefits to you individually. Remember to review the tutorial for tips on how to answer this question.

  1. Wikidata will end up with free, open, well-structured and consistent data on politicians and legislatures, which is an essential ingredient for underpinning parliamentary monitoring services, campaigning, anti-corruption and digital democracy initiatives – a key requirement of good governance, transparency and accountability efforts.
  2. This project will improve authority control for political data on Wikidata, building on the two years of work to date on EveryPolitician, and enabling many more individuals and organisations to make use of the data.
  3. The key value to practitioners and researchers comes when the information is entered in a consistent enough manner that tools built to work with data for one country can be easily adapted for others, and simple multi-country analysis is possible without spending a long time adjusting scripts to cope with different modelling decisions in different countries. Tools and services built on Wikidata political data will better allow citizens to interact with their elected representatives. Citizens gain access, sometimes for the first time, to information and services that can help them do anything from writing to their MP to finding out how their representative voted on a bill that truly matters to them.

Project impact

How will you know if you have met your goals?

For each of your goals, we’d like you to answer the following questions:

  1. During your project, what will you do to achieve this goal? (These are your outputs.)
  2. Once your project is over, how will it continue to positively impact the Wikimedia community or projects? (These are your outcomes.)

For each of your answers, think about how you will capture this information. Will you capture it with a survey? With a story? Will you measure it with a number? Remember, if you plan to measure a number, you will need to set a numeric target in your proposal (e.g. 45 people, 10 articles, 100 scanned documents). Remember to review the tutorial for tips on how to answer this question.

  1. We will create a series of reports to highlight errors or omissions in existing political data within Wikidata. By showing at a glance which countries have no P194 or P1313 set (or multiple values, or obviously incorrect values), it will be much simpler for users to see data that they can add or fix. Or, by showing at a glance which positions are part of each country's cabinet (Q640506) during successive governments and who held those positions at the time, it becomes much more obvious where there are gaps or problems. We expect to add over 1000 such reports during the project, all of which will continue to be useful indefinitely.
  2. By the end of the project Wikidata will have well-structured, accurate, consistent and up-to-date information on all current (and where available a significant number of historic) legislators and cabinet-level officials, for a minimum of 30 to 40 countries, with the model in place to raise this to at least 70 to 80 countries in the following months through community support.
  3. As each country's dataset becomes more complete on Wikidata we will switch EveryPolitician to draw in its source from Wikidata rather than the scraper network. It will then be up to the community volunteers in each country to maintain the data and EveryPolitician becomes an easy to use front end for accessing and making use of that data.
  4. Using this approach, mySociety has already enabled new projects including Rada4You, a vote tracking site in Ukraine; and a campaign tool by Oxfam allowing people to easily write to their politicians about issues they care about. Our data has already helped existing projects such as Politwoops keep their lists of politicians’ profiles up to date. We expect to see many other existing third party democracy projects make use of the political data in Wikidata.
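
The gap reports described in the first output could be sketched roughly as follows. This is a minimal illustration in Python, not the project's actual tooling: the function names and the sample counts are assumptions made up for this example. P194 and P1313 come from the proposal; P31 (instance of) and Q6256 (country) are standard Wikidata identifiers added here so that the query has something to select on.

```python
# Properties named in the proposal, with their standard Wikidata labels.
REPORT_PROPERTIES = {
    "P194": "legislative body",
    "P1313": "office held by head of government",
}


def build_gap_query(property_id: str) -> str:
    """Build a SPARQL query counting values of one property per country.

    Assumption: countries are modelled as instances (P31) of Q6256 (country).
    The query is only built here, not sent anywhere; it could be pasted into
    the Wikidata Query Service.
    """
    return (
        "SELECT ?country (COUNT(?value) AS ?n) WHERE {\n"
        "  ?country wdt:P31 wd:Q6256 .\n"
        f"  OPTIONAL {{ ?country wdt:{property_id} ?value . }}\n"
        "}\n"
        "GROUP BY ?country"
    )


def classify(value_count: int) -> str:
    """Label a country by how many values it has for a property."""
    if value_count == 0:
        return "missing"
    if value_count == 1:
        return "ok"
    return "multiple"


def summarise(counts_by_country: dict) -> dict:
    """Group country identifiers by report status, in sorted order."""
    report = {"missing": [], "ok": [], "multiple": []}
    for country, n in sorted(counts_by_country.items()):
        report[classify(n)].append(country)
    return report


if __name__ == "__main__":
    print(build_gap_query("P194"))
    # Invented counts for three placeholder countries, for illustration only.
    print(summarise({"country-a": 0, "country-b": 1, "country-c": 3}))
```

Keeping the classification separate from the query means the same "missing / ok / multiple" view can be reused for any property that a report needs to cover.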

Do you have any goals around participation or content?

Are any of your goals related to increasing participation within the Wikimedia movement, or increasing/improving the content on Wikimedia projects? If so, we ask that you look through these three metrics, and include any that are relevant to your project. Please set a numeric target against the metrics, if applicable. Remember to review the tutorial for tips on how to answer this question.

Our primary task will be to establish the core political data for each national government and legislature, and content pages on each politician, for each of the target countries. This will include structured data on the parliaments, politicians, parties/factions, constituencies, roles, elections, and legislative terms. As a guide, within EveryPolitician we currently hold around 3.6m data points on over 73,000 politicians in 233 countries.

Aim: populate a complete dataset of current politicians and linked data in each of our 30 to 40 target countries – very roughly this would be about a fifth of current EveryPolitician data, so around 750,000 to 1,000,000 individual data points.

A key element of what we’ve learned from EveryPolitician is the value of working with partners and community members who have both local knowledge and the ability to source the multiple language, spelling and pronunciation variants for each politician. mySociety has built its international practice on working with in-country partners: we've developed and support digital services in 44 countries around the world, always working with a local NGO or campaigning organisation that actually runs the services day to day, whilst mySociety provides technical support and training.
A key element of success will be to help the Wikidata political data community and volunteers in each country get to the point where there is sufficient data to make it worthwhile to maintain, and to provide a steady stream of updates that makes it easier to identify what needs to change and when. For a country's data to be maintained and kept up to date we'll need to find local champions from within the community who have the knowledge, time and motivation to keep their country's data updated.

Aim: identify and support the key community volunteers in each of our 30 to 40 target countries. We expect this to be around 100 volunteers in total.

Project plan

Activities

Tell us how you'll carry out your project. What will you and other organizers spend your time doing? What will you have done at the end of your project? How will you follow-up with people that are involved with your project?

Our proposal involves four stages:

  1. Undertake preparations on Wikidata so that it can be ready to accept consistently structured data on parliaments, politicians, parties/factions, constituencies, roles, elections, and legislative terms, and similarly structure existing EveryPolitician data so that it can be entered into Wikidata in the correct way. This will involve generating reports across every country to show more clearly the data that already exists for each of these entities, and selecting the 30 to 40 priority countries from our initial list at https://www.wikidata.org/wiki/Wikidata:EveryPolitician
  2. Over a three to four month period work with the Wikidata community to populate the dataset drawing upon the identified sources from EveryPolitician, through a combination of group hackathon and individual contributor efforts, across our initial set of 30 to 40 priority countries.
  3. When complete this will allow Wikidata to establish authority control for the politician dataset and we will use our suite of parliamentary scrapers to monitor almost every national legislature in the world, but adjust these so that instead of feeding that data straight into EveryPolitician, they will report on any changes in a manner suitable for people with local knowledge to ensure that that information is correctly entered into Wikidata. We will further develop a wide range of queries and tools to determine (on an ongoing basis) gaps and inconsistencies in all the political data already in Wikidata, and help clean these up.
  4. Once complete we will redevelop EveryPolitician.org as a Wikidata-driven front-end to political data, showing off what is possible when this data is available, and consistently modelled, in a way that is difficult to do within Wikidata or even Wikipedia. From here we turn our attention to making use of the data through our own services and helping other people in more countries do the same.
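
The change reporting in stage 3, where scrapers flag differences for people with local knowledge to review instead of writing data directly, could be sketched along these lines. This is an illustrative Python example under stated assumptions: the record shape (a mapping from a person identifier to a dictionary of fields), the field names and the function name are all hypothetical and are not taken from the EveryPolitician codebase.

```python
def diff_memberships(scraped: dict, wikidata: dict) -> dict:
    """Compare a scraped snapshot with a Wikidata snapshot.

    Both arguments map a person identifier to a dict of fields
    (hypothetical shape, e.g. {"name": ..., "party": ...}).
    Nothing is written anywhere; the result is a report for human review.
    """
    scraped_ids = set(scraped)
    wikidata_ids = set(wikidata)

    changed = {}
    for person_id in sorted(scraped_ids & wikidata_ids):
        # Record (wikidata_value, scraped_value) for every differing field.
        differences = {
            field: (wikidata[person_id].get(field), value)
            for field, value in scraped[person_id].items()
            if wikidata[person_id].get(field) != value
        }
        if differences:
            changed[person_id] = differences

    return {
        # Seen at the official source but not yet on Wikidata.
        "missing_from_wikidata": sorted(scraped_ids - wikidata_ids),
        # On Wikidata but no longer listed by the official source.
        "not_in_source": sorted(wikidata_ids - scraped_ids),
        "changed": changed,
    }


if __name__ == "__main__":
    # Invented sample records, for illustration only.
    scraped = {
        "p1": {"name": "A", "party": "X"},
        "p2": {"name": "B", "party": "Y"},
    }
    wikidata = {
        "p1": {"name": "A", "party": "Z"},
        "p3": {"name": "C", "party": "X"},
    }
    print(diff_memberships(scraped, wikidata))
```

Reporting both the Wikidata value and the scraped value for each changed field leaves the decision about which one is correct to the reviewer.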

Budget

How will you use the funds you are requesting? List bullet points for each expense. (You can create a table later if needed.) Don’t forget to include a total amount, and update this amount in the Probox at the top of your page too!

In order to undertake this change of direction we will need additional short term financial support in order to employ a contract Delivery Manager based in the UK, Event Support both in the UK and some of our priority countries and additional editorial resource to help lead the community efforts in updating the data on Wikidata.

Our total project costs are around £120,000 GBP over the next 4 months to pay for our existing four person EveryPolitician team and support all of the new Wikidata transition effort. We have secured funding of £30,000 from one of our existing donors, we have £50,000 remaining from our existing core funding, and we are seeking a grant of £40,000 from Wikimedia to make up the shortfall.

(Note: following community feedback on this proposal, we've updated the budget tasks to better reflect that Wikimedia funds would only be spent on Wikidata tasks, not on external EveryPolitician tasks.)
The full budget for the whole project over the next four months is:

| Role/Cost Item | Function | Cost/Day (Unit) | Days (Units) | Total | Funder | Funder Total |
|---|---|---|---|---|---|---|
| Delivery Manager | Overall Project Management & Delivery | £400 | 75 | £30,000 | Match Funder | £30,000 |
| Senior Developer | Retooling EveryPolitician software to feed into Wikidata | £360 | 50 | £18,000 | mySociety | |
| Developer | Retooling EveryPolitician software to feed into Wikidata | £260 | 50 | £13,000 | mySociety | |
| Junior Developer | Retooling EveryPolitician software to feed into Wikidata | £150 | 50 | £7,500 | mySociety | |
| Partnerships Manager | Ongoing project support and development | £260 | 50 | £13,000 | mySociety | £51,500 |
| Senior Developer | Establish, document and apply model for structuring political data on Wikidata | £360 | 25 | £9,000 | Wikimedia | |
| Developer | Establish, document and apply model for structuring political data on Wikidata | £260 | 25 | £6,500 | Wikimedia | |
| Junior Developer | Documenting data structures and other support tasks on Wikidata | £150 | 25 | £3,750 | Wikimedia | |
| Community Manager | Organising community events and sharing project activity | £240 | 20 | £4,800 | Wikimedia | |
| UK Community Events | Edit-a-thons and other gatherings to populate country structures + data | £2,000 | 3 | £6,000 | Wikimedia | |
| International Community Events | Support for community events in priority countries | £1,000 | 10 | £10,000 | Wikimedia | £40,050 |
| Total | | | | £121,550 | | |
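
The budget lines can be cross-checked with a few lines of arithmetic. The Python sketch below is only an illustration: the unit costs, units and funders are copied from the budget, while the shortened row labels and the function name are made up for this example.

```python
# (item, cost per day or unit in GBP, days or units, funder)
BUDGET = [
    ("Delivery Manager", 400, 75, "Match Funder"),
    ("Senior Developer (retooling)", 360, 50, "mySociety"),
    ("Developer (retooling)", 260, 50, "mySociety"),
    ("Junior Developer (retooling)", 150, 50, "mySociety"),
    ("Partnerships Manager", 260, 50, "mySociety"),
    ("Senior Developer (Wikidata model)", 360, 25, "Wikimedia"),
    ("Developer (Wikidata model)", 260, 25, "Wikimedia"),
    ("Junior Developer (Wikidata docs)", 150, 25, "Wikimedia"),
    ("Community Manager", 240, 20, "Wikimedia"),
    ("UK Community Events", 2000, 3, "Wikimedia"),
    ("International Community Events", 1000, 10, "Wikimedia"),
]


def totals_by_funder(rows):
    """Sum unit cost times units for each funder."""
    totals = {}
    for _item, unit_cost, units, funder in rows:
        totals[funder] = totals.get(funder, 0) + unit_cost * units
    return totals


if __name__ == "__main__":
    totals = totals_by_funder(BUDGET)
    for funder, amount in totals.items():
        print(f"{funder}: £{amount:,}")
    print(f"Total: £{sum(totals.values()):,}")
```

The per-funder sums come to £30,000, £51,500 and £40,050, giving the stated total of £121,550, which is slightly above the rounded £120,000 figure used in the text.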

Community engagement

Community input and participation helps make projects successful. How will you let others in your community know about your project? Why are you targeting a specific audience? How will you engage the community you’re aiming to serve during your project?

This project will only be successful in the long run with the active involvement and support from Wikidata community members in each of the priority countries.
mySociety can help by transferring our knowledge of legislature and political data structures over to Wikidata from our work on EveryPolitician, but we will need to support and work with people with local knowledge, experience and judgement.

Working as part of the Wikidata political data community in each country
We will work as part of the Wikidata political data community on best practices for modelling not just politicians but all the surrounding concepts that make this data valuable: legislative terms, constituencies (both geographic and conceptual), political parties and factions, elections, elected vs appointed roles, cabinet and other executive branch positions, etc. across the wide range of different systems that make up the world's national legislatures (based on work of doing exactly that over the last 2 years with EveryPolitician).

It is our hope and assumption that once data is essentially "complete" in many countries, the existing Wikipedia/Wikidata communities of those countries will have significantly more incentive not only to use the information, but to keep it up to date. Getting this going organically in more than a handful of countries will require quite a lot of effort, which is why we’ll prioritise countries with a strong political data community. We intend this initiative to give enough of a kickstart to keep everything up to date in the long run.

Finally, we'll also encourage the hundreds of people from our existing EveryPolitician user community to start using Wikidata directly. This will help make the most of the relationships we have already built with people who are expert in the use of such political data in their own fields, and bring more people like ourselves into the Wikidata project. This includes developers, campaigners and academics across our network.

Get involved

Participants

Please use this section to tell us more about who is working on this project. For each member of the team, please describe any project-related skills, experience, or other background you have that might help contribute to making this idea a success.

mySociety - https://en.wikipedia.org/wiki/MySociety
mySociety is a UK-based charity that invents, and popularises, digital tools and services that enable citizens around the world to exert power over institutions and decision makers.
Founded in 2003, we are recognised as a global pioneer in the use of the internet to help citizens perform civic and democratic tasks - helping establish the Civic Technology sector in the process. We operate across four practice areas: Better Cities, Freedom of Information, Democracy, and Research to determine what creates impact and why.
Today we are one of the most active civic technology organisations in the world, supporting the work of partners in 44 countries, and operating services used by more than 10 million people every year.

Tony Bowden - https://www.wikidata.org/wiki/User:Oravrattas
Tony has been with mySociety since 2009. He’s on the road a lot of the time, meeting with groups around the world on our behalf but for the last two years he’s been travelling a little less, because he’s busy leading the EveryPolitician project. EveryPolitician is Tony's brainchild coming from his deep understanding that it's impossible to create services to hold politicians to account if you're not starting with good quality, consistent data.
Before joining mySociety, Tony founded Blackstar, one of the UK's earliest online retailers; built an email search tool that the launch of Gmail made obsolete almost overnight; helped turn around Ireland’s oldest ISP in part by rewriting most of their systems to use Semantic MediaWiki; and worked on merging wikis with spreadsheets at Socialtext.

Chris Mytton - https://www.wikidata.org/wiki/User:Chrismytton
Chris joined mySociety as a developer in April 2013. He’s contributed to the Pombola and FixMyStreet projects, and is currently part of the EveryPolitician team.
Before joining mySociety, Chris worked as a freelancer and for various web development agencies.

Dave Whiteland - https://www.wikidata.org/wiki/User:Beholderstories
Dave joined mySociety as a developer. Now he works on the EveryPolitician project, helping people use the data, and wrangling the EveryPolitician bot.
As a programmer, he has cut code across a range of our projects, including FixMyStreet, ePetitions, and PledgeBank. He’s also done plenty of work on the Alaveteli and FixMyStreet documentation. For several years he was part of our international team, travelling widely to meet organisations and help them set up websites using our code.

Oliver Denman - https://www.wikidata.org/wiki/User:ODenman
Oliver joined mySociety in 2016, as a Junior Developer on the EveryPolitician team. He’ll be helping to write the code that enables us to provide useful, structured politician data for activists, developers and researchers around the world.

Mark Cridge - https://www.wikidata.org/wiki/User:Markcridge
Mark is Chief Executive of mySociety. He has enjoyed a diverse 20-year digital career including stints as COO at BERG, the technology and design consultancy, and as a senior advisor at Blue State Digital in London. He got his start in 1996 working for a small web design agency in Birmingham before setting up glue London, a digital advertising agency in 1999, going on to become global managing director of Isobar, following glue’s acquisition in 2005.

Community notification

Please paste links below to where relevant communities have been notified of your proposal, and to any other relevant community discussions. You are responsible for notifying relevant communities of your proposal, so that they can help you! Depending on your project, notification may be most appropriate on a Village Pump, talk page, mailing list, etc. Need notification tips?

Endorsements

Do you think this project should be selected for a Project Grant? Please add your name and rationale for endorsing this project below! (Other constructive feedback is welcome on the discussion page).

  • I'm part of the EveryPolitician team so I know how much this work would benefit from being opened out to the whole wikidata community — prior to EveryPolitician I've worked on mySociety parliamentary monitoring projects around the world and have first hand experience of what a costly barrier simply getting basic political data can be. Beholderstories (talk) 15:40, 13 March 2017 (UTC)
  • I've worked with Tony Bowden on some of the initial Wikidata modelling for politicians. It's clear that there is a lot of work to be done but also a lot of potential value to be opened up here - this basic data can provide an important infrastructure for building all sorts of useful things. There are already a few small WD projects working at various aspects of this but having someone able to devote the time and energy to pulling this work together, to establish a system for bringing data in from external services, and to help give the community a high-level overview of what's available, would be invaluable.
The fact that MySociety are able to put up about two thirds of the cost of this suggests that it would be a fairly productive grant in cash terms. From a sustainability perspective, this also offers a way to engage new audiences with Wikidata and hopefully gain more motivated contributors and maintainers from the existing digital civic society communities. Andrew Gray (talk) 21:41, 13 March 2017 (UTC)
  • Easy access to consistent data and structured data on politicians would be very helpful 82.4.191.3 02:36, 14 March 2017 (UTC)
  • Reputable organisation doing vital work, and a good partner for Wikimedia projects. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:25, 14 March 2017 (UTC)
  • We at Factmata are also interested in tracking political promises and having a dataset of comments and claims by politicians we can a) track and b) automatically fact check. Any project to make this data more linked and structured will add value to all efforts. Dhruvghulati (talk) 11:10, 14 March 2017 (UTC)
  • Essential work. Great organisation as well. EdSaperia (talk) 12:06, 14 March 2017 (UTC)
  • What a great idea! MySociety have a strong reputation in the OpenData arena and this gets my support no end. CalzGuy (talk) 12:25, 14 March 2017 (UTC)
  • No brainer. As users of this data this is something https://represent.me would hugely support, too. eddowding 14:59, 14 March 2017 (UTC)
  • mySociety are a great provider of services, their reputation in the civic tech community is second to none. Definitely a good partner for Wikimedia. I run Parli-n-grams which uses their data sources heavily, and which would strongly benefit of this proposal. Giuseppe Sollazzo (Talk to Giuseppe) 15:21, 14 March 2017 (UTC)
  • Indigo Trust have supported MySociety with grants since 2009 and have also supported Wikimedia Foundation. Indigo funds transparency projects in sub-Saharan Africa, where we have developed the concept of the 'Accountability Stack' - the basic information that is necessary for a functioning civil society. A critical part of that stack is widespread public knowledge of who the politicians are - even in the UK this can be hard to find out. EveryPolitician would fill a conspicuous gap. Indigo can attest to MySociety's excellent project delivery track record and efficient use of funds. We commend this proposal. The preceding unsigned comment was added by Wililamperrin (talk • contribs) 16:19, 14 March 2017 (UTC)
  • I was involved in the setup of Wikidata, and this proposal is very much why we wanted to have a global, free knowledge base that everyone can edit. And I do know MySociety, and they are a reliable partner who will deliver what they promised in this proposal. Pavel Richter
  • mySociety are very efficient at this sort of project and it will certainly have widespread use. Francis Davey (talk) 20:36, 14 March 2017 (UTC)
  • This will give Wikidata a huge boost in an area of great importance to all of us! NavinoEvans (talk) 22:00, 14 March 2017 (UTC)
  • I'd like to support this project from the Wikidata dev team side. The kind of data being talked about is highly relevant and something we get asked about regularly. We were even thinking about pushing more in this direction over the next year with data partnerships. I am very happy to see that the proposers are already working on Wikidata and seem to have a good understanding of how it works and how they need to be involved in the community. The use of the data in MySociety's projects makes me hopeful that the data will not just be dumped and rot but actually used and maintained in the long run. --Lydia Pintscher (WMDE) (talk) 10:55, 16 March 2017 (UTC)
  • This seems to be exactly the sort of useful thing Wikidata is meant for. MySociety have a long track record of doing good things with data, and this is only one part of the funding so the risk of simply wasting money is low. Neonchameleon (talk) 15:18, 17 March 2017 (UTC)
  • I'm working on a similar project on a much smaller scale (Dicare about French lower house parliamentarians). It seems that mySociety can have the resources to start a global and collaborative work in this domain, that would certainly be of benefit to Wikidata and all Wikimedia projects. -- Envlh (talk) 18:53, 21 March 2017 (UTC)
  • The proposal matches the WM charter and fits with a practice of importing resources from collaborative organizations. Putting the information in WD makes it more available to the public and could have interesting but unforeseen possibilities. Glrx (talk) 19:22, 21 March 2017 (UTC)
  • A very interesting proposal which could be used as a model for other organisations John Cummings (talk) 21:35, 11 April 2017 (UTC)