How to find the documentation?
@Markcridge: this is a very interesting report, and I like the approach of "use case driven" documentation, but it's not obvious how I can find that. Could you add a link to the documentation, either here, or ideally within the report? --NF Ford (talk) 07:12, 21 December 2018 (UTC)
- From your new grant proposal, I was able to find the project page for Scenario-based documentation relating to political data in Wikidata. However that only contains two entries, and one of those dates from before your grant. Is the other documentation you produced not yet linked from here, or has it not been written yet? --NF Ford (talk) 07:49, 22 December 2018 (UTC)
- @NF Ford: glad you find the report interesting and thanks for the questions on the documentation. You are correct that the existing documentation that you found predates the report. We actually have a cache of up-to-date documentation being prepared for early January - this has taken quite a bit longer than we had hoped. With hindsight it would have made more sense for us to release this on a regular basis throughout the project, but we will try to make up for that now. There is also a second release of documentation around the Verification Tool which will be ready once the final development on the tool is complete – we'll pick that up after the holidays. --Markcridge (talk) 12:56, 23 December 2018 (UTC)
What happened with Automated Prompts?
@Markcridge: Your original proposal had three main strands, one of which was Automated Prompts. Your timeline shows initial work beginning on the proof of concept for these in July 2017 and the first prompts entering testing in August 2017. By September 2017 they were undergoing heavy testing to generate plenty of examples, and in your Midpoint Report that month you said they were "expected for wider roll-out imminently".
I can find no information about them after then. None of your three extension requests mentions them, and your Final Report gives no information about them at all, almost like they had never existed.
- @NF Ford: Thanks for your question. Markcridge is away for a couple more days, so perhaps I can answer in the meantime. Since the midpoint report, we’ve made a couple of tweaks to the prompter tool (for example to add a bot flag), and it’s listed in the tools section of our final grant report and on the [everypolitician page]. We think it’s a useful generic tool for monitoring the differences between data on Wikidata and an external source. However, we shifted active development to the verification pages tool, which builds on the same concepts (monitoring a CSV-format output from some external source and comparing it to data that already exists in Wikidata) but uses domain-specific logic (e.g. about the structure of P39 statements) to resolve the two big problems that we found with prompts in the context of political position holder data:
- From a prompt, you can see a difference between the source csv and the data in Wikidata, but there's no way to take an action to fix it. The items to action on verification-pages are updated every day from the data source, so serve the same function as prompts in that respect, but in addition they give the user simple options for acting on each difference to update the data in Wikidata.
- Prompts don't work well when you don't have a common ID field to match on - you end up with lots of spurious differences which might be inconsequential changes in names, say - verification-pages get around this in the specific case of political membership data by getting you to reconcile the person with their Wikidata item. --Louisecrow (talk) 13:34, 2 January 2019 (UTC)
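The two problems described above can be illustrated with a minimal sketch. This is not the prompter or verification-pages code; the function names, column names, and sample rows are all hypothetical, and both sides are stood in for by small CSV strings rather than live Wikidata queries. It compares a source CSV against existing data, first matching on a shared ID column and then, assuming no shared ID is available, matching on names.

```python
import csv
import io


def load_rows(csv_text, key):
    """Index the rows of a CSV string by the given column (hypothetical helper)."""
    return {row[key]: row for row in csv.DictReader(io.StringIO(csv_text))}


def diff_sources(source, wikidata):
    """Report keys missing on either side and fields whose values differ."""
    report = {"only_in_source": [], "only_in_wikidata": [], "changed": []}
    for key in sorted(source.keys() | wikidata.keys()):
        if key not in wikidata:
            report["only_in_source"].append(key)
        elif key not in source:
            report["only_in_wikidata"].append(key)
        else:
            # Compare only the columns that both sides have.
            for field in sorted(source[key].keys() & wikidata[key].keys()):
                if source[key][field] != wikidata[key][field]:
                    report["changed"].append((key, field))
    return report


# Invented sample data: the same two people, one with a differently spelled name.
SOURCE_CSV = "id,name,party\nQ1,Ann Lee,Green\nQ2,Robert Ray,Labour\n"
WIKIDATA_CSV = "id,name,party\nQ1,Ann Lee,Green\nQ2,Bob Ray,Labour\n"

# Matching on a shared ID: one person, one changed field.
by_id = diff_sources(load_rows(SOURCE_CSV, "id"), load_rows(WIKIDATA_CSV, "id"))

# Matching on name only: the same person shows up as one removal and one addition.
by_name = diff_sources(load_rows(SOURCE_CSV, "name"), load_rows(WIKIDATA_CSV, "name"))

print(by_id)
print(by_name)
```

With the shared ID, the name change is reported as a single changed field on one record. Without it, the same change appears as two unrelated differences, which is the kind of spurious result that reconciling each person with their Wikidata item is meant to avoid. The sketch only reports differences; like a prompt, it offers no way to act on them.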
- @Louisecrow: thank you for the extra explanation of your experience with prompts. It sounds to me like there was a lot of useful learning here, and it would be a pity if that ended up buried only in these comments. Could you add these points to the final report itself, and perhaps into a Learning Pattern or two? Other projects also have reconciliation issues, so I suspect there's valuable material to share from your experience.
- On top of the tool itself, one of your four Project plan activities was to:
- "Adapt our existing suite of parliamentary scrapers, which monitor almost every national legislature in the world, so that instead of feeding data straight into EveryPolitician, they will prompt human editors to ensure that that information is correctly entered into Wikidata"
- But I can't see anything in your final report about how you got on with that goal. How many of these were successfully migrated? --NF Ford (talk) 04:00, 3 January 2019 (UTC)
- @NF Ford: To add to Louise's comments, and in response to your query on the project plan activities: we will add what material we can on further learning, but this may come out in parts as we have time available. We're starting to switch the scrapers that populate EveryPolitician.org over the next few months - however that is out of scope of the Wikimedia Grant which this report is specifically for. It is part of our wider work which the project plan refers to, which is funded from a handful of other grant sources. The transfer to using Wikidata as a source for EveryPolitician.org is something that we'll continue to update over the next couple of years as national level main chamber politician data becomes available on Wikidata for each country and as we have the time to tackle this. --Markcridge (talk) 14:24, 7 January 2019 (UTC)
- @Markcridge: Your report has a large banner at the top stating that it is still under review, and seeking comments, responses, and questions. So I had been assuming that you would still want to update the content, especially where you may have overlooked details. A key part of these reports is to share knowledge learned, for the benefit of the community. The "What went well" section, specifically, should take the form of Learning Pattern articles, but currently you only have an ad for a tool you have been building (even though it still isn't even finished), rather than anything applicable to others. But you do seem to have learned lots of useful things you could turn into interesting Learning Patterns.
- Re your point about the scrapers being out of scope of the grant, the Activities section of your proposal listed four stages to the project:
- Building reports
- Populating data
- Migrating your existing scrapers to help Wikidata editors
- Switching EveryPolitician.org to be Wikidata-driven
- @NF Ford: I agree we would like to add more detail on what we've learned. And I hope we've been able to give you a sense of the timescale involved in that. Our current focus is adding more general documentation to the Wikiproject Every Politician page over the next few weeks. The work for parts 3 & 4, to switch scrapers to Wikidata and ultimately switch EveryPolitician.org to be Wikidata-driven, is the work of many more years. The way we have written this report is to focus on the activities that were paid for specifically by our grant from Wikimedia, and to give a broad overview of the much larger portion of work that we are involved in beyond this grant. We will continue to contribute to Wikidata beyond the scope of this grant because we think Wikidata is a better place to add this type of open data than our own projects alone: it will be of benefit to more people and we can pool our efforts with other members of the community. We consider the work done under this grant complete; however, we agree it would benefit from additional documentation. The best way to do that is to actually share more documentation on the Wikiproject, which we'll try to do as soon as we practically can. --Markcridge (talk) 08:08, 8 January 2019 (UTC)
- @NF Ford: To clear that up: the direction of the bigger 'project' remains the same but, to reiterate, the tasks involved in completing stages #3 and #4 will take a lot longer. They were never going to be completed under the tasks we grouped under the Wikimedia grant which this report refers to. They will be funded by other sources as this is work off of Wikidata itself. For the avoidance of doubt, the Wikimedia grant we received was only spent on Wikidata specific tasks as outlined in our project budget on [] which included: Establish, document and apply model for structuring political data on Wikidata, Documenting data structures and other support tasks on Wikidata, Organising community events and sharing project activity, Edit-a-thon's and other gatherings to populate country structures + data, Support for community events in priority countries. So the bigger 'project' continues via other means, but for the purposes of this grant we need to report how we spent that money, which I think is what we've done. I hope that clears up any confusion. --Markcridge (talk) 09:31, 8 January 2019 (UTC)
- @Markcridge: Your original grant proposal was very explicit about the entire developer budget requested from Wikimedia being for "Retooling scrapers to feed into Wikidata instead of Every Politican".
- You then widened that slightly to "retooling EveryPolitician software for Wikidata", explained by Oravrattas as doing "more than just rewriting scrapers, as that's just one part of what we'll be doing".
- After community feedback you then switched the self-funded and WMF-funded parts of the project around. However, you were still very clear that everything (including retooling the scrapers) would happen during the four months the project was originally supposed to take. You did mention that other things would only happen in the longer term, beyond the timeframe of the grant, but making your scrapers useful to Wikidata editors was not one of these.
- I have already quoted your Activities section ("Adapt our existing suite of parliamentary scrapers, which monitor almost every national legislature in the world, so that instead of feeding data straight into EveryPolitician, they will prompt human editors to ensure that that information is correctly entered into Wikidata"), which gives no indication that these stages might only happen years later. Other sections of the proposal were also clear that scraper-fed prompts were part of the immediate plan:
- Your Solution listed as one of the three approaches: "To help keep the data up to date, we plan to generate automated prompts to notify users of Wikidata when their input is needed. Much of the underlying work for this has already been done: we already maintain a network of over 1,000 scrapers gathering data from a myriad of official sources and parliamentary sites. From these we will signpost and highlight changes in a user-friendly format when information is changed which would need to be updated in Wikidata", along with emphasising that "All three approaches are required".
- Your participation goals described the prompts as critically important: "A key element of success will be ... a steady stream of updates that make it easier to identify what needs to change and when"
- Your budget included £38,500 "over the next four months" for "retooling EveryPolitician software to feed into Wikidata", which we've seen above explicitly included migrating the scrapers to prompts.
- You did indeed only seek partial funding from WMF in your proposal. But the plan was very clearly presented as a coherent whole, and the comments, discussion, and community support were based on the full picture you painted of your project goals. So I do not believe it is reasonable to claim at the end that you only need to report on the parts that were directly funded by WMF.
- The most useful reports share what grant recipients learned over the course of their project. And that is especially valuable where someone was unable to achieve everything they hoped for. Louisecrow has helpfully listed above the big problems you faced with prompts. These seem reasonable to me, and useful to include in the report. And it seems equally reasonable that these problems meant you didn't migrate all the scrapers you originally planned. And it also seems reasonable that you had to ask for grant extensions multiple times as things were taking longer than hoped. But I find it very hard to believe your claim now that you never even intended to migrate any scrapers until much later still, or that we cannot ask questions about this part of the project.
- When you say that you will be "starting to switch the scrapers that populate EveryPolitician.org over the next few months", do you mean that you did not actually do any of Stage #3 at all during the grant period? If this is work still to be funded by other sources, how did you spend the developer funds allocated to this task in the funded proposal? --NF Ford (talk) 03:55, 9 January 2019 (UTC)
- Hello @NF Ford:. This specific report is on the relevant activities for how we spent the Wikimedia Foundation Grant – providing this is a normal requirement of any grant we receive at mySociety as a charitably funded organisation. I think I've made that clear in my answers and the detail is outlined in the report and the budget on the grant page quite clearly. The WMF grant monies were wholly spent on the parts of the project to support Wikidata; as I listed previously, this included community events, tooling for the verification tools, initial prompts work and so on. Other than documentation, which we're completing in our own time at the moment, all the rest of the work in the project is happening outside of Wikidata. It is not using WMF funds, and we're doing that at our own pace as we can. The WMF grant was only ever going to pay for a portion of the overall project. I don't think I have much more to say on this that would make the situation any clearer than it is now. The WMF grant was a welcome and very useful contribution towards this overall work, and our budget included other grants from Indigo Foundation and other funds we had available. We've been clear and precise about how the grant was spent and we've been clear on what work we'd still like to do and contribute to beyond any WMF funding from our own time and resources. --Markcridge (talk) 07:37, 9 January 2019 (UTC)
- Just a quick comment to agree with NF Ford that their questions help understand and improve the report and the project itself. Thank you! Nemo 09:40, 8 January 2019 (UTC)
@Mjohnson (WMF) and Jtud (WMF): Are WMF happy for mySociety to refuse to answer questions about the parts of the project that were not directly grant funded? It seems to me that when a proposal says "If WMF will fund us to do A, B, and C, we will also be able to do X, Y, and Z through other funding sources", they have still submitted a single project, all of which should be reported on as a whole, and all of which should be open to scrutiny and comment. If the grantee only needs to report on A, B, and C, and never answer questions on X, Y, and Z, or even do any of X, Y, and Z, then that seems to go against all principles of openness, and offer too many opportunities for bad-faith proposals. Each grant round seems to include a few co-funded proposals, so my question extends beyond one single report (though I have not been able to find any other similar case yet where a grantee has refused to discuss the parts funded through other sources). --NF Ford (talk) 06:32, 15 January 2019 (UTC)
In your report you claim "The most important thing we achieved was to encourage more of our peers within the ‘Civic Tech’ and political transparency communities to actively become members of the Wikidata community".
When pushed for further details at your request for a new grant, you listed, as examples, OpenUp in South Africa and Code for Pakistan. You were asked for, but have so far refused to provide, any further examples.
When it was also pointed out that OpenUp have stated that they were paid for their work and you were asked whether that was also true for the other groups, you also did not answer.
Similarly, when asked for clarification on how much of the data generated was added directly by mySociety staff, you have again been silent.
I believe these questions to be important not only for your new grant, so I am asking them again here. Of the 39 countries where you say you hit your data targets, which involved data entry by mySociety staff or other groups or individuals paid by mySociety? --NF Ford (talk) 05:15, 23 January 2019 (UTC)