What is the problem you're trying to solve?
Wikipedia has an editor problem. Compared with the first half of its history, Wikipedia has gained fewer editors, and new editors have faced a higher threshold to get their contributions to "stick" on articles. It does not, however, have an edit problem. Editors have increasingly turned to semi-automated tools to maintain or improve articles; in many cases these tools are addictive and fun, allowing editors to make small edits to articles at a rapid pace.
Many of these edits comprise what we might consider "low-hanging fruit" for new editors: fixing typos or references, reverting vandalism, or making similarly minor changes. Among those edits, a large portion are performed via semi-automated tools. These tools are almost exclusively kept from new editors for a variety of reasons, chiefly their potential to allow for rapid edits across a large number of pages. Their very existence is also a secret to many readers and editors, not out of any attempt to hide the tools themselves but because of the significant knowledge of the project and affiliated technical resources required to use some of them.
That's a shame, because the tools themselves and the edits they afford are among the most "game like" elements of the Wikipedia editing experience: they allow editors to get into the "zone" and experience flow, offer structured goals and manageable milestones, and potentially set up opportunities for immediate feedback. And although we don't have a complete theory of what makes elements of Wikipedia game-like or fun, I feel the popularity of these tools among long-term editors is a fairly strong sign that something about them is fun and rewarding.
What is your solution?
In order to explore the game-like elements of these tools, we need to do a few things:
- Decouple use of the tools from intimate project/technical knowledge. As a practical matter, Wikipedia has probably already reached saturation among potential editors who have the time and technical expertise to operate these tools independently. AWB seems like fun, but it is not enough to tell new editors (even if it were technically possible) to try it out and see for themselves.
- Break down the decisions required to use these tools into smaller chunks. Tools like Huggle, HotCat or AWB offer a baffling array of editing options, many of which are (probably) not of interest to a new or marginally attached editor, or which require deep knowledge of the project not immediately apparent to readers.
- Make the use of a subset of these tools productive for editors who want an ephemeral experience and safe/useful for long-term editors watching over a number of articles. Semi-automated tools are restricted to certain editors because computers are bad at many things and humans are bad at judging the output of a computer. :) Further, a tool that allows new editors to commit semi-automated changes would require at least some buy-in from the community, and we would first have to convince them it would do no harm.
- Give editors an objective function that isn't merely "minimize the time to make a decision". Ideally editors should be nudged toward the "right" answer, but that is not always a solved problem.
- Encourage readers to become editors by allowing small, manageable contributions to be made in a fun and tactile manner on a mobile device.
- Expose editors to the wide range of community-created tools, many of which represent the most "game like" behaviors on Wikipedia.
- Build a generator or generators for streams of potential edits
- Tools like Huggle consume the recent changes feed (among other sources) for edits which may be vandalism. STiki uses algorithms to produce streams of potential vandalism by observing metadata. Similar streams could be built using non-vandalism sources, e.g. categorization suggestions for recent articles, typo correction suggestions for bot-selected articles, or a number of other sources built by either running semi-automated tools on a Labs server or building equivalent tools to search for potential edits. By splitting the generation of edit streams off from the app itself, we can leverage a lot of existing work in semi-automated tools (some of which have been in development for a long time and are quite robust), and by recording agreement (possibly also noting where changes have been reverted) we can provide real feedback to tool authors on the effectiveness of individual tool components.
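To make the generator idea concrete, here is a minimal sketch of one such stream: a toy typo-fixing generator that emits standardized suggested-edit records. The record fields (stream name, article title, base revision id, before/after snippets, summary) are an assumed format for illustration, not an existing spec, and a real generator would draw on AWB-style typo lists or the recent changes feed rather than a single regex.

```python
# Hypothetical stream generator: scans article text for one common typo
# and yields standardized suggested-edit records. The record format is an
# assumption for illustration, not an existing specification.

import re
from dataclasses import dataclass

# A single regex-based fix; real tools (AWB typo lists, STiki) are far richer.
TYPO = re.compile(r"\bteh\b")

@dataclass
class SuggestedEdit:
    stream: str       # which generator produced this suggestion
    title: str        # article title
    base_revid: int   # revision the suggestion was computed against
    old_text: str     # snippet before the change
    new_text: str     # snippet after the change
    summary: str      # proposed edit summary

def typo_stream(pages):
    """Yield one SuggestedEdit per typo found in (title, revid, text) tuples."""
    for title, revid, text in pages:
        for match in TYPO.finditer(text):
            snippet = text[max(0, match.start() - 20):match.end() + 20]
            yield SuggestedEdit(
                stream="typo/teh",
                title=title,
                base_revid=revid,
                old_text=snippet,
                new_text=snippet.replace("teh", "the"),
                summary='Typo fix: "teh" -> "the"',
            )

# Example: feed the generator a fake page instead of a live feed.
edits = list(typo_stream([("Example", 12345, "Some text with teh typo.")]))
print(edits[0].new_text)  # "Some text with the typo."
```

Because every stream emits the same record shape, the front end and the agreement system never need to know which tool produced a given suggestion.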
- Design a mobile front end
- The front end will consume one or more streams of suggested edits and offer a highly simplified interaction to end users. Actions could differ between types of streams, or all streams could be written to allow for a small set of actions. For example, we could have 4 basic actions: approve, decline, edit and view. The data required by the front end should likewise be standardized, perhaps requiring a parsed snippet of the result, the article title, and/or a diff between revisions. We can give editors on mobile a potentially fun way to make changes to Wikipedia with very limited commitment and time investment. Very simple streams will allow us to (potentially) let editors make changes with a gesture and get immediate feedback. By constraining the space for action, we can tweak components of the UI or the underlying challenges to try to make them more fun or more effective. The intent is to make this front end a mobile web application first, aiming for users looking to engage with Wikipedia in an ephemeral manner. A good mobile web application should work just as well on the desktop (and offer the same functionality), but the goal is to get the mobile interface right from the start. The mobile web is also a great match for tools with highly constrained actions: we can experiment with how to present the content, how to suggest actions, and how an editor sees feedback, and we can do so without causing information overload.
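A sketch of the four-action interaction model, assuming every stream item supports the same actions so the UI can map gestures to them without stream-specific logic. The action names come from the description above; the gesture mapping and the shape of the stored record are placeholders.

```python
# Minimal model of the standardized front-end interaction: four actions,
# applied uniformly to every stream item. Gesture mappings and the record
# returned by handle() are assumptions for illustration.

from enum import Enum

class Action(Enum):
    APPROVE = "approve"   # e.g. swipe right
    DECLINE = "decline"   # e.g. swipe left
    EDIT = "edit"         # open a small free-form editor
    VIEW = "view"         # show the full article context

def handle(item, action):
    """Return the record the back end would store for one gesture."""
    if action in (Action.APPROVE, Action.DECLINE):
        # Terminal decisions: record the vote and advance to the next item.
        return {"title": item["title"], "decision": action.value, "done": True}
    # EDIT and VIEW keep the item on screen for further interaction.
    return {"title": item["title"], "decision": action.value, "done": False}

item = {"title": "Example", "diff": 'teh -> the'}
print(handle(item, Action.APPROVE))
```

Keeping the action set this small is what makes it possible to A/B test presentation and feedback without touching the streams themselves.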
- Structure a testing or "agreement" system
- Because the error rate for suggested edits from any tool is non-negligible, we need some method for judging the acceptability of an edit before and after it is committed. This may be as simple as requiring more than one editor to recommend the same change before it is committed to the encyclopedia. Tracking agreement on specific edits could extend to streams or editors: if a particular stream produces low-quality results (much disagreement over identical edits), it can be flagged or removed; similarly, an editor who consistently disagrees with identical edits made by different editors can be given lower weight when tracking agreement. A testing system allows us to remain honest with ourselves and to test specific edits in a way that is totally transparent to the end user. For instance, when a stream generates a suggested edit we could present it to the editor for approval as-is, or reverse it (without letting them know) and suggest an edit we suspect is wrong. An editor swiping right for "yes" on everything will quickly fail that test, and we can use that to update their likely accuracy and prevent confirmation bias from swamping our system of agreement.
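The agreement mechanics above can be sketched in a few lines. This assumes two concurring weighted approvals before commit and a simple penalty for editors who approve known-bad "reversed" suggestions; the threshold, the weight adjustments, and the function names are placeholders, not a tuned design.

```python
# Sketch of the agreement system: weighted votes, a commit threshold, and
# honeypot suggestions (deliberately reversed edits) that calibrate editor
# weights. All numeric parameters are placeholder assumptions.

from collections import defaultdict

REQUIRED_AGREEMENT = 2.0  # total weighted approvals needed to commit

editor_weight = defaultdict(lambda: 1.0)   # editor -> current reliability weight
votes = defaultdict(list)                  # edit id -> list of (editor, approved)

def record_vote(edit_id, editor, approved, is_honeypot=False, honeypot_bad=True):
    """Record one decision; return True when the edit should be committed."""
    if is_honeypot:
        # A honeypot is a suggestion we reversed on purpose; approving a
        # known-bad edit costs the editor weight, catching it restores some.
        if approved == (not honeypot_bad):
            editor_weight[editor] = min(1.0, editor_weight[editor] + 0.1)
        else:
            editor_weight[editor] = max(0.1, editor_weight[editor] - 0.25)
        return False  # honeypots are never committed
    votes[edit_id].append((editor, approved))
    score = sum(editor_weight[e] for e, ok in votes[edit_id] if ok)
    return score >= REQUIRED_AGREEMENT

record_vote("edit-1", "alice", approved=True)        # first approval: not yet
print(record_vote("edit-1", "bob", approved=True))   # second approval: True
```

Down-weighting rather than blocking editors keeps the test invisible to the end user, as described above: the interaction looks identical whether the suggestion is live or a honeypot.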
Plan of action
The rough steps for a minimum viable product are:
- Port a single tool from a run on demand service to a stream.
- Build a front end tool to handle the output of a stream
- A command line tool for prototyping actions
- A modern web interface built for mobile first
- Build a back end service for managing distribution of streams, recording agreement between editors across edits (and within streams across editors), and scheduling the committing of edits
Once a spec is developed and a front end interface to consume streams is built, new tools can be ported by any editor and integrated transparently into the edits exposed to a mobile user.
Problems with technical solutions:
- Generating a stream of suggested edits from existing tools has an obvious problem: those same tools can be run on demand by other editors, or other editors could edit the articles between when a chunk of edits is suggested and when it is eventually committed. Resolving those conflicts has to be transparent (or preferably invisible) to the end user and not too computationally intensive (read: no repeated polling of articles).
- Tracking distributed changes is hard. It's a solved problem in general, so it's not a mystery, but it isn't trivial or easy by any stretch of the imagination.
- Porting tools designed to be run on demand over a single article into a generator that produces single edit suggestions suitable for quick consumption and action is also hard. For some tools it may be very difficult. For example, fixing all the hyphenation errors in an article in one edit is probably useful. Fixing one is likely less fun and definitely less useful.
- Tracking editors over time may be hard. If the front end is a native mobile application it gets easier but if the front end is a web interface (as I'm hoping will be possible) we may not be able to perfectly match humans to actions. We could allow editors to sign in with OAuth but I'm hoping the audience for this will largely be people without accounts so we can't rely on that.
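One possible shape for the conflict problem above, avoiding repeated polling: store the revision id each suggestion was computed against and do a single check at commit time. The decision names and record fields here are assumptions for illustration, not a worked-out design.

```python
# Sketch of stale-edit handling at commit time: compare the stored base
# revision against the article's current revision, and only rebase when
# the targeted text survives unchanged. Field and outcome names are
# placeholder assumptions.

def commit_decision(suggestion, current_revid, current_text):
    """Decide what to do with an approved suggestion at commit time."""
    if current_revid == suggestion["base_revid"]:
        return "commit"              # nothing changed; safe to apply
    if suggestion["old_text"] in current_text:
        return "commit-rebased"      # article changed, but target is intact
    return "requeue"                 # target gone; regenerate the suggestion

suggestion = {"base_revid": 100, "old_text": "teh typo"}
print(commit_decision(suggestion, 100, "Some text with teh typo."))   # commit
print(commit_decision(suggestion, 101, "Rewritten text, typo fixed."))  # requeue
```

This costs one article fetch per committed edit rather than continuous polling, at the price of occasionally requeueing suggestions whose target text has moved.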
Problems without technical solutions:
- I don't know yet who gets the "credit" for making a suggested change. We cannot commit edits from multiple people under a single account, as that runs afoul of the foundation's policy on shared accounts. We also probably don't want to take away credit from an editor who made a change if we don't have to.
- Agreement may not be a very informative measure of an edit's worth. Semi-automated edits could be independently viewed as valuable but seen as problematic depending on factors well outside the scope of a tool like this. For example, if we had a stream which converted British English to American English, editors could disagree on even identical edits, and edits which had agreement could be unacceptable to the community if the article in question is not one that should be converted from one variation of English to another. Part of the research phase will entail prototyping various solutions to this problem and engaging the community on which may work best.
- Saying "yes" or "no" may not be fun or interesting for editors.
Budget breakdown
- Development: 9500 USD
- Front end design (or other consultation): 1500 USD
- Device testing (at least one android phone and tablet): 500 USD
- Additional test devices, hosting and (potentially) compute time: 500 USD
- Submission/travel to GLS or GDC: 1000 USD
Early in the project we'll attempt to bring the community in on how best to judge and commit edits to the encyclopedia using this tool. We know from the start that mobile edits and tool edits are sometimes a cause of frustration to community members, and we will seek input on how the intersection of those two can reduce, rather than add to, those problems.
Before producing a prototype we hope to speak with editors (at meetups and online) about which sorts of edits would be appreciated most by the community and how best to ensure edits can be quickly assessed and streams updated if necessary.
We hope to offer (via the assessment system) feedback to tool authors on which edits suggested by the tools are rejected by users or which sorts of articles are particularly difficult for those tools to handle, upstreaming these suggestions into open source tools.
We would also like to bring the larger game development community into the project once a fully functional prototype exists. Outside the Wikimedia movement there is considerable interest in how games or game-like structures interact with Wikipedia, and feedback from these communities on different suggested edits will be valuable.
Regardless of the success of the project as a whole, the specification and examples for streams of edits generated from tools which are normally run one-off or in batch mode will allow tool authors and community members to integrate more tools into new and interesting front ends (think Huggle, but for fixing reference errors).
The assessment system could also be used for any stream of suggested edits (including assessments of vandalism) to update estimates of accuracy or correctness for many different kinds of tools.
Given a specification for streams of suggested edits, the project could be expanded by any interested editor, adding new streams of edits transparently to the end user of the mobile product.
Measures of success
- 1500 users of the tool by the end of a trial period
- 10 distinct types of micro-edit streams
Primary developer: Adam Hyland (protonk)
- I'm a web developer with experience building web applications for desktop and mobile, as well as a Wikipedia editor with 6 years of experience (a few of which were as a heavy tool user)
- I also have a background in statistics (and have worked as a consultant on statistical and data visualization projects)
Please paste links below to where relevant communities have been notified of your proposal, and to any other relevant community discussions. Need notification tips?
Do you think this project should be selected for an Individual Engagement Grant? Please add your name and rationale for endorsing this project below! (Other constructive feedback is welcome on the discussion page).
- I fully support this proposal. WMF is just starting to explore microcontributions and we need to engage with the broader editor and research community to set up the microcontribution program for success from as many different angles as possible, to understand what works and what doesn't. Dario (WMF) (talk) 17:51, 20 October 2014 (UTC)