Grants:IEG/The Wikipedia Adventure/Final

From Meta, a Wikimedia project coordination wiki
Individual Engagement Grants This project is funded by an Individual Engagement Grant




Summary

Instead of frustrating and difficult, could learning to edit Wikipedia be engaging and fun? This project set out to answer that question and experiment with one approach to doing so. The method we used is 'gamified onboarding' through an interactive guided tour, a journey through Wikipedia. Here's what we found:

  • TWA players made more edits: New editors who played TWA made 1.2x more edits than a control group of similar but non-invited new editors. Players made 1.9x more edits than those who were invited but did not play the game.
  • TWA players were more likely to make 20+ edits: TWA players were more likely (1.2-1.7x) to make 20+ edits than either control group. TWA players were also more likely to make 0 edits than the control groups, however.
  • Players who finished the game made the most edits: Players who completed the game made 3.2x more edits than those who only started the first level of the game, and were 2.9x more likely to make 20+ edits.
  • Players enjoyed the experience and felt more confident: 87% of players surveyed were satisfied or very satisfied overall with the game. 89% said 'TWA made me more confident as an editor.' TWA player: "It really left me feeling prepared to make future edits." 89% said, 'Lots of new editors should be invited to play TWA.'

A possible explanation for these findings: new editors who play TWA appear to be making their test edits within the game rather than on articles. Those who continue to edit Wikipedia after playing demonstrate the confidence of a more highly active editor.

Part 1: The Project

Methodology

The notion of turning Wikipedia into a game might sound like an inappropriate idea to many, so I was sure to draw a clear line between a game that teaches and a game which actually edits. The Wikipedia Adventure exists in its own world--actually in the user's own user-space--and it's neatly marked off as separate from Wikipedia's real articles...


Activities

The game went through many phases, from brainstorming to bug fixing and beta testing. Here's what we did:


Outcomes and impact

"Well, what's there not to like, or to have an opinion on...the game is great, most-of-all for us users that are just starting up in Wikipedia."

Quantitative analysis

We analyzed 3 groups of 165 editors each in November/December 2013. All were selected by Snuggle's desirability algorithm as likely good-faith contributors. 10,000 editors were invited to play via a mass talk page message.

  1. Control group A was not invited to play
  2. Control group B was invited but did not play
  3. TWA Player group was invited and did play

Editors in all groups made 1 edit before they were sampled and at least 1 edit afterwards, to ensure that they didn't just wander away. In other words, group 2 actually saw the invites, and group 3 made an edit after the game as well. We found that:

  • TWA players made more edits: New editors who played TWA made 1.2x more edits than a control group of similar but non-invited new editors. Players made 1.9x more edits than those who were invited but did not play the game.
  • TWA players were more likely to make 20+ edits: TWA players were more likely (1.2-1.7x) to make 20+ edits than either control group. TWA players were also more likely to make 0 edits than the control groups, however.
  • Players who finished the game made the most edits: Players who completed the game made 3.2x more edits than those who only started the first level of the game, and were 2.9x more likely to make 20+ edits.

A possible explanation for these findings: new editors who play TWA appear to be making their test edits within the game rather than on articles. Those who continue to edit Wikipedia after playing demonstrate the confidence of a more highly active editor.

A note on the analysis: we ran this data after a very brief amount of time--barely 4 weeks. A fuller analysis over a longer timeframe, which we will conduct by Spring 2014, will be more robust and yield more meaningful signals. (We plan to conduct statistical significance tests and report on editor retention and edit persistence.)
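
The significance testing planned here could start with something as simple as a two-proportion z-test. The sketch below is illustrative only, not the analysis that was run: it compares the share of editors reaching 20+ edits among TWA players and the non-invited control group, using the counts from the 'Total average edits' table. The group sizes (162 and 154) are an assumption, inferred from that table's counts and percentages rather than stated in the report, and the function name is hypothetical.

```python
import math


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Return (z, two-sided p-value) for H0: both groups share one proportion."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    # Pooled proportion under the null hypothesis of no difference.
    pooled = (successes_a + successes_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Editors with 20+ edits: TWA players vs. Control A (not invited).
# Group sizes are inferred from the table's counts and percentages (assumption).
z, p = two_proportion_z_test(23, 162, 18, 154)
print(f"z = {z:.2f}, p = {p:.3f}")
```

A small p-value would suggest the difference in 20+ edit rates is unlikely to be due to chance alone; a formal analysis would also need to account for how the groups were selected.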

Total average edits

| Group | Average edits | Editors with no edits | No edits % | Editors with 20+ edits | 20+ edits % | Median edits |
|---|---|---|---|---|---|---|
| Control A, not invited | 12.11 | 13 | 8.44% | 18 | 11.69% | 3.00 |
| Control B, invited non-players | 8.05 | 21 | 13.55% | 11 | 7.10% | 3.00 |
| TWA players | 15.57 | 63 | 38.89% | 23 | 14.20% | 2.00 |


Editors who played TWA made 1.2x more edits than the pure control group and 1.9x more edits than the invited group that did not play. Players who made it to Mission 7 made 3.2x more edits than those who only started Mission 1. Those who played were 2.8 to 4.6 times more likely to make no edits, but 1.2 to 1.7 times as likely to make 20+ edits. TWA finishers were 2.9 times as likely to make 20+ edits as those who only started TWA.

| Comparison | Average edits | No edits | 20+ edits | Median edits |
|---|---|---|---|---|
| Players vs. non-invited control | 1.2x | 4.6x | 1.2x | 0.67x |
| Players vs. invited non-players | 1.9x | 2.8x | 1.8x | 0.67x |
| TWA finishers vs. TWA starters | 3.2x | 0.54x | 2.9x | 4.0x |
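
The multipliers in this comparison are ratios between group-level figures. Here is a minimal sketch of that arithmetic, assuming the rounded values from the 'Total average edits' table as inputs; the dictionary keys and the `multiplier` helper are hypothetical names. Because the inputs are rounded, the recomputed ratios need not match the reported multipliers exactly.

```python
# Rounded figures copied from the "Total average edits" table.
groups = {
    "control_a": {"avg_edits": 12.11, "no_edits_pct": 8.44, "twenty_plus_pct": 11.69},
    "control_b": {"avg_edits": 8.05, "no_edits_pct": 13.55, "twenty_plus_pct": 7.10},
    "players": {"avg_edits": 15.57, "no_edits_pct": 38.89, "twenty_plus_pct": 14.20},
}


def multiplier(metric, group, baseline):
    """Ratio of a metric in `group` to the same metric in `baseline`."""
    return groups[group][metric] / groups[baseline][metric]


for baseline in ("control_a", "control_b"):
    for metric in ("avg_edits", "no_edits_pct", "twenty_plus_pct"):
        ratio = multiplier(metric, "players", baseline)
        print(f"players vs {baseline}, {metric}: {ratio:.2f}x")
```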


Article space edits

| Group | Average edits | Editors with no edits | No edits % | Editors with 20+ edits | 20+ edits % | Median edits |
|---|---|---|---|---|---|---|
| Control A, not invited | 7.68 | 37 | 24.03% | 14 | 9.09% | 2.00 |
| Control B, invited non-players | 5.72 | 46 | 29.68% | 8 | 5.16% | 2.00 |
| TWA players | 10.51 | 78 | 48.15% | 18 | 11.11% | 1.00 |


The trends identified above were equally present when looking only at article-space (NS0) edits. Players made 1.2-1.9 times more edits on average, but only half the median edits. Again, players were 1.5 to 2.1 times as likely to make no edits, but also 1.2 to 1.8 times as likely to make 20+ edits. Those who completed the game were far more productive than those who only started it, with 2.4x average edits and 1.5x median edits.

| Comparison | Average edits | No edits | 20+ edits | Median edits |
|---|---|---|---|---|
| Players vs. non-invited control | 1.2x | 2.1x | 1.2x | 0.5x |
| Players vs. invited non-players | 1.8x | 1.6x | 1.8x | 0.5x |
| TWA finishers vs. TWA starters | 2.4x | 0.64x | 2.4x | 1.5x |


Talk page edits

"I didn't know there was talk and discussion among users until I played the game...I just thought you could make comments and report on individual pages."

| Group | Average edits | Editors with no edits | No edits % | Editors with 20+ edits | 20+ edits % | Median edits |
|---|---|---|---|---|---|---|
| Control A, not invited | 3.06 | 121 | 78.57% | 5 | 3.25% | 0.00 |
| Control B, invited non-players | 1.26 | 120 | 77.42% | 2 | 1.29% | 0.00 |
| TWA players | 2.59 | 114 | 70.37% | 6 | 3.70% | 0.00 |


Most editors did not go on to make any talk page edits, but there were still some interesting trends. Editors who were invited and played made 1.7 times as many talk page edits as those who were invited but did not play, and those who completed mission 7 made 3x as many talk page edits as those who completed only mission 1. Again supporting the hypothesis of deepening prolific engagement, those who were invited and played were 2x as likely as invited non-players to make 20+ talk page edits, and those who completed mission 7 were 4x as likely to make 20+ talk page edits. It's worth keeping in mind that the median number of talk page edits for all groups was 0.

| Comparison | Average edits | No edits | 20+ edits | Median edits |
|---|---|---|---|---|
| Players vs. non-invited control | 0.88x | 0.89x | 1.2x | 0.00 |
| Players vs. invited non-players | 1.7x | 0.91x | 2.0x | 0.00 |
| TWA finishers vs. TWA starters | 3.0x | | 4.0x | 0.00 |


Number of articles

| Group | Average # articles | Editors with 10+ articles | 10+ articles % | Median # articles |
|---|---|---|---|---|
| Control A, not invited | 2.83 | 4 | 2.60% | 1.00 |
| Control B, invited non-players | 2.14 | 6 | 3.87% | 1.00 |
| TWA players | 5.38 | 20 | 12.35% | 1.00 |


The number of different articles edited showed a pattern of increased variety in editing targets by players of the game. Those who played edited 1.6 to 2.6 times the number of different articles, on average. Editors who played were also 3 times as likely to edit 10+ articles. Again, it's worth noting that the median number of articles edited for all groups was low, only one.

| Comparison | Average # articles | 10+ articles | Median # articles |
|---|---|---|---|
| Players vs. non-invited control | 1.6x | 3.3x | 1.0x |
| Players vs. invited non-players | 2.6x | 2.7x | 1.0x |
| TWA finishers vs. TWA starters | 2.3x | 2.9x | 1.0x |



Qualitative analysis

"I enjoyed the idea of editing a fake article for practice - in fact, when I first saw the game, I immediately hoped it would incorporate some sort of actual editing rather than just theory or questions or something."

We surveyed the 600 editors who made it at least to the first stage of mission 1. We sent these editors talk page invitations to a Qualtrics survey using EdwardsBot. 42 editors responded between December 23rd and January 4th.

Overall

  • 87% were satisfied or very satisfied overall, 6% dissatisfied or very dissatisfied
  • 89% said, 'TWA made me more confident as an editor', 3% disagreed
  • 89% said, 'TWA helped me understand Wikipedia better', 3% disagreed
  • 77% said, 'TWA made me want to edit more', 6% disagreed
  • 79% said, 'TWA made me feel welcomed and supported', 3% disagreed
  • 71% said, 'TWA helped me know what to do next', 9% disagreed
  • 80% said, 'TWA prepared me to be a successful contributor to Wikipedia', 6% disagreed
  • 75% said, 'I enjoyed playing it', 6% disagreed
  • 89% said, 'The game is a good way to introduce new editors to Wikipedia', 3% disagreed


[Chart: Overall, how satisfied were you with The Wikipedia Adventure?]
[Chart: Do you agree or disagree with these statements about The Wikipedia Adventure?]


Educational effectiveness

"TWA was very informative and helped pull back the curtain on some of the fundamentals of editing."
  • 92% thought the educational aspects were useful or very useful; 3% useless or very useless


[Chart: Overall, how effective was The Wikipedia Adventure as an educational tool?]
[Chart: How effective were these specific educational aspects of The Wikipedia Adventure?]


Design satisfaction

"I've seen and heard companies, including my own, talk about learning through 'gamification'. I found TWA to be the best example of gamification I have witnessed to date."
  • 83% were satisfied or very satisfied with the design, 6% dissatisfied or very dissatisfied
  • 76% said the gamification elements were effective or very effective, 6% ineffective or very ineffective
  • 70% liked the design as it was and did not want it to have a more 'serious' design, 14% wished it was more serious

The hypothesis we set out to test was that play could be thoughtful and that fun could yield meaningful experience and education. The survey data supports this hypothesis.

We also aimed for a target demographic of college-aged men and women. Respondents most often named that age group as the one the game was appropriate for, so it looks like we aimed right. It's also worth noting that the bell curve was fairly 'thick' around this demographic, and survey respondents thought TWA would be appropriate for many age ranges, especially those 13-29 (but also those younger than 13 and 55+).

[Chart: How satisfied were you with the design of The Wikipedia Adventure?]
[Chart: How effective were the gamification elements of The Wikipedia Adventure?]
[Chart: What age demographic would The Wikipedia Adventure be appropriate and effective for?]
[Chart: Would you have preferred if The Wikipedia Adventure had a more 'serious' tone and design?]


Player demographics

  • Respondents came from a number of countries: Australia, Bangladesh, Brazil, Canada, Estonia, Hong Kong, India, Ireland, Nigeria, Portugal, Singapore, Sweden, Macedonia, US, and UK. Although globally diverse, the majority of players still came from US/UK.
  • Respondents reflected the discouraging gender gap: 11% were women. In alpha testing, some editors suggested that the game was 'male' or 'geeky'. We never thought space and galactic carnival fireworks were particularly gendered, and the 4 survey responses from women supported that--all rated the game and its design 3 out of 5 or above, mainly 4 out of 5.
  • 1 editor who was very dissatisfied with the game had 100,000+ edits. This is too small a sample to draw any conclusions, but it nonetheless reinforces the notion that TWA is not for everyone. Indeed, it is specifically for new editors. New editors may well have different needs than experienced editors, and we need to be mindful that what an editor with 100,000+ edits finds 'insulting and imbecilic' may be just what a new contributor needs to embark down the path towards becoming a prolific contributor.
  • 94% of survey respondents had about 100 or fewer edits, suggesting our sample was not biased away from the target demographic of new editors.


[Chart: How old were the players of The Wikipedia Adventure?]
[Chart: How many edits had players of The Wikipedia Adventure made when they took the survey?]


Possibilities for expansion

"I think TWA at the moment is a great stepping stone for new users such as myself. I would love to see it expand to include more 'advanced' topics that can be optionally covered by the user."
  • 64% said, 'I wish there was more of it', 11% disagreed
  • 89% said, 'Lots of new editors should be invited to play TWA', 3% disagreed


[Chart: Would you be interested in playing more levels of The Wikipedia Adventure?]



In their own words

What they liked

Editors found TWA interactive, educational, friendly and usable, gamified, uniquely and well-designed, reassuring...
"As always with new software, you need to be walked through the features: the game worked well in this context and was useful. Covering all (most) areas is the best bit."

"The interactiveness of The Wikipedia Adventure was an easier and better way to learn the basics of Wikipedia versus trying to run around to different pages and just reading about it." "Well done basic introduction to Wikipedia." "Simple, easy to use."

"It's simple for new Wikipedia editors like me trying to learn the basics of Wikipedia."

"Informative and fun"

"The conversational tone is pretty good, It makes it fun even if it's all pretty simple."

"Gave a very good, brief introduction to editing."

"It made stuff easy to understand."

"Fun and intuitive game." "I completed the entire game because it wasn't as dry as other training tools out there."

"Really enjoyed the entire editing process. A lot of thought was given to it, including the fictional users who guide you through the process. It all flowed very well and was highly educational. I also thought that awarding badges was a nice touch too."

"It was all beautifully designed. I enjoyed aspects such as the challenges and badges that made it feel more like an educational tool or game rather than a lecture, and recorded your achievement to date."

"It was a nice length and about right level of seriousness for me." "I liked it, very different."

"Designed great, easy to use because of that."

"I think the Adventure should be kept just as playful. It's definitely not serious, but that's not a bad thing. Maybe it might be off-putting for someone who thinks it's too "cheesy" or doesn't like its tone, but I certainly enjoyed it and I'd bet a lot of other people would, too."

"It really left me feeling prepared to make future edits."


What they didn't like

A minority of players found the game too silly for their taste, more appropriate for youth, or just plain insulting.
"It was kind of cheesy but kept my attention."

"A little too silly for my liking, but it's probably great for young editors."

"Too long-winded and geeky"

"The badges are kinda neutral but the whole thing works very well."

"The forced badges I had to edit out of my talk page were a bit annoying."

"I disliked the way that participants are addressed, as if imbeciles."

"I dislike it because it's like a kid's comic book - first impressions are everything - and I did not like it from the point the big alieny picture arrived."

"I wouldn't disagree with maybe a separate, more formal introductory "page"."

"Maybe split into young-adult and adult streams?"

"Personal I think if you replace the space unicorn with a crashed rocket it would be good for anyone with a sense of fun."

"Even though it felt somewhat silly playing game, it was a good learning experience. It should be serious and have more techniques and points for serious editors."

"Where's the 0-2 category - that is the appropriate level for this stuff - it's at the level of Teletubbies"

"Maybe age specific versions? Space for kids, a fictional vampire wiki for teens, university class stuff for students, etc, etc?"

"I still don't know what the blue guide creature is."

"I wish it gave the impression that editors were expected to be mature and intelligent, rather than idiots who could be entertained an educated with this kind of drivel."

I'd like to note that the most negative feedback consistently came from one respondent who had 100,000+ edits. While I do not discount their points--echoed by earlier design debates about the game's playful or even youthful nature--it needs repeating that the target for the game is new editors, and these contributors are different and have different needs than experienced contributors.

What they wanted more of

Many asked for advanced levels that focus on more specific and sophisticated skills.
"I wish there would have been a misisons #8, #9 and #10 involving the use of templates, inserting into edits and how to learn and distinguish when to use which. Other elements that would be handy, maybe in an advance mission, would be: tables, logic, advanced programming, advanced templates and formatting."

"May be worth including some information about the policy regarding not editing on behalf of an organisation you belong to. As well as a some additional missions covering what notable enough to be included in Wikipedia."

"Maybe it could extend to more complex rules - when I signed up, I found surprisingly few links to policies or guidelines. For instance, one thing that could be included would be something about red links: I was very surprised to find out they are not only allowed but encouraged, and only found that out at all when someone reverted one of my edits."

"Maybe it should mention how to find sources."

"I think TWA at the moment is a great steeping stone for new users such as myself. I would love to see it expand to include more 'advanced' topics that can be optionally covered by the user. I think these topics should definitely cover how to code mathematical expressions, how to find proper references externally and cite them, more detail on how to structure/format Wikipedia articles"

"I would love more advanced missions. I can't help but feel that, as a beginning editor, my work is barely tolerable and likely filled with flaws or missing elements which could make it better. I know looking at other articles, especially those highlighted on the main page, gives me ideas and allows me to see examples of good work but the mission was an excellent jump start"

"As I said before please cover what is notable enough to be included as an article to Wikipedia. Might be worth covering whether photos on things like facebook and instagram are considered free to be included in Wikipedia, i.e. copyright issues. Optional advance information in regards to structuring certain types of articles i.e. TV, music/dance groups, films, etc."

"Formats of different articles and how to raise doubts/ask for references on other articles."

"Adding images to the summary of an article and the understanding of the ideal layout for an article."

"How to add photos and how to best interact with other editors when there is a dispute about content. Maybe the more inner workings of Wikipedia too - for example, how and why do some editors have more authority over content than others."

"Include ways to help practice basics, because many things tend to be washed away without practice."

"Eh? Maybe talk page debate basics, avoiding straw men and the like."


I want to call out one specific suggestion:
"I would like there to be a mentor aspect to Wikipedia. Sometimes I find that I'm not sure of the best way to edit something - it would be great to be assigned a mentor once you complete the Wikipedia Adventure that you could bounce ideas off of and who could give you ideas for pages to edit."
I can't help but note that this winter Jackson Peebles' Reimagining Wikipedia Mentorship grant proposal would likely have been funded, were it not for his passing. This is an incredibly fruitful area for future improvement and we should not pass up the opportunity to reinvent it.

Progress towards stated goals and targets

Actual outcomes against the planned measures of success (targets):

  • Achieved. Explanation: The script was usability tested in 10 needfinding interviews and alpha-tested with nearly 100 full game testers.
  • Achieved. Explanation: Survey feedback on qualitative game aspects delivered to 600 editors with 6.7% response rate.
  • Almost all achieved. Explanation: We invited 10,000 editors to play and created 2 control groups to compare against. We collected edit activity data. Data on retention is still pending as we didn't have long enough after the beta-test to collect medium-term statistics. These will be done in the next 1-2 months.



Strategic impact

  • We increased PARTICIPATION: players made 1.2x more edits on average than the non-invited control group and 1.9x more than invited non-players, and players who finished the game made 3.2x more edits than those who only started it. The sizeable growth of editors who made 20+ edits suggests TWA may be an ideal tool to speed up the onboarding of future prolific contributors.



Key Learnings

What worked well

  • It was invaluable to ground the work of this project in respected literature on the topic of games, new forms of narrative fiction, and motivation. Pairing this theoretical background with hands-on interviews with new editors helped ground the game in practical knowledge.
  • The game was scoped too large at first and benefited from very significant cuts and editing. The decision to cut down the game from 12 to 7 levels made the project manageable and ensured a scope that could be both built and played in a reasonable timeframe. The best decision made, upon a recommendation from Siko Bouterse, was to prototype the entire revised script in an interactive text program called Twine. This allowed usability testing and major script revision before having to deal with the technical complexities of JavaScript in Guided Tours.
  • Reaching out for help from experts has been encouraging. Speaking with folks from the gamification industry gave broader context to the purpose of gamification, as well as its pitfalls. Indeed, one of the most important learnings was to be very careful with gamification, to ensure that it always enhances intrinsic motivations of users and never trivializes or skews them.
  • Working directly with experienced designers made the creative brainstorming and implementation easy; it was well worth the expenditure to hire someone who was talented and able to communicate and iterate on designs.
  • Finding technical folks who can code, even paid contractors, who are also familiar with MediaWiki is hard. Tapping into the WMF tech community liaisons and grant staff has so far been much more practical than trying to rely solely on volunteers. It appears that if you need something technical done on any reliable timeframe, you should be prepared to pay someone to do it.
  • Lessons from experts in coding made it possible to complete aspects of the project in hours that otherwise would have taken days or weeks. The group of talented people that has helped or consulted on this project has made it far better than it would have been if approached alone.
  • Alpha testers can be very fruitfully motivated with an expressed intent to acknowledge their help and give barnstars. Quick responses and thorough bug-tracking make testers feel their efforts are impactful and well-heard.
  • Before making a quantitative plan, it's essential to craft a comparable control group against which you can evaluate impact. Without this, any gains you make are meaningless to interpret since you don't know 'compared to what?'
  • Surveys need to be short enough to be completed, but thorough enough to specifically answer the kinds of questions you want input on. It's great to have logic-based follow-up questions to elicit free text quotes that illustrate the overall numbers.


What didn't work well?

  • The project has proceeded smoothly; the main obstacle has simply been working on so many different pieces at the same time. While one mission is being built, another is being refined, and bugs in code are discovered along the way, needing replacement throughout other levels.
  • While Guided Tours is an excellent feature, it is subject to the strict constraints of JavaScript and can be 'broken' by any number of minor errors.
  • Time management is a puzzle. You get more done than you think is possible, yet it all takes longer than expected. Problems arise that you didn't foresee, both creative and technical, and they don't resolve predictably. The keys have been to always keep working on something, even if it's not the exact piece you need, and, if you're stuck on something for more than a few days, to ask someone who knows more about it than you for help.
  • Working with a creative designer takes a little bit of feeling out at first. The best advice received early on was to not assume creative people can read your mind, and to clearly but kindly point out what you want, what you like, what you don't like as much, and what you need done by approximately when.
  • Making internally set deadlines doesn't always work out as planned. But deadlines have proved helpful even if they're flexible, and you miss them sometimes. They at least keep you tethered to a progressing timeline.
  • Building a real thing is more like a painting than an assembly line. You have a sketch. You erase and redraw. You lay down base colors. You get perspective, and then go back in and reshape. You add details to different parts of the canvas at different times. It comes together in series as much as in sequence and the sense that the final destination is reachable doesn't really come together until the majority of the work is already finished.
  • A game like this is never actually finished. To this day there are about 15 lingering bug reports, most very low priority. There are new interface components to update the text for. There is deeper data analysis that can be done. There's the possibility of both expanding the game and creating alternate versions of it with a different tone and theme. This is a good thing; nothing worth doing is ever really finished.



Next steps and opportunities

Higher priority
  • Conduct round 2 statistical analysis. Move to formal statistics, calculate p values and significance. Include a longer timeframe. Calculate editor retention. Conduct a revert analysis to see if player contributions are more/less lasting. Run algorithmic tests on players to quantify quality of contributions. (Will be completed by Spring 2014)
  • Work with the community to build consensus and develop a strategy to promote TWA to more new editors.
  • Build a kit to assist in translating TWA into other languages.
Lower priority
  • Develop an alternate version of TWA geared towards an audience of global academic professionals.
  • Develop the pipeline between Getting Started, TWA, Teahouse, and a reimagined mentorship project.
  • Update the script for VisualEditor when deployed (and later for Flow)
  • Write up a scientific research paper and submit it to a Computer-Human-Interaction conference
  • Present findings from TWA at game and gamification conferences
  • Promote the game on social media and in relevant press outlets.
  • Incorporate minor tweaks and suggestions into script.
  • Build additional levels per suggested topics
  • Incorporate TWA learnings into the Learning Patterns Library



Project resources



Part 2: The Grant

Finances

Actual spending

| Expense | Approved amount | Actual funds spent | Difference |
|---|---|---|---|
| travel to attend GSummit conference | $1400 | $439.80 on flight, $395 on registration | -$565.20 |
| expert consultation (Nischay's API code) | $400 | $400 | 0 |
| books and literature | $200 | $208.37 | +$8.37 |
| user interaction and graphic design contracting | $3000 | $3000 | 0 |
| project management and implementation by project lead | $6667 (increased from original $5000, per extension request) | $6667 | 0 |
| Total | $11667 | $11110.17 | -$556.83 |
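
The totals in this table can be cross-checked with a few lines of arithmetic. This is a sketch for verification only: the dictionary keys are shorthand labels introduced here, and the amounts are copied from the spending table.

```python
from decimal import Decimal

# (approved, actual) per line item, copied from the spending table.
line_items = {
    "travel": (Decimal("1400"), Decimal("439.80") + Decimal("395")),
    "expert consultation": (Decimal("400"), Decimal("400")),
    "books and literature": (Decimal("200"), Decimal("208.37")),
    "design contracting": (Decimal("3000"), Decimal("3000")),
    "project management": (Decimal("6667"), Decimal("6667")),
}

approved_total = sum(approved for approved, _ in line_items.values())
actual_total = sum(actual for _, actual in line_items.values())
unspent = approved_total - actual_total

print(approved_total, actual_total, unspent)  # prints: 11667 11110.17 556.83
```

The unspent amount equals the $565.20 travel underspend less the $8.37 overspend on books.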

Remaining funds

Do you have any unspent funds from the grant?

Please answer yes or no. If yes, list the amount you did not use and explain why.

  • Yes, $556.83, because travel to and registration for the gamification conference was discounted.

Documentation

Did you send documentation of all expenses paid with grant funds to grantsadmin at wikimedia.org, according to the guidelines here?

Please answer yes or no. If no, include an explanation.

  • Yes


Confirmation of project status

Did you comply with the requirements specified by WMF in the grant agreement?

Please answer yes or no.

  • Yes

Is your project completed?

Please answer yes or no.

  • I completed all I set out to do in this grant. There are still areas for growth and continued development that may extend beyond this grant.



Grantee reflection

I'm just impressed by how many hands were needed to make this come together. I pretty much just want to say thanks for making something possible that I really put a lot into: Credits and thanks.