Research talk:How role-specific rewards influence Wikipedia editors’ contribution

From Meta, a Wikimedia project coordination wiki

Why remove admins and people who previously received a barnstar?

I'm not sure I see that explained. --Halfak (WMF) (talk)

@Halfak (WMF) Good point. Here, we want to focus only on the effect of receiving a first reward. This lets us control for other factors and keeps our study comparable to prior work [1][2] that tested the effect of receiving a first barnstar. --Diyiy (talk)

It matters who gives you a barnstar

It looks like Diyiy will post the barnstar messages. When I receive a barnstar, I certainly check who awarded it when interpreting what it means to me. How do you expect the person awarding the barnstar to affect the results? Maybe people won't take Diyiy's barnstars seriously. --EpochFail (talk) 16:04, 3 December 2018 (UTC)

@Halfak (WMF) Yes, we won't post the messages using the account of Diyiy (talk). As we discussed over email, we would like to collaborate with some highly experienced Wikipedia editors and ask them to help us post the barnstar messages. If you could point us to some editors or to ways of recruiting editors, that would be great! We're also considering posting the messages using Robert Kraut's account. --Diyiy (talk)

Giving barnstars to editors who deserve them

How will you ensure that the editors in fact deserve the barnstars you would award them? Can you produce a pilot sample of editors who might be identified for such rewards? How will you avoid vandals and other types of problematic editors? --EpochFail (talk) 16:05, 3 December 2018 (UTC)

@Halfak (WMF) We will only give the barnstars to editors who are among the top N% (N could be 1 or 5) most productive editors in the last month. We have a list of editors who are candidates for such rewards and will share it with you soon. We selected editors based not only on their number of edits but also on their type of edits. We used the semantic edit intention taxonomy [3] to predict the semantic intention of every edit by each editor. If any of an editor's edits were predicted to be vandalism, we removed that editor from our candidate list. We'll also double-check whether an editor has any vandalism edits in the last month using Wikipedia Vandalism API. --Diyiy (talk)
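The selection rule described in this reply could be sketched roughly as follows. This is only an illustration under stated assumptions, not the study's actual pipeline: the `Editor` record, its field names, the `"vandalism"` label string, and the choice to rank first and filter afterwards are all hypothetical.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Editor:
    # Hypothetical record; the field names are assumptions for illustration.
    name: str
    edits_last_month: int
    predicted_intentions: list = field(default_factory=list)


def select_candidates(editors, top_percent):
    """Return editors in the top `top_percent` by last month's edit count,
    dropping anyone with at least one edit predicted as vandalism."""
    if not editors:
        return []
    # Rank by productivity, most active first.
    ranked = sorted(editors, key=lambda e: e.edits_last_month, reverse=True)
    # Keep at least one editor so that tiny samples do not yield an empty cut.
    cutoff = max(1, math.ceil(len(ranked) * top_percent / 100))
    top = ranked[:cutoff]
    # Assumed label name for edits the classifier flags as vandalism.
    return [e for e in top if "vandalism" not in e.predicted_intentions]
```

Calling `select_candidates(editors, 5)` would correspond to the N = 5 case mentioned above. Whether flagged editors are removed before or after the percentile cut changes the resulting list, so that ordering would need to be fixed in the real procedure.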
I think the main problem of this proposal is/was that it would basically generate a number of "fake", insincere or false barnstars (cf. the comment by Bus stop). Although users give barnstars without any consistency, to emulate a real-life barnstar you would need to "fake" it really well, e.g. find a suitable very experienced editor A, an editor B who trusts or admires A, a specific merit of B that A is considered qualified to judge, and a reasoning that describes such merit so that you have a suitable text to deliver.
You cannot just generate such awards manually; you'd have to put enough tailor-made effort into each of them, let's say half an hour of experienced Wikipedian time for each. Only then will they be considered personal, and only then will the recipient attach emotional value to them. In the worst case, massified barnstars will just be perceived as spam: we have plenty of vandals who send random barnstars or "wikilove" messages with no comprehensible purpose other than trolling or other disruption.
All in all, the only way I could see this working is that you make a list of users whose editing pattern is very similar to that of other recipients of a certain barnstar, and then you privately send such data/suggestions to some users who have previously often granted that same barnstar. But then you will have no control, because the users not chosen for a barnstar maybe just aren't as deserving of it as the others.
Alternatively, it's possible to give awards based on objective criteria. Then if you arbitrarily exclude some from the list of recipients you can see whether that has an effect on them. Such arbitrary granting can only be performed by a user who is not perceived to be in any official or pseudo-official capacity, otherwise it will be felt as injustice from the "system" and backfire. There's a risk that the "control" would not be a control at all, because they would be affected by the fact of being arbitrarily excluded.
Some years ago I sent almost 40 Rosetta barnstars to the most active Meta-Wiki translators, with objective criteria and some manual polish based on the fact that I knew most of them. I would be interested in an analysis of any effect this might have had, e.g. compared to editors who were very similar (e.g. just below my activity threshold) but did not receive the barnstar. Of course that's a semi-massified barnstar so I expect it was relatively devalued by the recipients. Nemo 20:13, 3 January 2019 (UTC)
I now see that my comment is almost completely redundant with what was written already years ago. :) If you want to replicate or build on that experiment in a project other than the English Wikipedia, I think on the Italian Wikipedia there would be some interested people. Nemo 21:06, 3 January 2019 (UTC)
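The comparison suggested in this thread, recipients versus similar editors just below the activity threshold, could be sketched as below. It is a minimal illustration only: the dictionary keys (`received`, `before`, `after`), the `band` parameter, and the use of a simple mean difference are all assumptions, and a raw difference like this would not by itself establish a causal effect.

```python
from statistics import mean


def compare_near_threshold(editors, threshold, band):
    """Compare the average change in activity for barnstar recipients with
    that of non-recipients whose prior activity fell just below `threshold`.

    `editors` is a list of dicts with hypothetical keys:
    'received' (bool), 'before' and 'after' (activity counts).
    Returns (recipient_change, near_miss_change); either is None if empty.
    """
    recipients = [e for e in editors if e["received"]]
    near_misses = [
        e for e in editors
        if not e["received"] and threshold - band <= e["before"] < threshold
    ]

    def avg_change(group):
        if not group:
            return None
        return mean(e["after"] - e["before"] for e in group)

    return avg_change(recipients), avg_change(near_misses)
```

A narrower `band` makes the two groups more alike but leaves fewer editors to compare, which matters with only about 40 recipients.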

800 barnstar awards could be disruptive

How will you ensure that your awards don't cause a big event? Can you spread out the posting of the awards over time? E.g. only post 50 per day for ~ a week? --EpochFail (talk) 16:07, 3 December 2018 (UTC)

@Halfak (WMF) This is a great suggestion. We will post around 50 per day over one or two weeks for this barnstar experiment. --Diyiy (talk)
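As a rough illustration of spreading the awards out, the recipient list could be split into fixed-size daily batches. The function name and the default of 50 per day are assumptions taken from the figures in this thread, not an agreed posting schedule.

```python
def daily_batches(recipients, per_day=50):
    """Split `recipients` into consecutive batches of at most `per_day`,
    one batch per posting day."""
    if per_day <= 0:
        raise ValueError("per_day must be positive")
    return [
        recipients[i:i + per_day]
        for i in range(0, len(recipients), per_day)
    ]
```

With 800 awards and 50 per day this yields 16 daily batches, slightly more than two weeks, so either the daily rate or the time window would need adjusting.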


  1. Restivo, Michael, and Arnout van de Rijt. "No praise without effort: experimental evidence on how rewards affect Wikipedia's contributor community." Information, Communication & Society 17, no. 4 (2014): 451-462.
  2. Restivo, Michael, and Arnout van de Rijt. "Experimental study of informal rewards in peer production." PLoS ONE 7, no. 3 (2012): e34358.
  3. Yang, Diyi, Aaron Halfaker, Robert Kraut, and Eduard Hovy. "Identifying semantic edit intentions from revisions in Wikipedia." In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 2000-2010. 2017.

These are two hypotheses with different premises

There are two studies that can be done here:

  • You can analyse barnstars (and other awards) being handed out around the project and study their correlation with editor retention.
  • You can also give out new barnstars. However, as many have already pointed out on this page and the related en.wp discussion, this will be an artificial situation, because the recipients will be aware that they are part of an experiment.

You might want to look out for the w:Lucas critique of interventions. Deryck C. 18:59, 20 December 2018 (UTC)

@Deryck Chan: We plan to conduct both a correlational study (your first suggestion) and an experimental study (by giving new barnstars). When giving barnstars to editors, we will not tell people that they are part of an experiment. Since we have shared this proposal with the wiki community, we did include one exclusion criterion: we remove people who have been involved in any discussion of this proposed work. -Diyiy (talk) 02:24, 29 December 2018 (UTC)
@Diyiy: "we will not tell people that they are part of an experiment."
I am glad to see that at least on this occasion, you specifically acknowledge that you are conducting a social experiment, rather than hiding behind the broader and more cuddly term of "research".
I am also pleased to see that you acknowledge that your plan is based on deceiving Wikipedia editors, although you do not use that word.
I posted[1] at length about this on en.wp: here.
So I invite you to relabel your proposal, placing the words "social experiment" in the heading, and explicitly acknowledging in the lede both a) the lack of consent, and b) your sponsorship by Facebook. Your failure to prominently disclose that sponsorship is a very big breach of the principle of transparency, and I hope that you will remedy it promptly. --BrownHairedGirl (talk) 16:37, 30 December 2018 (UTC)

I think you're mis-defining copy-editing

Your "copy-editing" barnstar is going to say "the majority of them were copy editing edits fixing errors". That's only a tiny part of copy-editing Wikipedia articles. Speaking as someone who really got her start on Wikipedia as a copy editor (including copy-editing a lot of articles during or in preparation for featured or good article candidacies), the core of copy-editing is in things like:

  • improving readability
  • improving word selection
  • arranging the article to flow most effectively
  • considering the most beneficial use of reference sources
  • improving formatting and layout
  • ensuring that best practices are followed in accord with WP:MOS and other editorial policies and guidelines

It's not just fixing the typos and grammar mistakes - although most copy editors will start off there. But I have a feeling you're going to struggle to find enough editors meeting your criteria to come up with statistically significant results. Risker (talk) 01:55, 21 December 2018 (UTC)

@Risker: thanks for sharing this comprehensive definition of copy-editing. We propose to use "fixing the typos and grammar mistakes, improving the tone or punctuation" as a narrow definition of "copy-editing" for now. These are a subset of a copy editor's tasks, but they are among the more characteristic ones. Also, our proposed algorithms can predict edits that fix errors or improve punctuation with relatively high accuracy. However, detecting edits that ensure best practices or conformance with editorial policies is relatively challenging for current machine learning algorithms, because of the limited training corpus and the dependence on external policies. --Diyiy (talk) 02:35, 29 December 2018 (UTC)
Wow, way to miss the point. The copy-editor's barnstar is for copy-editing, not for minor edits like typo and grammar fixes. I *have* copy-editing barnstars. By handing out the same level of recognition to someone who makes a few minor edits, what impact will you have on those who worked hard to *earn* the respect of their peers by long hours of hard work? I personally find it offensive that you're equating minor edits with copy-editing; I know that probably makes me appear to be an elitist. But when Wikipedians have historically had to do something pretty significant in order to be granted a barnstar by a peer, why are you changing that expectation, and handing out a barnstar for trivial activity? Barnstars aren't participation trophies, and by abusing them in this way, you deprecate their use as recognition of important accomplishments. Barnstars are essentially the highest level of interpersonal recognition English Wikipedians give each other - the first level being the "thanks" button and the next level being "Wikilove", with personal talk page messages somewhere in the mix. How about you use one of the lower levels of recognition in your experiment, instead of mis-using barnstars? Risker (talk) 00:05, 2 January 2019 (UTC)


Many contributors already take risks to be involved in the projects; others use anonymous names for a reason. What mechanisms are there for people to opt out of the tracking of their activities? What mechanisms are in place to ensure the safety of editors, as well as ensuring that editors aren't outed by the experiment? What protections and support outside of Wikipedia will be provided should an editor be outed and pursued? Gnangarra (talk) 09:58, 1 January 2019 (UTC)

Validation of problematic editors

Previously banned editors have used sockpuppets to try to return to Wikipedia. As part of that process they make lots of small, seemingly constructive edits before returning to the areas that saw them banned. How will this ensure that rewarding active editors won't encourage and validate such activities? Also, how will Wikipedia admins be able to identify those awarding the barnstars? Awarding barnstars is another means previously used by sockpuppets to be seen as positive community members actively working to improve the projects. Gnangarra (talk) 10:04, 1 January 2019 (UTC)

Outcomes & Addictions

What actions will this experiment put in place to ensure that participants negatively influenced by the project will be supported? By negatively influenced I mean either driven away in fear of outing or driven to addiction. This indicates that Facebook supports and undertakes research to foster addictions. Given that it was revealed on the en.wikipedia village pump that you receive funding from Facebook, what assurance can you give that your work will not be used to cause harm and foster addictions? Gnangarra (talk) 10:09, 1 January 2019 (UTC)