Research talk:Using information sources to drive contributions to Wikipedia

From Meta, a Wikimedia project coordination wiki

More info

I'll list out some questions and requests below. Please feel free to respond inline or in a separate comment. This looks like an exciting project! --EpochFail (talk) 18:24, 22 June 2012 (UTC)

Thanks, EpochFail! Our responses are below, and the requests have been addressed via updates to our project page. WorldsApart (talk) 21:24, 22 June 2012 (UTC)

Questions

  • What sort of comparisons do you hope to make from your three randomly sampled groups of users (all, last 30 days, recent)?
    • We chose these three groups as a way of trying to get users from across the spectrum; frankly, we were inspired by the technique described here. We're interested in comparing all of our metrics across these groups, at least to the extent we have statistical power to do so, but honestly, we're mostly just interested in getting a good sample of engaged Wikipedia editors to work with. We suspect our tool will best fit people who are already interested in editing; it's not necessarily a newbie recruitment tool. We're certainly open to recommendations and suggestions about sampling techniques!
  • You're likely to get fewer responses from the "all" group since many editors will have ceased activity years ago. How do you plan to deal with this?
    • Yep, you're right. If we don't get enough, we'll sample more. We are starting off with a larger initial pool for the "all" group than for the other groups.
  • How many users will you need for each sample?
    • Obviously, the more the better -- but we'd like to see at least 50-60 editors total.
  • Will your analysis of the tool's usefulness suffer from editors discontinuing their use of the tool in the short term?
    • We don't think so. We believe our tool will actually be fun and motivating, and that's part of what we're hoping to find. If they stop using it, we'll be sad, but that's a real (and negative) result. Even if people hate us and stop using the tool before too long, if we measure a real effect from that short usage, that's still something.
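The sampling plan described in the answers above (three groups, a larger initial pool for the "all" group, and further sampling if too few editors respond) could be sketched roughly as follows. This is only an illustration: the pool sizes, the group keys and the `draw_batch` helper are assumptions, not part of the proposal.

```python
import random

# Hypothetical pool sizes: the "all" group starts larger because many of
# those editors stopped editing long ago and are less likely to respond.
INITIAL_POOL_SIZE = {"all": 300, "last_30_days": 100, "recent": 100}


def draw_batch(candidates, size, already_contacted, rng):
    """Randomly pick up to `size` editors who have not been contacted yet."""
    remaining = [name for name in candidates if name not in already_contacted]
    return rng.sample(remaining, min(size, len(remaining)))


# Usage: draw the initial "all" pool, then top it up with a second batch
# that excludes everyone contacted in the first round.
rng = random.Random(0)  # fixed seed so the draw can be reproduced
editors = [f"Editor{i}" for i in range(1000)]  # placeholder usernames
first = draw_batch(editors, INITIAL_POOL_SIZE["all"], set(), rng)
extra = draw_batch(editors, 50, set(first), rng)
```

Excluding already contacted editors from later batches keeps anyone from receiving the recruitment message twice.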


Requests

  • Please post the consent form or link to it from your project page.
  • Please post an example recruitment message.

Question

This is a deeply interesting subject from the perspective of editor engagement. A few questions:

Thanks, Steven. To answer your questions:
  • Your proposal does not specify how recommendations of sources are delivered to editors. Is it via the tool itself? Talk page? Email?
    • The recommendations are delivered via the tool itself.
  • I would be interested to know whether this tool is FOSS and how likely it is that it could be integrated into MediaWiki as an extension or gadget.
    • The code will be FOSS. As it stands now, it is designed to run on a separately hosted webserver. I love the thought of somehow integrating it with MediaWiki. The code won't ever run standalone or "in browser," though; there are some significant database queries and computationally intense algorithms that need to be done to serve up the recommendations. The separate computing server can't be factored out, but perhaps MediaWiki could somehow be extended to integrate with it. I'll admit that I'm not familiar enough with MediaWiki development to say whether that sort of architecture is feasible. Could a MediaWiki extension or gadget integrate heavily with a central server doing the heavy lifting?

Thanks, Steven Walling (WMF) • talk 23:54, 27 June 2012 (UTC)

You're welcome. WorldsApart (talk) 01:28, 28 June 2012 (UTC)

Queries by WSC

Hi, nice project, but I have a few suggestions.

Thanks -- responses are below.
  • Some Wikipedians are dead, some accounts are indefinitely blocked, other accounts are long retired or are bot accounts or merely doppelganger accounts created to prevent others impersonating people. None of these would be appropriate to include in your samples.
    • Agreed. We'll make sure to leave these categories out of our samples.
  • Most accounts created on the English Wikipedia have zero edits. There are many reasons why an account may have zero edits on a project: for example, if I visit another language version of Wikipedia whilst I'm logged in, my account automatically registers me there. So some of our zero-edit accounts will be non-English-speaking Wikipedians who've just checked an English page to see what pictures are used. The zero-edit accounts are such a large group that if you do test them they will swamp the 1-9 edit group, so my suggestion would be to have zero edits as a group in its own right and ideally screen out those who have >0 edits on other Wikimedia projects.
    • Good point. We should just keep this simple and not include editors who have zero edits.
  • One key divide amongst editors is between those who add referenced content and those who don't. There are some editors with very high edit counts who only fix typos, revert vandalism or tag articles for deletion. If in your testing you can identify those who do and don't add referenced material and test them separately it would give some very interesting results. I'm particularly keen on testing whether this sort of approach could be used to divert some of our "template bombers" into article improvement.
    • This is a fabulous idea! When measuring the impact of our tool, we should assess how frequently each editor adds citations to articles, and how that may be changed by the use of our tool.
      • I would be fascinated to see the results of that, but a control group is essential here. I'm pretty sure that some of our editors learn to do citations whilst they are editors: some because they reach that stage at school and some because of experiences on this site. So what you need to measure is the increase compared to your control group. WereSpielChequers (talk) 08:56, 30 June 2012 (UTC)
  • Be aware that the act of measuring something can alter what you are measuring. This scheme involves contacting a group of Wikipedians. For currently active ones that might not have a huge effect, but for formerly active Wikipedians the effect can be significant: you only have to look at how the number of "active admins", which had been rapidly falling, almost stabilised after we introduced desysopping for 12 months' inactivity. The experience there has been that most admins who have not edited in 12 months will do some edits if prompted and threatened with a desysop. My suggestion would be to also monitor control groups that you don't contact or that you contact with a neutral query.
    • We completely agree: merely recruiting someone for the study can have an effect, as can just contacting them on their talk pages. The neutral query is a good idea, and we should do it. We are already planning to have three versions of our tool: one of them is the full tool as we believe it should work, and there are two other versions with reduced functionality. These other two will also act as a form of a control.
  • Can we have a little more detail as to what sort of sources you are going to be offering? You might want to discuss that aspect at en:Wikipedia:Reliable sources/Noticeboard.
    • This is entirely up to the participants. They will enter into the system RSS and/or Twitter feeds of sources of information that they think are interesting to them, and potentially worthwhile. We'll use those RSS entries or Twitter tweets to make recommendations to the editor as to how to contribute based on the content within. Our recommendations are not intended to indicate reliability: that decision still lies with the contributor and the community editing the article at hand.
  • The prize incentive is possibly superfluous and a little risky. Remember that Wikipedians are primarily altruists and have a well-honed distrust of anything that looks like spam on Wikipedia. If you are going to do it, don't mention the brand name in your posts to people's talk pages; "a book token worth fifty US dollars" would be a less promotional description. Keep the Amazon name on the FAQ page that you link to from your post: "The book token for fifty US dollars will be delivered as an Amazon voucher". Describing them as US dollars should spare you the usual queries from Australia etc.
    • We can easily remove the reference to Amazon, and will do so. The example project linked to from the subject recruitment page does this, so we possibly mistakenly interpreted it as a best practice.
      • I don't have a problem with it going on the page that you are directing people to. My concern is with the note that you put on people's talk pages. I'd have a similar issue with mentions of the institution you are part of: fine on the landing page, OK on an individual note to an editor, but not appropriate in a template being added to dozens, perhaps hundreds, of talk pages. WereSpielChequers (talk) 08:56, 30 June 2012 (UTC)
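Two of the suggestions in this thread lend themselves to a short sketch: screening out accounts that should not be sampled, and measuring the change in citation activity relative to a control group rather than in isolation. The code below is a minimal illustration under stated assumptions. The `Account` fields, the `eligible` helper and the `effect_vs_control` function are hypothetical names, and a real analysis would also need a significance test.

```python
from dataclasses import dataclass


@dataclass
class Account:
    # Hypothetical record; field names are assumptions for illustration.
    name: str
    edit_count: int
    is_bot: bool = False
    is_blocked: bool = False
    is_excluded: bool = False  # deceased, long retired, or doppelganger


def eligible(account):
    """Keep only accounts with edits that are not bots, blocked or excluded."""
    return (
        account.edit_count > 0
        and not account.is_bot
        and not account.is_blocked
        and not account.is_excluded
    )


def mean(values):
    return sum(values) / len(values)


def effect_vs_control(tool_before, tool_after, control_before, control_after):
    """Change in citations per editor for tool users minus the change in the
    control group over the same period (a difference in differences)."""
    tool_change = mean(tool_after) - mean(tool_before)
    control_change = mean(control_after) - mean(control_before)
    return tool_change - control_change


# Usage with made-up numbers (citations added per editor in each period).
accounts = [
    Account("A", 120),
    Account("B", 0),
    Account("C", 40, is_bot=True),
    Account("D", 7, is_blocked=True),
]
sample_frame = [a.name for a in accounts if eligible(a)]
effect = effect_vs_control([1, 2, 3], [3, 4, 5], [1, 1], [2, 2])
```

Subtracting the control group's change removes the improvement that editors would have shown anyway, for instance from learning to cite over time.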

Hope that helps WereSpielChequers (talk) 10:41, 28 June 2012 (UTC)

It does. Thank you! WorldsApart (talk) 11:48, 29 June 2012 (UTC)
You're welcome, I look forward to seeing the results of this. WereSpielChequers (talk) 08:56, 30 June 2012 (UTC)

Straw poll: Should this review be closed?

Hey guys. Since discussion has calmed down, I'd like to take a straw poll to see if we can close the review. Please post your support or concerns below. --EpochFail (talk) 15:17, 2 July 2012 (UTC)

  • Support: This experiment is essentially like the development of a new toolserver script, gadget, bot, etc. in that it represents a new tool for Wikipedians. This project is well documented, unlikely to be disruptive and likely to turn into a useful tool for Wikipedians. --EpochFail (talk) 15:17, 2 July 2012 (UTC)
  • Support per the responses to my queries in the above section. Also I have a good feel about this one. WereSpielChequers (talk) 16:15, 2 July 2012 (UTC)

Given the non-controversial nature of this project, I'm closing the review despite the small amount of participation in the straw poll. I advise User:WorldsApart to continue with his recruitment plans. --EpochFail (talk) 13:50, 9 July 2012 (UTC)

Please have a look at my edit here. I can see many people creating robots that use a very simple API to add information from other sources to Wikipedia without having to bother about parsing and carefully changing existing articles. Technically this could be achieved using templates (template:stockvalues/google containing just "40"), but that would be very cumbersome. And I think data from external sources should have a type, so you know that 40 means $40, at Nasdaq. Currently, as far as I know, no application is using Wikipedia to look up the boiling point of water, because the fact is not listed as a fact. Wolfram needs its own editors to create their own database to store facts like that. Having a "Facts" page next to "Article" pages would allow a huge amount of currently hidden data to be put easily on Wikipedia with many different APIs, to be browsed by humans with a simple interface, and to be used easily in other software (showing timelines, comparing boiling points, showing a button "other 18th century composers" in a documentary, graphs for stocks related to gold, etc). It would be great if Wikipedia were at the center of this mass data gathering, doing it all the Wikimedia way (i.e. simple pages, and anyone can edit). Joepnl (talk) 01:10, 25 July 2012 (UTC)
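To make the idea of a typed fact concrete, here is a minimal sketch of what a single entry might carry. The `Fact` class and its field names are hypothetical and do not correspond to any existing Wikipedia or MediaWiki API; the example values are taken from the stock example in the comment above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    # Hypothetical structure: a bare value such as 40 only becomes usable
    # once it is stored together with its unit and source.
    subject: str
    attribute: str
    value: float
    unit: str
    source: str


def describe(fact):
    """Render a fact as a human-readable line."""
    return (
        f"{fact.subject} {fact.attribute}: "
        f"{fact.value} {fact.unit} (source: {fact.source})"
    )


# Usage: the bare "40" from the template example, now with a type.
stock = Fact("Google", "stock value", 40.0, "USD", "Nasdaq")
line = describe(stock)
```

Storing the unit and source alongside the value is what would let other software compare or chart facts without parsing article text.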

We're not currently working on the Wikidata project, but we're certainly supportive of the concept. WorldsApart (talk) 14:21, 25 July 2012 (UTC)