Grants:IdeaLab/A citation a day

status: idea
project: A citation a day
idea creator:
project contact: sjgknight
participants:
summary: Use template tags in articles to a) provide source material for a literacy game built around finding citations, and b) contribute answers back to the encyclopaedia
created on: 09:27, 13 May 2014

Project idea

What is the problem you're trying to solve?

People are generally rather poor at finding high-quality sources and at evaluating the credibility of information. Literacy is a core barrier to accessing educational materials, including Wikipedia. Literacy here should be understood broadly, to include the abilities to extract claims from texts, to integrate multiple texts, and to evaluate the credibility of information and the quality of sources.

In addition, Wikipedia has many instances of the 'citation needed' template (and related templates such as 'fact'; see 'What is your solution?' below). In some cases these have been inappropriately applied (for whatever reason); in any case, the tags should ideally be 'resolved': addressed, removed, or removed in tandem with their target claims.

What is your solution?

Building off the idea at Grants:IdeaLab/Wikipedia_Quiz.

The idea is close to something like "A Google a Day", but would involve a more sophisticated quiz function (and would also contribute to Wikipedia). Whereas 'A Google a Day' is based on queries for claims with known answers, something interesting could be done with claims that lack evidence in Wikipedia ('citation needed' and the related templates). So the model would be:

  1. Extract claims with the 'citation needed' template (https://en.wikipedia.org/wiki/Template:Citation_needed#Template_data) used against them. I think this would need some Python extraction using punctuation: the templates are used inline after the claims they refer to, so extraction would involve taking the span from the template backwards to the previous full stop/period (a minimal sketch appears after this list). Python tools are available at https://pythonhosted.org/mediawiki-utilities/ and dumps at http://dumps.wikimedia.org/enwiki/20140402/. There is a subset of articles with this template (e.g. starting point: https://en.wikipedia.org/w/index.php?title=Special:WhatLinksHere/Template:Citation_needed&limit=500). These could go into topic categories.
  2. Quizzers (possibly having selected a topic) would be presented with a sentence and asked something like "Is this claim true or false?" or "Can you find evidence for this claim?".
  3. They might have three options (with sub-messages in the sublists below):
    1. What evidence can you find for this claim? (Enter as many sources below as you like.)
      1. How good a source is this?
      2. Can you corroborate the source?
    2. “I can’t get this one”
      1. I can’t find any information about this claim
        1. Open some of the options under option 3 ('This might not be true'), and perhaps ask how people have searched (and/or guide them)
      2. There is no claim made here!
      3. There are multiple claims made in this statement
        1. If multiple claims are made, can you rewrite the statement into separate claims, and find citations for each?
    3. “This might not be true”
      1. I found contradictory evidence for this claim
        1. Something about contradictory evidence, and how to weigh it up
      2. This claim is outdated
        1. How should the information be presented?
        2. What evidence can you find for this claim?
          1. How good a source is this?
          2. Can you corroborate the source?
      3. It might be more complicated than that (if the claim is more nuanced than the text suggests)
        1. The information should be expanded and broken into separate claims
          1. (See other sub-replies re: splitting suggestions and evidence for them)
        2. The information isn't as general as presented; further constraints should be added (e.g. it is only true for particular geographic areas, groups of people/things, etc.).
  4. Ideally there would be some pre-written training examples, diagnostic for each of these issues. They might be extracted from Wikidata/Reasonator, or just manually written. In addition:
  5. As each claim is 'answered', the responses would further 'seed' the 'training' examples. Contradictions between multiple responses to a question could be settled somehow, or the question removed from the pool (such cases could be harvested for Wikipedia community input).
  6. Ideally, there would be some mechanism for the information to go back into Wikipedia, either by the quizzer editing the claims (this is a good onboarding technique!), or by creating a list of 'answers' which could be used by other editors or applied automatically. This isn't fundamental to the basic idea here.
  7. Ideally users would get points for:
    1. Answers given
    2. Answers actually used in Wikipedia (for which a feedback mechanism, as in point 6, would be needed)
  8. There would be scope to extract already-referenced claims and use the structures in Wikidata (e.g. http://tools.wmflabs.org/reasonator/?&q=254: when did the Austrian composer born in 1756 die?) to set other very answerable tasks of the kind found in A Google a Day (see the Wikidata sketch after this list). This would help with 'training' examples.
  9. There are also lots of dead links, raw URLs used as references, and cases where a reference is given but doesn't actually support the claim; longer term, exploring that area would also be interesting.
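
A minimal sketch of the extraction in step 1, assuming raw wikitext as input. The regular expression, the alias list ('cn', 'fact'), and the scan-backwards-to-punctuation heuristic are illustrative assumptions rather than a tested pipeline:

    import re

    # Matches {{Citation needed ...}} and common aliases such as {{cn}} and {{fact}}.
    CN_TEMPLATE = re.compile(r"\{\{\s*(?:citation needed|cn|fact)\b[^{}]*\}\}",
                             re.IGNORECASE)

    def extract_claims(wikitext):
        """Return the sentence preceding each 'citation needed' template.

        Heuristic only: takes the span from the template backwards to the
        previous full stop (or other sentence-ending punctuation).
        """
        claims = []
        for match in CN_TEMPLATE.finditer(wikitext):
            head = wikitext[:match.start()].rstrip()
            # Drop the claim's own closing punctuation before scanning back.
            trimmed = head.rstrip(".!?")
            boundary = max(trimmed.rfind(ch) for ch in ".!?\n")
            claims.append(head[boundary + 1:].strip())
        return claims

    sample = ("The Moon is made of rock. It is widely believed to influence "
              "sleep patterns.{{Citation needed|date=May 2014}}")
    print(extract_claims(sample))
    # -> ['It is widely believed to influence sleep patterns.']

For a full run this would iterate over page text from the XML dumps linked above (e.g. via the dump-reading helpers in mediawiki-utilities) rather than over single strings.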
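
And a sketch of the Wikidata idea in point 8: a known fact can be pulled via the wbgetentities API and turned into an answerable, 'A Google a Day'-style training question. Q254 is the item for Wolfgang Amadeus Mozart and P570 is the 'date of death' property; the claim navigation below assumes a single, fully dated value and an English label:

    import requests

    WIKIDATA_API = "https://www.wikidata.org/w/api.php"

    def death_date_question(qid):
        """Build an answerable training question from a Wikidata item."""
        params = {"action": "wbgetentities", "ids": qid,
                  "props": "labels|claims", "languages": "en", "format": "json"}
        entity = requests.get(WIKIDATA_API, params=params).json()["entities"][qid]
        label = entity["labels"]["en"]["value"]
        # P570 = date of death; take the first claim's time value.
        died = entity["claims"]["P570"][0]["mainsnak"]["datavalue"]["value"]["time"]
        return "When did %s die?" % label, died

    print(death_date_question("Q254"))
    # -> ('When did Wolfgang Amadeus Mozart die?', '+1791-12-05T00:00:00Z')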

Steps

Very drafty:

  1. Extract/write some diagnostic material (for training; e.g. we want to be able to feed back a more or less 'correct/incorrect' verdict on these questions)
  2. Extract 'citation needed' paragraphs
  3. Use e.g. the Oppia platform to set up quizzes (a sophisticated, very flexible, open-source quizzing platform)
  4. 'State 1' branches depending on the type of input given (e.g. whether it is a diagnostic or an 'unknown'); loading a random parameter (or one within particular limits, e.g. category filtering) should work for this [1]
  5. Create 'states' for each possible response, with branching off those as above
  6. Save answers against the 'citation needed' paragraphs in a way which a) collates answers across all quizzers, and b) could be taken back into Wikipedia (parameters may solve this?); see the sketch after these steps
  7. Create a scoring system for each user (e.g. number of questions answered, plus bonuses earned based on whether the template is 'resolved' by the answer, or whether answers broadly agree with those of other quizzers)
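
A minimal sketch of how steps 6 and 7 might hang together, using a plain in-memory store. The claim identifiers, the majority rule, and the point values are all placeholder assumptions; a real system would persist this and feed it back to Wikipedia:

    from collections import defaultdict, Counter

    # Hypothetical store: responses keyed by a claim identifier
    # (e.g. article title plus template offset), collated across quizzers.
    answers = defaultdict(list)

    def record_answer(claim_id, quizzer, verdict, sources=()):
        """Save one quizzer's response against a 'citation needed' claim."""
        answers[claim_id].append({"quizzer": quizzer, "verdict": verdict,
                                  "sources": list(sources)})

    def majority_verdict(claim_id):
        """Most common verdict for a claim, and its share of all responses."""
        verdicts = Counter(a["verdict"] for a in answers[claim_id])
        top, count = verdicts.most_common(1)[0]
        return top, count / sum(verdicts.values())

    def score(quizzer):
        """One point per answer, plus a bonus for siding with a clear
        majority (a stand-in for 'answers broadly agree', step 7)."""
        total = 0
        for claim_id, responses in answers.items():
            top, share = majority_verdict(claim_id)
            for a in responses:
                if a["quizzer"] == quizzer:
                    total += 1
                    if a["verdict"] == top and share > 0.5:
                        total += 2
        return total

    record_answer("Moon#cn-1", "alice", "supported", ["doi:10.1000/example"])
    record_answer("Moon#cn-1", "bob", "supported")
    print(score("alice"))  # 3: one answer plus the agreement bonus

The collated answers per claim are exactly what step 6(b) would export back to Wikipedia, or into a list for other editors.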

Project goals

  1. Create a game to improve information literacy
  2. Improve Wikipedia quality (by addressing issues around citation)
  3. Use the sizeable open knowledge resource that is Wikipedia to generate a dynamic and interesting 'game'

Get involved

Welcome, brainstormers! Your feedback on this idea is welcome. Please click the "discussion" link at the top of the page to start the conversation and share your thoughts.

