Research:WikiGrok/Test1

From Meta, a Wikimedia project coordination wiki

Test 1: all users, alpha/beta site, en.wiki

Test begins: Tue, 18 Nov 2014 01:00:00 UTC
Test ends: Tue, 02 Dec 2014 01:00:00 UTC

Sampling

This first test targets logged-in and anonymous users of the English Wikipedia who have opted into the Beta or Alpha mobile site. At the start of the test, users are randomly assigned to one of two buckets via a token that persists across sessions (clearing the token resets the bucket assignment). The test runs for 2 weeks from the start of data collection.
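As an illustration of token-based bucketing, here is a minimal sketch. It assumes the bucket is derived by hashing the token; the page only says that assignment is random and tied to a persistent token, so the hashing scheme and the names `new_token` and `bucket_for` are hypothetical, not the actual WikiGrok implementation.

```python
import hashlib
import secrets

BUCKETS = ("A", "B")


def new_token() -> str:
    """Generate a fresh random token (hypothetical; stored client-side)."""
    return secrets.token_hex(16)


def bucket_for(token: str) -> str:
    """Map a token to a bucket deterministically.

    Assumption: hashing the token stands in for "random assignment that
    persists across sessions". The same token always yields the same bucket.
    """
    digest = hashlib.sha256(token.encode("utf-8")).digest()
    return BUCKETS[digest[0] % len(BUCKETS)]


if __name__ == "__main__":
    token = new_token()
    # Same token, same bucket, for as long as the token is kept.
    print(bucket_for(token) == bucket_for(token))
    # Clearing the token means drawing a new one, which may land in either bucket.
    print(bucket_for(new_token()) in BUCKETS)
```

Under this scheme a user who clears the token is treated as a new participant, which matches the reset behaviour described above.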

Treatments

Users in the pool of eligible participants see one of two versions of the WikiGrok widget when landing on articles where WikiGrok is activated. The start and end of the workflow are identical in the two conditions. The workflow consists of:

  1. a landing screen with a call to action that the user needs to accept in order to proceed to the next step
  2. a form with a WikiGrok question, the design of which depends on the experimental group the user is assigned to
  3. a confirmation screen, displayed after clicking on the submit button and successfully storing a response (including a "Not sure" or NULL response).

The WikiGrok question is the only element in the workflow that varies across conditions and it consists of:

  • A binary question for group A
  • A tagging task with multiple possible values for group B. The number of values displayed for each property is capped at 4.

Claim selection

Both WikiGrok versions are shown on biography articles where a question about occupation, nationality, or alma mater can be generated. The total number of eligible pages is between 600,000 and 1 million.

  • Occupation
    • Item eligibility: instance of human (Q5); not an instance of disambiguation page
    • Potential claims generation: Find any links in the article that correspond to a Wikidata item that is an instance of a profession (Q28640) or subclass thereof. Note this does not include items that are instances of an occupation (Q13516667), which is a subclass of profession. This is because the occupation property and occupation item do not have the same definition currently.
  • Nationality
    • Item eligibility: instance of human (Q5); has birth place claim; born after 1750
    • Potential claims generation: Take the item’s birthplace in Wikidata (which is usually already set). Get the tree of the ‘is in’ or ‘country’ properties of the birthplace. Find any that are instances of a country.
  • Alma mater
    • Item eligibility: instance of human (Q5)
    • Potential claims generation: Find any links in the article that correspond to a Wikidata item that is an instance of a university (Q3918).

For a list of possible questions and corresponding article selection criteria see this page.
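The nationality rule above amounts to walking upward from the birthplace through its ‘is in’ or ‘country’ links and collecting every item that is a country. The sketch below shows that traversal on a toy in-memory graph. The function name, the dictionary representation, and the place names are all hypothetical; a real implementation would query Wikidata instead.

```python
from collections import deque


def candidate_countries(birthplace, parents, countries):
    """Return countries reachable from `birthplace` via parent links.

    `parents` maps an item to the items it is located in (standing in for
    the 'is in' / 'country' properties). `countries` is the set of items
    that are instances of a country. Both are toy stand-ins for Wikidata.
    """
    seen = {birthplace}
    queue = deque([birthplace])
    found = []
    while queue:
        item = queue.popleft()
        if item in countries:
            found.append(item)
        for parent in parents.get(item, ()):
            if parent not in seen:  # guard against cycles and repeats
                seen.add(parent)
                queue.append(parent)
    return found


if __name__ == "__main__":
    # Entirely made-up example data.
    parents = {
        "Example Town": ["Example County"],
        "Example County": ["Example State"],
        "Example State": ["Exampleland"],
    }
    print(candidate_countries("Example Town", parents, {"Exampleland"}))
```

Each country found this way would become one potential nationality claim for the question.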

Limitations

  • We are not yet explicitly collecting data on how many possible values can be extracted from an article with WikiGrok enabled. In version A, if the widget can extract more than one value, it displays one value selected at random from those available. In version B, if more than 4 values can be extracted for a property, a random selection of 4 values is displayed. Note that without knowing the distribution of the number of candidate values per article and per property, we cannot rule out that the true values are displayed to users with very low probability.
  • In each condition, at least one possible value for at least one property needs to be available for the corresponding question to be displayed.
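The selection rules in the limitations above, and the resulting chance that any one candidate value is shown, can be sketched as follows. This assumes uniform random selection and that each candidate is equally likely to be picked; the function names are hypothetical and this is not the widget's actual code.

```python
import random

MAX_VALUES_B = 4  # cap on values displayed per property in version B


def values_to_display(candidates, group, rng=random):
    """Pick the values shown to a user for one property.

    Version A shows one randomly chosen value; version B shows up to
    MAX_VALUES_B randomly chosen values. With no candidates, the question
    is not displayed (empty list).
    """
    if not candidates:
        return []
    if group == "A":
        return [rng.choice(candidates)]
    return rng.sample(candidates, min(MAX_VALUES_B, len(candidates)))


def display_probability(n_candidates, group):
    """Probability that one specific candidate is shown (uniform sampling)."""
    if n_candidates <= 0:
        return 0.0
    shown = 1 if group == "A" else min(MAX_VALUES_B, n_candidates)
    return shown / n_candidates


if __name__ == "__main__":
    for n in (1, 4, 10, 40):
        print(n, display_probability(n, "A"), display_probability(n, "B"))
```

For an article with 10 candidate values, a given value would appear with probability 0.1 in version A and 0.4 in version B under these assumptions, which is why the unknown distribution of candidates per article matters.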

Data QA

  • Data quality issues for test 1 are tracked here.