
Global Reach/Nigeria Survey Documentation

From Meta, a Wikimedia project coordination wiki


Nigeria phone survey 2016

In the spring of 2016, the WMF partnered with Votomobile and conducted a phone survey to learn more about technology and Wikipedia use in Nigeria.

The 19 questions in the survey covered:

  • Internet use
  • Mobile phone use (smartphones & basic voice/SMS phones)
  • Awareness and use of Wikipedia
  • General demographics

This was a large-scale IVR phone survey, gathering over 2700 completed survey responses from randomly generated numbers across Nigeria. Voice (IVR) surveys were chosen to include respondents who may not have internet access. This approach allowed us to measure internet and smartphone penetration, as well as to answer other Wikipedia-related questions. The scale and methodology of the survey also kept the margin of error low (<2%) for questions asked of all respondents.

Questions this survey was designed to answer

  • What is the actual number of people who use the internet?
Real-world behavior makes this difficult to measure from industry reports, since people might have access to the internet through school, friends, internet cafés, public Wifi, etc.
  • What do people mostly use the internet for?
  • How many people use smartphones?
  • Do people with smartphones use the internet from just Wifi? Or just cellular service?
  • How many people thought they didn’t use the internet, but do use Facebook or WhatsApp?
  • How many people have heard of Wikipedia? What do they use it for? How often?
  • If they have heard of Wikipedia, but weren’t using it, why not?

Goal: Represent the population of Nigeria

To get the most representative data possible, we worked with Votomobile to conduct a phone IVR survey. The reach of a phone survey can encompass nearly the full spectrum of age, gender, geography, income and education levels. For Nigeria, the survey generated random phone numbers which were assigned to mobile phones.

For statistical validity, the sample of 2700 completed responses is large enough that results for questions asked of all respondents are accurate to within a 2% margin of error at a 95% confidence level.
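The 2% figure can be sanity-checked with the standard normal-approximation formula for the margin of error of a proportion. The sketch below is not taken from the survey documentation: it assumes simple random sampling, the worst-case proportion p = 0.5, and z = 1.96 for 95% confidence. The function name `margin_of_error` is illustrative only.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Half-width of the normal-approximation confidence interval
    for a proportion p estimated from n responses.

    Assumptions (not stated in the survey documentation):
    simple random sampling, p = 0.5 as the worst case, z = 1.96 for 95%.
    """
    return z * math.sqrt(p * (1 - p) / n)

# 2700 completed responses, as reported for this survey
moe = margin_of_error(2700)
print(f"Margin of error: {moe:.2%}")
```

Under these assumptions the result falls just below 2%. Questions asked of only a subset of respondents have a smaller n, so their margin of error is larger.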

The survey was recorded in four languages: Hausa, English, Pidgin and Yoruba.

Addressing Bias

One issue with phone surveys is the tendency for some respondents to favor the first response to a question. To address this problem, most of the survey questions presented the responses in a random order for each call. This distributes any bias evenly among the responses instead of accumulating it all on one response. Note that questions that have a 'none of these' or 'other' response always kept this option as the last one presented.

A couple of survey questions, however, have a strong order dependency of their responses and are confusing if they are presented in a completely random order. For instance, when we ask how often they use Wikipedia, asking in a non-sequential order would not make sense (e.g. an order of “once a week”, “once a month”, “once a day”). For these questions, we would randomly present the question in one of two orders: either from lowest to highest, or highest to lowest.
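The two ordering rules described above can be sketched as follows. This is an illustration of the idea, not the code used by the survey platform: the function `presentation_order`, its parameters, and the sample response texts are all hypothetical.

```python
import random

def presentation_order(options, ordinal=False,
                       pinned=("none of these", "other"), rng=random):
    """Return the order in which responses are read out for one call.

    - Responses matching `pinned` always stay at the end.
    - Ordinal questions keep their sequence but are read either
      lowest-to-highest or highest-to-lowest, chosen at random.
    - All other questions are fully shuffled.
    """
    pinned_lower = {p.lower() for p in pinned}
    tail = [o for o in options if o.lower() in pinned_lower]
    body = [o for o in options if o.lower() not in pinned_lower]
    if ordinal:
        if rng.random() < 0.5:
            body = body[::-1]
    else:
        rng.shuffle(body)  # body is a fresh list, so the input is untouched
    return body + tail

# Hypothetical response lists, for illustration only
frequency = ["once a day", "once a week", "once a month"]
reasons = ["too expensive", "not useful", "hard to read", "none of these"]

print(presentation_order(frequency, ordinal=True))
print(presentation_order(reasons))
```

Passing a seeded `random.Random` instance as `rng` makes the order reproducible, which is convenient when testing.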

Where to get the data

  • This page shows graphs of the responses received for each question in the survey.
Flow diagram of survey questions
  • The full data set can be found at:
Dan Foy (2016). Nigeria phone survey 2016. figshare. doi:10.6084/m9.figshare.4620595
This is the canonical version which contains a CSV including every answer from each of the 2700 responses.
  • The full text of the questions can be found here.

Using the data

  • The questions asking if the respondent uses Facebook or WhatsApp are only asked if they previously said that they do not use the internet. This is by design - we wanted to use this question to gauge how many people did not understand that Facebook was part of the internet. The responses to these two questions were not intended to measure the full use of Facebook or WhatsApp.
  • It’s important to note that this survey is not linear. Depending on how a question is answered, the flow of the rest of the survey may change. For example, if a respondent says they do not have a smartphone, we skip the smartphone-related questions. You can review the flow diagram to see how the survey progresses.
  • Within the CSV file, each row represents one survey taken, with each column containing the response to the associated question. In certain cases, some questions that should have been asked were not, and these entries are marked as 'Missing Gx'. The number after the G indicates the group of questions that were not asked for that particular respondent.
  • The original run of the survey had one problem: the logic for the Facebook, WhatsApp, and "what people used the internet for" questions was inverted. Therefore, this branch of responses is not valid data and is marked as 'Missing'.
  • A second, supplementary survey was later run to gather correct responses to the previously skipped group of questions. These results are in the second CSV file, named as containing 'original and additional data'.
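When tabulating a column, the 'Missing' and 'Missing Gx' entries described above need to be excluded so they are not counted as answers. The following is a minimal sketch of one way to do that. The column names and sample rows are invented for illustration and do not reflect the real CSV headers; the exact cell format of the markers (for example 'Missing G2') is also an assumption based on the description above.

```python
import csv
import io
import re
from collections import Counter

# Matches 'Missing' and 'Missing G<number>' (assumed marker format)
MISSING = re.compile(r"^Missing(\s+G\d+)?$")

def count_valid(rows, column):
    """Count responses in `column`, ignoring blanks and 'Missing' markers."""
    counts = Counter()
    for row in rows:
        value = (row.get(column) or "").strip()
        if value and not MISSING.match(value):
            counts[value] += 1
    return counts

# Invented stand-in for the real file; replace with open(<path to CSV>)
sample = io.StringIO(
    "uses_internet,uses_facebook\n"
    "Yes,Missing G2\n"
    "No,Yes\n"
    "No,Missing\n"
    "No,No\n"
)
rows = list(csv.DictReader(sample))
facebook = count_valid(rows, "uses_facebook")
print(facebook)
```

Percentages for a question should then be computed over the valid responses to that question, not over all 2700 rows, because the survey branches.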

External links