Research:Wikipedia Editor Survey 2012/Technical notes

From Meta, a Wikimedia project coordination wiki

Notes about the methodology of the 2012 editor survey and some technical issues that had to be solved.

Translation process


The questionnaire was translated into 16 languages, using the translation system on Meta-wiki. Since much of it was repeated from the preceding (December 2011) editor survey, we transferred existing translations from that survey via a bot, to save translators work and to maximize consistency (see also phab:T48645#502960).
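The bot's code is not shown here, so the following is only a minimal sketch of one way such a carry-over could work: a translation is reused whenever the English source text of a unit is unchanged between the two surveys. The function name, the dictionary layout and the matching rule (exact match after whitespace normalization) are all assumptions made for illustration, not a description of the bot actually used.

```python
def carry_over_translations(old_source, old_translations, new_source):
    """Reuse old translations for units whose English text is unchanged.

    old_source / new_source: {unit_id: English text} (hypothetical layout)
    old_translations: {old_unit_id: translated text} for one language
    Returns {new_unit_id: translated text} for the units that matched.
    """
    def normalize(text):
        # Collapse runs of whitespace so trivial spacing changes still match.
        return " ".join(text.split())

    # Index the existing translations by their normalized English source text.
    by_source_text = {}
    for unit_id, text in old_source.items():
        if unit_id in old_translations:
            by_source_text.setdefault(normalize(text), old_translations[unit_id])

    # Copy a translation over only when the source text is identical.
    carried = {}
    for unit_id, text in new_source.items():
        key = normalize(text)
        if key in by_source_text:
            carried[unit_id] = by_source_text[key]
    return carried


old_source = {1: "How old are you?", 2: "What is your gender?"}
old_german = {1: "Wie alt sind Sie?", 2: "Was ist Ihr Geschlecht?"}
new_source = {1: "What is your gender?", 2: "How  old are you?", 3: "Do you edit Commons?"}
carried = carry_over_translations(old_source, old_german, new_source)
```

In this sketch, new or reworded questions (unit 3 above) are left out of the result, so they would still go to human translators.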

To transfer completed translations into Qualtrics efficiently (previously done by manual copy and paste), we used a small Python script that exports a translated version of the questionnaire from Meta into an XML file, which can then be uploaded to Qualtrics. (The code has not been published yet, partly because it reuses some third-party code whose license status would need to be double-checked, but Tbayer is happy to provide it to others faced with the same task. It needs some initial setup to match translation unit numbers to Qualtrics' question labels, but after that it can save a huge amount of time and avoid many errors compared to manual copying and pasting. It has been used successfully for the 2014 Global South User Survey.)
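Since the script itself is unpublished, the following is only a rough sketch of the setup step described above: a hand-maintained table maps translation unit numbers to question labels, and the translated texts are written out as XML. The element and attribute names (`Translations`, `Item`, `label`, `lang`) and the labels `Q1`, `Q2` are placeholders invented for this example; they are not Qualtrics' actual import schema, which would have to be taken from an export of the survey in question.

```python
import xml.etree.ElementTree as ET


def build_translation_xml(units, unit_to_label, lang):
    """Serialize translated units as XML, keyed by question label.

    units: {unit_number: translated text} for one language
    unit_to_label: {unit_number: question label}, the one-time manual mapping
    lang: language code of the translation
    """
    # Hypothetical root element; the real upload format may differ.
    root = ET.Element("Translations", {"lang": lang})
    for unit_number in sorted(unit_to_label):
        text = units.get(unit_number)
        if text is None:
            # Untranslated unit: skip it rather than emit an empty string.
            continue
        item = ET.SubElement(root, "Item", {"label": unit_to_label[unit_number]})
        item.text = text  # ElementTree escapes &, < and > on output
    return ET.tostring(root, encoding="unicode")


xml_text = build_translation_xml(
    units={1: "Wie alt sind Sie?", 2: "Lesen & Bearbeiten"},
    unit_to_label={1: "Q1", 2: "Q2", 3: "Q3"},
    lang="de",
)
```

Building the file with an XML library rather than string concatenation means that special characters in translations are escaped automatically, which is one class of error that manual copying and pasting tends to introduce.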

Distribution: CentralNotice banners


The survey invitation was distributed to logged-in users via CentralNotice banners on Wikipedia and (this time also) on Commons. We were able to reuse most of the design and code for the banners of the preceding two editor surveys.

Participation via this process is voluntary, which can lead to participation bias (as examined in detail in a 2013 research paper by Hill and Shaw, who e.g. confirmed the longstanding suspicion that female Wikipedians are less likely to participate in such user surveys. While their correction method cannot be applied to the present survey due to a lack of comparison data, their results make it plausible that the percentage of women found in this survey is several percent lower than the real female ratio). Some users had also opted out of banners entirely via a user preference setting, and therefore did not see this survey invitation either.

By popular demand, we decided beforehand to offer a custom invitation link to editors who wanted to participate in the survey but had missed the banners. This arguably violates the purity of the chosen sampling method, but as anticipated, it had a negligible effect (if any) on the overall results: fewer than 10 people made use of the offer.

Log: [link]

Banner impressions: ...

JavaScript variables used: ...

Response rates: ...

Survey platform: Qualtrics



Data cleaning


The survey received 24606 raw responses. Repeating the data cleaning process from the December 2011 survey with some additional checks (cf. below) yielded 17577 responses used in the analysis of the WMF satisfaction part of the survey (which included Commons users and those who said that they had never edited Wikipedia) and 13838 for the part about users' experience editing Wikipedia. The latter part of the questionnaire was conditional on the user taking the survey on a Wikipedia project and not responding to the question "have you ever edited Wikipedia" with "no". Valid responses by users who did not complete the entire survey were included in the analysis.
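To put these counts in proportion, the short calculation below expresses each analysis set as a share of the raw responses. Only the three counts come from the text above; the variable and function names are illustrative.

```python
RAW_RESPONSES = 24606          # all responses received
SATISFACTION_RESPONSES = 17577 # kept for the WMF satisfaction part
EDITING_RESPONSES = 13838      # kept for the Wikipedia editing experience part


def share_of_raw(kept, total=RAW_RESPONSES):
    """Percentage of raw responses retained, rounded to one decimal place."""
    return round(100 * kept / total, 1)


print(share_of_raw(SATISFACTION_RESPONSES))  # 71.4
print(share_of_raw(EDITING_RESPONSES))       # 56.2
print(RAW_RESPONSES - SATISFACTION_RESPONSES)  # 7029 responses removed in cleaning
```

So roughly 71% of raw responses survived cleaning for the satisfaction part, and roughly 56% were used for the editing experience part.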