Talk:Survey of how money should be spent/Questions

From Meta, a Wikimedia project coordination wiki

List of language names (in which one do you contribute?)

On the other hand, the small list containing a few major language names should preferably be translated using MediaWiki's built-in support for {{#language:code-of-designated-language|code-of-translation-language}}, which should already support all of these major languages (and probably most languages into which MediaWiki has been localized, and for which there is therefore at least a Wikipedia edition). That MediaWiki parser function retrieves the translated language name itself (otherwise it displays the language name in English, or the untranslated native language name used by default, or failing that the code of the designated language).

Those language names are already translated along with the rest of the MediaWiki code; in fact they are first translated in the source code by importing CLDR data, and then patched locally for data that is missing or inconsistent in the CLDR database.

verdy_p (talk) 04:52, 28 February 2012 (UTC)[reply]

You are right. I didn't include the country list in the translation because it can be taken from CLDR; the same goes for the languages, so I'll remove them from translation. Jon Harald Søby (WMF) (talk) 19:23, 28 February 2012 (UTC)[reply]
Will the words "List of 20 languages" not be translated? --Arno Lagrange  22:58, 5 March 2012 (UTC)[reply]
fr : liste de 20 langues
eo : listo de 20 lingvoj
No, those labels do not need to be translated. Thanks! Cheers, Stephen LaPorte (WMF) (talk) 23:23, 5 March 2012 (UTC)[reply]

List of Wikimedia project names

Here too, this is a common list that should be translated from shared resources, not changed every time. Having to retranslate these names each time is a waste of translators' time. The names are present on the Wikimedia sites themselves: just read them and use their existing project name (as seen in the namespace name equivalent to "Project:") or on their logo banners (though the latter is more difficult for an automated tool to parse). — verdy_p (talk) 07:28, 28 February 2012 (UTC)[reply]

One of the tools we are about to implement is a "translation memory". When a text is identical, the existing translation will be suggested. Thanks, Gmeijssen (talk) 10:18, 28 February 2012 (UTC)[reply]

Côte d’Ivoire

The English source name of this country has the "ô" letter incorrectly encoded. verdy_p (talk) 04:16, 28 February 2012 (UTC)[reply]

I've fixed it in the source base page. This may not work as intended. verdy_p (talk) 07:09, 28 February 2012 (UTC)[reply]
Thanks! Country names will be taken from CLDR as well, so you won't have to translate the entire list. Jon Harald Søby (WMF) (talk) 19:23, 28 February 2012 (UTC)[reply]

List of country names (where do you live?)

But I see that this list of country names is not offered for translation, though it should be.

  • The actual list of translations may come either from the CLDR project (if available) or from the various translations of the w:ISO 3166-1 page on Wikipedia. Note that link targets may append an additional but unnecessary disambiguation suffix (distinguishing country/state/city names). Also, the formal long names used in the list displayed here may not appear in the translated Wikipedia pages, and the links to the target articles for each country may not always reverse an initial formal disambiguation prefix, which will still be needed for some countries such as "Korea" and "Congo".
  • Note also that the translated list should be sorted according to the target language and its collation rules. For this reason, it won't be enough to just translate the individual country names, and this could make the survey form more complicated to generate and parse, because the items will differ and appear in a different order. The items should therefore be identified by a code (most probably the ISO 3166-1 code), used as the source of the translation and in the form data submitted to the server with each reply.
  • If you don't want to fix the sort order according to the target-language rules, sort the items by their ISO 3166-1 code and display that country code at the beginning of each list item of the survey form, within an embedded inline element inserted before the translated country name, like:
    <li><label for="AF"><input id="AF" type="checkbox" /> <code style="unicode-bidi:embed">AF</code>{{int:colon}} {{int:country-name:AF}}</label></li>
    <li><label for="ZW"><input id="ZW" type="checkbox" /> <code style="unicode-bidi:embed">ZW</code>{{int:colon}} {{int:country-name:ZW}}</label></li>
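The fallback described above (sorting by the language-neutral ISO 3166-1 code and emitting one labelled checkbox per country) can be sketched in Python. The `country_names` mapping here is a hypothetical stand-in for the real CLDR data, and a plain ":" stands in for the {{int:colon}} message:

```python
# Hypothetical stand-in for CLDR country-name data; the codes are real
# ISO 3166-1 codes, but the mapping itself is illustrative only.
country_names = {"ZW": "Zimbabwe", "AF": "Afghanistan", "CI": "Côte d’Ivoire"}

# Sorting by ISO code is language-neutral, so no locale collation is needed.
items = [
    f'<li><label for="{code}"><input id="{code}" type="checkbox" /> '
    f'<code style="unicode-bidi:embed">{code}</code>: {country_names[code]}</label></li>'
    for code in sorted(country_names)
]
```

Sorting by the displayed code keeps the item order identical in every translation of the form, which also makes the submitted form data easier to parse.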

verdy_p (talk) 07:09, 28 February 2012 (UTC)[reply]

Yes, the country names should be taken from CLDR too. About sorting, I'm not sure how it should be done technically (not sure if the survey software supports that, though it should), but the alternative of sorting by (and showing) the ISO 3166-1 code is a good suggestion. Thanks. Jon Harald Søby (WMF) (talk) 19:25, 28 February 2012 (UTC)[reply]

The 3 lists of activities / resources / actions

For the last 3 questions, please keep just the first list of actions and drop the other two lists, which should be identical. Differing translations across those lists won't help make the survey coherent! — verdy_p (talk) 07:10, 28 February 2012 (UTC)[reply]

Thanks, that was an oversight. (The other) Philippe fixed it. Jon Harald Søby (WMF) (talk) 19:29, 28 February 2012 (UTC)[reply]

"chapter committee"

What does "chapter committee" at Questions/47 refer to? Chapcom? Or something else? Thanks! --Aphaia (talk) 18:57, 28 February 2012 (UTC)[reply]

I would assume so (I don't know of any other chapter committees...), but I'll ask Philippe to confirm. Jon Harald Søby (WMF) (talk) 19:28, 28 February 2012 (UTC)[reply]
Yes, that's Chapcom.  :) Philippe (WMF) (talk) 20:02, 28 February 2012 (UTC)[reply]
Thanks, and now another question has popped up: do "OTRS permission channels" refer to the permissions queues or something else? Thanks in advance! --Aphaia (talk) 20:48, 2 March 2012 (UTC)[reply]
That's the permissions queues :) --Philippe (talk) 14:18, 5 March 2012 (UTC)[reply]


Nitpick: I think you mean "databases of journal articles" not "databases of journals" :) (or, you might mean "databases of article citations" but that seems a little obscure). your friendly neighborhood librarian, -- phoebe | talk 03:43, 29 February 2012 (UTC)[reply]

You are probably right, so I changed it. :-) Jon Harald Søby (WMF) (talk) 13:27, 1 March 2012 (UTC)[reply]


  1. The survey doesn't work: when ranking the different uses of money I get "Sorry, you cannot continue until you correct the following: Issue 1 Please rank between 0 and 10.", even though I ranked each option between 0 and 10.
  2. This message is in English even though I was using the Esperanto version (and half in English when using the French version).
  3. The language and country lists are still in English.

--Arno Lagrange  22:06, 8 March 2012 (UTC)[reply]

Ideas for FDC use

I guess I've seen so many bad online surveys that I didn't realize there could be good ones. I looked over the April 2011 survey methodology and was quite impressed. It did seem to be going for big numbers of respondents, which has negatives as well as positives.

Quick negative: say you can get an 80% response rate from group A, but only 40% from group B, perhaps for language reasons, or because people in a certain area just don't trust opinion surveys (e.g. Chicago or Russia). Group B will have half the representation it should have.

One way to counter this is stratified sampling, which selects the proper sample size from each group (by limiting the overall sample). In effect, it just puts predetermined weights on the different groups. If you want to keep the big sample sizes, I'd think just applying the weights directly to the results from each group would be fine. How do we pick the weights? For an editor survey, use the number of active editors in the last month; for a reader survey, the number of unique visitors in the last month. In any case, that part shouldn't be controversial.
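The weighting described above can be sketched numerically. The group sizes, response rates and scores below are invented purely for illustration, not taken from any real survey:

```python
# Hypothetical groups: equal real populations, unequal response rates.
groups = {
    "A": {"population": 10000, "respondents": 8000, "mean_score": 6.0},  # 80% response
    "B": {"population": 10000, "respondents": 4000, "mean_score": 4.0},  # 40% response
}

# Naive pooling over-weights group A, simply because it answered more often.
total_resp = sum(g["respondents"] for g in groups.values())
naive = sum(g["mean_score"] * g["respondents"] for g in groups.values()) / total_resp

# Weighting each group's result by its population share restores equal representation.
total_pop = sum(g["population"] for g in groups.values())
weighted = sum(g["mean_score"] * g["population"] for g in groups.values()) / total_pop
# naive ≈ 5.33, weighted = 5.0: the naive figure drifts toward group A's answer.
```

With equal populations the weighted answer is just the midpoint of the two group scores, while the naive pooled answer is pulled toward the group with the higher response rate.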

Basic format - keep it very, very short so that there are few incomplete responses. Don't ask questions that you're not ready to act on. The "assign an amount that you'd spend on each of the following" question (in the 2011 survey) looks very good. Have, say, 10 possible uses derived directly from the Strategic Plan (but a bit more specific), e.g.

  • Programs designed to retain all types of editors
  • Programs to retain women editors
  • Programs to recruit all types of new editors
  • Programs to recruit new editors in the Global South
  • Programs to help Chapters to grow
  • Programs to help new chapters to grow.

Now it's possible that the survey results could totally reject both "retain all types of editors" and "retain women editors", but it's extremely unlikely. If I remember correctly, "retain women editors" is part of the Strategic Plan, so we are committed to doing something along this line. If somehow "retain women" is totally rejected but "retain all types" is not, at least there is a fallback where we could have some programs designed to retain women.

It will be important to say very briefly upfront that the results will be very important, but will not determine the entire budget. The reasons are probably well known here. Editors don't know what potential projects have been proposed; there are mandates in the Strategic Plan, etc.

That said, we really would have to follow the results fairly closely. If Program A gets 10% of the budget in the survey, it shouldn't upset anybody if the FDC chooses 5% or 15%, but editors would be justifiably angry if the FDC chose 35%. Having the survey will allow EEs to propose projects that fit into the survey results. It will allow us to judge which projects are not supported by the community and which are most supported, and roughly order the other projects in between. Promising or even implying anything more than that would be self-defeating.

More tomorrow. Smallbones (talk) 19:29, 29 May 2012 (UTC)[reply]

For the main question "assign an amount that you'd spend on each of the following (totaling $100)" I'll put some possibilities below. It's important that we cover almost everything that we expect to spend money on, but some of the ones at the bottom might be just too small.
  • Community work aimed at supporting a healthy editing culture
  • Community work aimed at attracting/supporting new editors globally
  • Community work aimed at attracting/supporting new editors specifically in the Global South
  • Community work aimed at attracting/supporting new women editors
  • Grantmaking to chapters, individuals and similar organizations for their own priority projects
  • Support for chapters’ infrastructure, including meet-ups and public outreach
  • Support for developing partnerships with cultural institutions
  • Support for developing partnerships with universities and educational institutions
  • Support for the academic community that researches Wikipedia and its community
  • Support for reviews of article quality by readers and subject-matter experts
  • Support for advancing legal conditions that enable unimpeded access to information online, worldwide
  • Investment in offline products to broaden the movement’s reach to populations who will remain disconnected from the Internet

Is it possible to get software that will force the total to $100, so that respondents don't have to spend time calculating the total? Of course, they'd have to review the new numbers before submitting.
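One simple way survey software could enforce the $100 total is to rescale whatever the respondent entered. This is a minimal sketch of that idea, not the actual behaviour of any particular survey tool:

```python
def rescale_to_total(amounts, total=100.0):
    """Proportionally rescale a list of allocations so they sum to `total`."""
    s = sum(amounts)
    if s == 0:
        return list(amounts)  # nothing entered yet; leave the form as-is
    return [a * total / s for a in amounts]

# A respondent enters allocations summing to $120; rescaling preserves the
# proportions while forcing the $100 total.
adjusted = rescale_to_total([30, 30, 60])  # [25.0, 25.0, 50.0]
```

As suggested above, respondents would still review the adjusted numbers before submitting, since the rescaling changes every entry, not just the last one.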

On the question of sample size: is it more expensive to run the survey for 5,000-10,000 respondents (like the 2011 survey) or for just 500-1,000 with stratified sampling? "Expense" might also include the time and hassle for editors of filling out multiple surveys over time. With a sample size of 1,000, the margin of error would be about plus or minus 3%; going up to 10,000 respondents reduces the margin of error to about 1%. Do we really need that extra degree of "accuracy"? The tradeoff is that with a smaller sample we'd get something that looks more like a random sample, whereas the larger sample will almost inevitably be biased, probably toward the more active language projects.
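The ±3% and ±1% figures above follow from the standard worst-case margin-of-error formula for a 95% confidence interval on a proportion. A quick sanity check (this assumes a simple random sample, which, as noted, a self-selected banner sample is not):

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Worst-case (p = 0.5) 95% margin of error for a simple random sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)

margin_of_error(1000)    # ≈ 0.031, i.e. about ±3%
margin_of_error(10_000)  # ≈ 0.0098, i.e. about ±1%
```

Since the margin shrinks only with the square root of n, the tenfold jump in sample size buys roughly a threefold reduction in the margin of error.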

Note that I haven't included anything on the above list about technology spending. Presumably that is part of the core budget that the board will decide; learning about this won't help the FDC. It might seem more friendly to respondents if we put something in, though, maybe: "The Board of the WMF is committed to spending as much as is necessary to ensure that our servers work quickly and have minimal downtime. Similarly, all reasonable development of new technology will be supported. The WMF currently spends xx% of its funds on technology." Would you:

  • spend a smaller percentage on technology?
  • spend a larger percentage on technology?
  • leave technology spending at about the current level?

Those are really the only questions that the FDC should care about. Perhaps also some basic questions on whether respondents donated, and why or why not. There is always a temptation to add more and more questions, but the fewer questions asked, the more likely respondents are to complete the survey, making it more like a random sample.

The only additional questions I'd ask are a few basics on age, educational level and country of residence, just so we can check whether this sample is similar to past samples.

Smallbones (talk) 04:44, 31 May 2012 (UTC)[reply]

Thanks for all of the good thinking on this question. The approach you discuss above is very consistent with the approach used in the editor surveys that WMF has now conducted twice, in April and November 2011. Through these surveys we have the ability to get a strong stratified sample of the global editor community at limited cost, as we intend to continue running them twice a year. The survey is really valuable, as we run it in over 20 languages and the sample sizes are over 5,000. However, we plan to reduce their scope to a narrower set of questions that are easy to work with. The prior surveys did ask a question similar to the one you are designing, and there was a debate about the question design between the first and second versions. Here is the language of the question from November (and the full survey):
Q30. We are interested in your opinion on how the Wikimedia Foundation should spend money. If you donated 100 dollars to the Foundation, how would you like the foundation to allocate money for the following? (Please ensure that all the responses add up to $100.)
   a. Technical operations (more operations staff, new caching servers, performance metrics, uptime)
   b. Technical features development for EXPERIENCED editors
   c. Technical features development for NEW editors
   d. Technical features development for READERS
   e. Community work aimed at attracting/supporting editors globally
   f. Community work aimed at attracting/supporting editors in Global South
   g. Community work aimed at attracting/supporting editors in my country
   h. Grantmaking to Wikimedians or groups like other non-profits
   i. Support for other sister projects, not just Wikipedia
   j. Other, please specify: ______ 
I'll ask Ayush Khanna to post some information and links on this discussion and will also have him provide input on the recommendations and thinking above.--Barry Newstead (WMF) (talk) 17:10, 1 June 2012 (UTC)[reply]
Here is the link to that discussion about the rewording of the question for the second edition of the survey: Fundraising and Funds Dissemination/Survey Question. (As far as I know, it was only promoted on the Internal-l mailing list, and correspondingly there doesn't seem to have been much input from the non-chapter/non-WMF editing community, but it is safe to say that chapters representatives were made aware of it and contributed to the discussion.)
Also relevant are Sue's comments at User:Sue Gardner/scratchpad/Movement Fund-raising and Fund-disseminating#What do editors want to spend money on?.
Regards, Tbayer (WMF) (talk) 17:50, 1 June 2012 (UTC)[reply]
I'll try and address your concerns one-by-one:
a. Re: stratified sampling, we tried something of the sort for previous editor and reader surveys. We set targets for regions and/or language Wikipedias, and we ran the survey banner for separate durations on different Wikipedias. Some lessons for us were:
  • Oversample to account for poor-quality responses.
  • Have a "priority list" of groups: for some of the relatively smaller groups, we did not reach a statistically significant sample despite running the survey for over two weeks.
b. To your question about forcing the total to $100, if we use Qualtrics (the tool used for the December 2011 Editor Survey) this is trivial. Other survey tools offer similar constructs too.
c. Re: the technology spending question: I would also randomize display of the three possible answers.
d. Re: the demographics questions, I would recommend using D9 through D15 from the December survey questionnaire. Akhanna (talk) 20:12, 1 June 2012 (UTC)[reply]

It looks like I really haven't thought of anything new, except perhaps in separating off the technology spending question. I think this is important because a) the FDC has no control or influence over core spending, and b) the tech spending choices might overwhelm the attention of respondents; e.g. after they've "spent" $45, much of the rest will be viewed as "small change" and they just won't be careful in allocating the last few dollars. Having more than 4 or 5 non-tech categories to "spend" on is also important. I'm sure the FDC will be looking at more than 5 categories of spending.

As far as the priority lists go, I'd go down the list of language versions by number of active editors. Probably the 10 most active versions would have 90%+ of the active editors (this is a pure guess). The other versions could be put into one super-category, "other": not all versions would be sampled, but include 5 or so to get the 50 or 100 responses you need. If after 3 days it looks like you can't get enough responses, send a pick-your-favorite-language version of the survey (out of the 15 or so languages already offered) to 5 or 10 more language versions. Oversampling is interesting; I'm not sure I'd know how to explain it to Your Average Wikipedian, and in fact I'm not sure I understand all the implications myself. One thing that might appear odd to YAW is that responses from active-version editors (less oversampling) would probably "count less" than responses from lower-activity versions (more oversampling). If so, I'd like to have a very convincing non-statistical argument ready in favor of doing this.

Finally, I think I have to apologize - I've come in here saying "I have some great new ideas" and see that they've already been discussed in detail! Smallbones (talk) 00:24, 2 June 2012 (UTC)[reply]

Smallbones, no need to apologize. The thinking is valuable as it is always good to have someone take a fresh look (reaffirming the direction is valuable) and we'll be evolving the questions a bit and will benefit from the way you've articulated the needs of the FDC. Thanks! --Barry Newstead (WMF) (talk) 21:40, 4 June 2012 (UTC)[reply]