Grants talk:IdeaLab/Characterization of Editors on Wikipedia

From Meta, a Wikimedia project coordination wiki

Concerns about this Idea

I understand that this IdeaLab is meant to improve Wikipedia, but I'm not sure how it would. Additionally, there are some issues with respect to the clarity of the Idea. I'll address the concerns I have with the project first, then proceed to the matter of its need for some elucidation. Lastly, I'll address the goals of this Idea. Although I disagree with this entire campaign as a whole, I always support the drive for more research. If this research could elucidate anything, that would be great. I'm just worried that this Idea might not, at least not in its current iteration. Having said that, don't let my criticisms below deter you. My hope is that the following helps to refine this Idea, not to convince you that it's futile.

  • First, thank you for taking the time to leave such detailed feedback about my idea. I'm a strong advocate for academic discussion about research topics during the planning stages and well before the implementation of any proposed project. I'm working to further develop this project and am incredibly thankful for those opportunities via the IdeaLab. That being said, I'd like to address some of your concerns directly in the text (please see below). Cshanesimpson (talk) 13:27, 23 March 2015 (UTC)
    Thank YOU for reading all this! I appreciate, moreover, your willingness to correct me and explain where I was mistaken, since this has certainly helped me (and perhaps others) with understanding this Idea. I hope my criticisms haven't been too harsh, and have perhaps even been helpful to this Idea. As a preliminary, if I did not respond to one of your replies, it is because I either agree and/or have nothing further to add. I made sure to read every single one, though, and responded where I thought it was needed. –Nøkkenbuer (talkcontribs) 17:03, 23 March 2015 (UTC)

The Problems with this Idea

In my opinion, there are numerous problems with this Idea, some of which are structural and some of which may even be fatal. These problems need to be addressed before I would even consider endorsing this Idea, and should be addressed regardless, to ensure that the Idea is relevant to the Inspire campaign as well as beneficial to its goals.

The Idea itself

Firstly, how would this Idea meaningfully impact the campaign and its goals? More research is great, but what would be the implications of this research? For example, how would the information about editor demographics address or impact the matter of Wikipedia's gender gap? How would it inspire more females to become Wikipedians? Concerns regarding this initiative aside, I'm not sure how the accumulation of all this data could help address the "issue" posed in the IdeaLab. I'm all for more research, and I could see how more accurate and comprehensive research could be beneficial to Wikipedia overall, but I don't see how this Idea would help address the gender gap so much as refine the data supporting it. Perhaps I'm missing something, but your proposal doesn't elucidate this either.

  • Although we have some research on the gender gap, I don't believe that we have enough detail to fully comprehend this gap and its potential impact on not only the online Wikipedia community, but other online, knowledge-building communities. Women are on Wikipedia and we know that they are actively contributing, even if to a lesser extent. However, we don't have strong data that captures the demographics of the "super-editing" group and what it means to be a "super-editor." A potential flaw in many studies exploring online data is that frequency rates can be measured in a variety of ways via surveys. Consequently, online research can find differing results based on inconsistencies in measurement across studies. I'd like to build off of the original Gender Gap study to include explicit and detailed items that assess frequency based on both the volume and the timespan of edits. Overall, if we can't accurately conceptualize a potential problem on Wikipedia, I don't believe that we can seek out interventions to address it. I'd be strongly interested in exploring (in more depth) the barriers that women may encounter online and whether these barriers might exist for other demographic groups. If we were able to identify themes based on demographic information, we would be in a significantly stronger position to address the problem through more comprehensive interventions. Cshanesimpson (talk) 13:27, 23 March 2015 (UTC)
    Thanks for your explanation. If I'm understanding you correctly, you believe this data is meaningful because it may help Wikipedia better understand the extent to which women are contributing to Wikipedia, and not just the fact that they're contributing. For example, if the analysis showed that most of the female contributors are largely inactive or only sporadically active, this could be important because it could show that there is not only a gender gap with respect to contributors on Wikipedia, but also a gender gap with respect to contributor activity. Likewise, if the data indicates that among the minority of female contributors, the majority are so-called "super-editors", this could have major implications for the meaning of the gender gap. If this is what you mean, then I can see how this data could be meaningful: it helps define the userbase more comprehensively, including the female demographic, which in turn could benefit the Inspire campaign by providing a more accurate and comprehensive understanding of the activities of the female demographic (and how it relates to other demographics, especially males). –Nøkkenbuer (talkcontribs) 17:03, 23 March 2015 (UTC)

Secondly, why is this data meaningful outside of statistical purposes? Maybe I'm mistaking the point of this Idea, but it appears that this proposal intends to utilize the data to predict user behaviors, which—seeing as the data being collected is, in my opinion, insufficient to adequately predict the behavior of the editors—seems an awful lot like profiling. How exactly is this Idea different from one calling for the profiling of the editors? On a related note, but not to be confused with the process of profiling, couldn't much of the same information be acquired by the addition of user profiles? Since this is a matter of personal data provided by the user, I don't see why most of this information could not be gathered through simple data collection from user profiles once implemented (assuming user profiles had to be filled out by the user).

  • I often avoid using the term "predict" because of these connotations, but the data drawn from my proposed project would (hopefully) capture some of the demographics that we see in varied types of editors. If we can identify common traits across the super-editor group, we could better cultivate or build upon these inherent traits to develop tools for individuals to increase their editing volume based on needs. For example, if we know that active editors on Wikipedia feel a strong desire to "give back" to the community, we could leverage similar traits in offline communities to further develop our online community. In addition, if individuals want to "give back," but are in the inactive-editor group, we could develop tools that might provide these editors with avenues for effective contribution on Wikipedia. I see this as a potential area of growth for Wikipedia in general and an opportunity to expand our training materials or methods of engaging the inactive-editor group. Cshanesimpson (talk) 13:27, 23 March 2015 (UTC)
    I can understand and agree with that. I assume that this information and these implementations would benefit the Inspire campaign by providing potential solutions for all users, including female users. For example, suppose the data indicates that female users in particular have a strong desire to "give back" to the community, but are in that inactive-editor group; and, moreover, there are common patterns or frequently cited problems in the open-ended responses, such as many females avoiding "giving back" because they find the community to be hostile in the areas wherein they wish to "give back". One could then conclude that perhaps the Inspire campaign should focus more on resolving the issue of community hostility in those regions as a means of improving the chances of increased contributory activity (in the form of "giving back") and membership within the female demographic. –Nøkkenbuer (talkcontribs) 17:03, 23 March 2015 (UTC)
  • I also think that the profile data that individuals provide on Wikipedia may not accurately capture who they are or why they edit. This appears to be particularly true for the inactive and active-editor groups. Cshanesimpson (talk) 13:27, 23 March 2015 (UTC)
    That's understandable, I suppose. I assume that a comprehensive study would be preferred, since there is more control over the data being provided and a clearer understanding of what is needed (not to mention the fact that the honesty of a user profile is dubious at best in many circumstances). –Nøkkenbuer (talkcontribs) 17:03, 23 March 2015 (UTC)

Thirdly, and this is an issue inherent in most polling or data collection processes involving humans, how would you ensure that the data is accurate? People could easily lie or provide incorrect information when polled, which then skews the data. Unless measures are taken to ascertain the real identity of the person—which requires forced deanonymization, something I doubt many would support, and which I strongly oppose—I don't see how you could safely report the data collected as accurate. Then again, I suppose similar questions and criticisms could be raised about the data already acquired by Wikipedia. Since this Idea seems to replace that old data, however, it would be important to ensure the veracity of the data collected; yet doing so requires some form of deanonymization, which I doubt Wikipedia or, more importantly, Wikipedians would support. Simply assuming that the answers provided are honest, as determined by some code of conduct, is convenient and all, but it isn't very credible where data is concerned. Approximations are understandable, but when the very integrity of the data is itself in question, one wonders whether any of the data collected is actually meaningful, or even worthwhile.

  • This is probably where I agree with your response the most strongly. Survey data, as with any other data collection technique, is limited in the types of inferences we can draw from it. Scientists have also argued that interview data is flawed due to its limited external validity (generalizability) as a consequence of smaller sample sizes. We often hope that participants are truthful on surveys, but understand that not everyone will provide accurate information. That being said, I believe that members of the Wikipedia community would only take the survey if they wanted to give out their demographic information, and I can't imagine what they would gain from providing us with false information. This study would require rather large sample sizes to compensate for these potential flaws in the data and to account for some degree of variation from the norm. Cshanesimpson (talk) 13:27, 23 March 2015 (UTC)
    My only suggestion in this regard, then, would be to perhaps consider safeguards or monitoring of the data to watch for inexplicable or odd spikes. There is always the chance that this study could be the target of concerted trolling and disinformation attempts by certain users or groups seeking to push a given agenda or sow chaos. This is especially true if this study will occur on a massive scale and would be popular enough that anyone who uses Wikipedia has a significant chance of catching wind of it. Too many times have online studies or polls been completely dismantled because of the dedication of a particular group or site. –Nøkkenbuer (talkcontribs) 17:03, 23 March 2015 (UTC)

Elucidation needed on the Idea's solution

From what I've gathered, your proposed solution is a more comprehensive research study of Wikipedians and editors, with data on certain traits of said individuals collected, presumably with the goal of predicting the behavior of these persons and determining whether any solutions could be found by interpreting this data. (If I am wrong, please correct me, since this is fundamental to my criticisms.) My issue is that the data you propose to collect is either irrelevant to the behaviors of the editors; inadequate to predict their behaviors and/or point to a potential solution; or too vague to measure in its current iteration. I'll address these issues in order of the items you've listed:

  1. Frequency at which the editor actively edits on Wikipedia – How would frequency be measured? As an average of how many posts a person makes in a given amount of time? Would this include talk page or discussion contributions/edits, or only those pertaining to Wikipedia articles? Would this frequency be with respect to edits on Wikipedia alone, or edits and contributions on all Wikimedia sites and services? What is the definition of "frequency" in this context? This last question is particularly concerning, since not all users (perhaps no users) edit or contribute at predictable intervals. Many edits and contributions are, I suspect, sporadic and occur at the whims of the user. For example, a typo correction would occur whenever one is spotted, and does not typically occur because a user searches for one at a regular point in time. Likewise with responding to other users and whatnot. Now, the activity of a user could be roughly measured, which may be what you mean, but this is different from the "[f]requency at which the editor actively edits on Wikipedia". Moreover, the inclusion of the term "actively" implies that this is only to measure "active" users ("active" being defined by criteria unspecified at this time), and not inactive or semi-active ones. Overall, this item is vague and ambiguous, which is certainly not something someone wants in data collection or analysis.
    • I'd be interested in differentiating between Talk Page engagement and Contributions/Edits on articles. In doing so, we could draw some conclusions about how editors are engaging with their fellow Wikipedians in comparison with edits directly on articles. I'd planned to ask about frequency based on the number of edits (i.e., Talk Page or Edits) in the past month or week. A follow-up question would then ask the participant whether they considered this an accurate description of their editing on Wikipedia. I'm a supporter of open-ended questions that allow participants to expand upon or contrast with the information they provide in the closed-ended questions. Cshanesimpson (talk) 13:27, 23 March 2015 (UTC)
    Perhaps a combination of the two, wherein there are closed-ended questions with limited choices accompanied by an open-ended text box in which the user could elaborate and clarify, would be ideal. This would provide measurable data in the form of specific, closed-ended answers while also allowing further elaboration, which could then be interpreted alongside or in relation to this data. –Nøkkenbuer (talkcontribs) 17:03, 23 March 2015 (UTC)
  2. Gender – This is perhaps the only item listed which actually pertains to the campaign and may be important to determining the appropriate course of action. However, as I've already argued elsewhere, I don't see how gender would meaningfully alter an editor's habits. I understand the whole matter of how men and women are raised differently, are culturally and socially treated differently, and how they even think differently; but I don't see why it should matter whether the editor is male or female, since what matters on Wikipedia is the contribution itself (and not the one who made it). I address this issue and others in my criticism of this entire campaign, though, so there's no point in rehashing it here. One other matter of consideration is what you mean by "gender". Even the Inspire campaign seems to be misusing the term, since gender and sex are two distinct components of one's sex-gender identity. What exactly is meant when you say "gender"?
    • I would argue against your statement that gender doesn't alter how an individual edits. I think that many of the offline social norms of actively contributing to a knowledgeable community may be replicated online. As you mentioned, this might be a larger discussion for the entire campaign, but I know many Wikipedians who choose not to "out" themselves on Wikipedia as male or female because of their prior experiences editing male-dominated or female-dominated pages (when they had self-identified). Cshanesimpson (talk) 13:27, 23 March 2015 (UTC)
    Perhaps you're right. I suppose my concern is about the lack of clarity on how one's gender or sex meaningfully impacts one's editing habits. There doesn't seem to be enough research into this, and there is a lack of clarity about the difference between two relationships: that between an editor's sex or gender and their edits, and that between the former and their editing habits. For example, I mislabeled my point as being about the editor's habits, when it was more about how one's gender impacts their edits. That was a blunder on my part, and it's erroneous to think that one's sex or gender does not impact one's editing habits, since the same concerns one might have in one's society or culture may also apply to their perception of Wikipedia, which in turn affects how they interact with it. For example, if a female editor is not very outspoken about her views, this could be due to the stigma surrounding outspokenness in women in her society and culture. Even though the opposite is true on Wikipedia (at least, in theory), some women may refrain from "being bold" due to their cultural or social background and not because of Wikipedia. Then again, I'm not sure how Wikipedia could change this. –Nøkkenbuer (talkcontribs) 17:03, 23 March 2015 (UTC)
  3. Age – What relationship does this have to the Inspire campaign? Moreover, what importance does age have on Wikipedia? I'm already aware of how thinking can change as one grows older, especially between adolescence and young adulthood (particularly with respect to the prefrontal cortex) and between adulthood and seniority, but how does this meaningfully impact the matter of only 20% of editors being female?
    • We don't know whether there are age trends in our editing groups. This measure is important simply because if we don't know who is editing, how can we seek to intervene and "enhance" our Wikipedia community? It's difficult to design tools/interventions for a mysterious group of Wikipedians. For example, I'd use a very different intervention or approach with a 60+ group of editors than I would with an 18–25 group. Cshanesimpson (talk) 13:27, 23 March 2015 (UTC)
    I suppose this is reasonable. If I may conjecture, the age of a particular female editor could be important for determining how best to encourage more activity from that individual, since (as you said) a different intervention approach would be taken if the individual is a young adult as compared to a senior. It could also help to see which age demographics are most lacking in female editors, which could in turn determine where the Inspire campaign should set its focus. –Nøkkenbuer (talkcontribs)
  4. Ethnicity – This item seems even less relevant than any of the above. Why should one's ethnicity matter? Yes, ethnicity can influence how culture and society treat an individual, but this is largely irrelevant to the gender gap on Wikipedia or to the edits themselves. Does one's ethnicity grant expertise in said ethnicity? I don't think so, any more than being female grants expertise in femininity and the state of being female, or being autistic grants expertise in autism. Moreover, what exactly do you mean by "ethnicity"? Will this be a list, comprehensive or brief, of general ethnic identities; or will users self-identify their ethnicity? Moreover, how would one determine whether one's ethnicity is accurately reported? This issue applies to all these items, but to this one especially, since some individuals culturally identify as an ethnicity that they are not genetically, such as a black or Hispanic person identifying as white, or even a white person identifying as black.
    • I often use the census data collection items to determine ethnicity (with an open-ended response item), and this should be measured because we don't currently have the data for varied types of editors. In reporting data from ethnicity items, it's often most appropriate to describe how the participants "self-identify." This very well might differ from their genetics or even their race. I don't see that as a problem, given that I'd be more interested in whether individuals who self-identify with a given ethnicity are more likely to make up the super-editor group. If we have a gender gap, how do we know there isn't an ethnicity gap as well? Cshanesimpson (talk) 13:27, 23 March 2015 (UTC)
    My concern with those who self-identify with a particular race or ethnicity is that it could skew the data and the interpretation of it. For example, if a non-Hispanic Caucasian from the United States were to identify as an African-American, this could be important with respect to self-identification, but it may not accurately represent any discrimination or background the individual may have experienced. An African-American may experience far more racial prejudice and discrimination than might a Caucasian, even if the latter identifies as an African-American. Likewise for many other disparities between an individual's self-identified race or ethnicity and their genetic or inherited race or ethnicity. Simply adhering to, or even being raised in, an ethnic culture that is not typically expected of someone of a particular race or ethnicity is not really meaningful from a data standpoint (in my opinion)—unless the data being collected pertains to the self-identification of one's race or ethnicity, of course. –Nøkkenbuer (talkcontribs) 17:03, 23 March 2015 (UTC)
  5. Employment level and/or degree – This is somewhat relevant to the quality of edits, since a degree can indicate (though not necessarily imply) expertise in a particular area or field. It could also be relevant to activity or edit frequency on Wikipedia, since some NEETs may have more time to edit than, say, a professor or CEO. I don't see how this should matter with respect to the campaign, though, any more than age or ethnicity does. This could be interesting to learn, sure, but it isn't really meaningful.
  6. Social skills – This may relate to civility and the behaviors of the editors in discussions and on talk pages, but I again don't see how it's meaningful or related to the campaign and its goals. This is also vague and would need to be clarified before it could be reasonably included as an item.
    • This variable is directly taken from the SRS-A, which is a measure of autistic traits. It's a dimension of traits that is typically measured for individuals, but could be used as a more general measure of social skills. It would include willingness to engage in conversation with others, understanding of others' thoughts (theory of mind), etc. I'm happy to expand upon these categories. Cshanesimpson (talk) 13:27, 23 March 2015 (UTC)
    Thanks for clarifying. I wasn't sure what you meant by "social skills", since this could be a pretty broad matter. Feel free to explain further if you feel it's conducive to this discussion, though I could probably gather some information myself if needed. My primary concern was with what you meant by "social skills" and how this could be measured or determined. Usually, social skills are mentioned in relation to civility and etiquette, so I was unsure as to what you meant. –Nøkkenbuer (talkcontribs) 17:03, 23 March 2015 (UTC)
  7. Restricted interests and repetitive behaviors (autistic traits) – This is a rather strange item. Not only could it be interpreted as offensive and discriminatory against autistic people, but I'm not sure why a diagnosis of autism in an editor would matter. It could certainly impact the editor's performance or editing habits, along with his or her activity and frequency levels with respect to editing and contributing, but it doesn't really provide any meaningful information that Wikipedia could use, unless it wishes to start promoting a neurodiversity stance with a pro-autism bent (or anti-autism, but I seriously doubt that will ever occur for political reasons). Moreover, what qualifies as "autistic traits"? A formal diagnosis of autism—and would proof of this be necessary, or simply the user's claim? Traits which "appear" autistic as determined by an analysis of their activities on the forums (which could be discriminatory)? What exactly do you mean by "autistic traits"?
    • There has been speculation that certain online communities may allow individuals with restricted or very focused interests to engage with these interests in a community of others with similar interests. These traits are in no way a diagnostic measure, nor can they tell us whether community members have autism - this is not something we care about either. However, we know that autistic traits are found in the general (non-autistic) population and that these traits may influence how individuals engage with their interest areas and others online. I don't plan to ask any questions about whether an individual self-identifies as "autistic," but am more concerned about the levels of traits as they might relate to varied types of editing. Cshanesimpson (talk) 13:27, 23 March 2015 (UTC)
    Thanks again for clarifying. When you mentioned "autistic traits", I assumed you meant it as in "traits exhibited by people diagnosed as autistic". I hadn't considered the possibility of analyzing autistic traits in non-autistic people and how this could impact community involvement. I've admittedly only come across the distinction of "autistic traits" as not being exclusive to people with autism a handful of times, none of which I can clearly recollect, so I'm not too accustomed to considering this possibility. This may just be my ignorance and inexperience showing, though. –Nøkkenbuer (talkcontribs) 17:03, 23 March 2015 (UTC)
  8. Motivation for civic engagement (desire to "give back") – This could be tangentially related to the gender gap on Wikipedia, but I find it difficult to see how this data could be used to argue for a balancing of the genders unless we presume that females are more likely to exhibit traits of "civic engagement" (which is itself a presumption based on a stereotyping of females and femininity). Moreover, what qualifies as "civic engagement"? Does merely editing suffice, or must one also participate in the discussions and talk pages? How does one gauge "motivation"? What units of measurement would be used, if any, and how would relative motivations be compared?
    • This may actually underlie some of the gender gap that has been found. As we're looking at relationships between demographics and editing on Wikipedia, we should also measure potential variables that might influence these relationships. We may find that high civic engagement is a stronger predictor of editing than gender is. Without controlling for this, we remain unclear on how the desire to give back to a community influences Wikipedia editing. Cshanesimpson (talk) 13:27, 23 March 2015 (UTC)
    So it's possible that the real issue here is not the gender gap itself, but the underlying issues and relationships between other demographics and sex/gender which may be leading to the gender gap? Perhaps I'm misinterpreting you, but it seems that you're saying this research would help determine whether the gender gap on Wikipedia is the real issue and not just a misinterpretation based on a lack of data. In other words, the Inspire campaign may be better off focusing on addressing these underlying demographic relationships and concerns, and not the potentially misleading statistic that only 20% of Wikipedia editors are female. If I'm mistaking the implications of your words, please let me know. I'd hate to jump to conclusions. –Nøkkenbuer (talkcontribs) 17:03, 23 March 2015 (UTC)
  9. Barriers or perceived barriers that the editor has experienced (open-ended question) – This could be related to the campaign, especially if someone has experienced sexism or sexual harassment, but it appears that this item would cover a much wider and more general array of topics, one which would effectively be an open-ended criticism of Wikipedia and Wikipedians as a whole. Unless there is more specification on what is meant here—in particular, specification that this applies only to barriers or perceived barriers an editor has experienced in relation to his or her sex or gender—the range of answers to this item could be anywhere from the interactions an editor has had with other editors to issues with wiki-markup or official restrictions by Wikipedia and its staff. Also, what qualifies as a "barrier"? Or is this left up to the user to determine and define?
    • These items would be asked in the context of the other demographics. If the gender gap results are replicated, we might be able to explore themes that could explain the gap. Yes, we might get a rather large range, but we may also find that certain structural components of the site may influence access to Wikipedia or ability to edit. Cshanesimpson (talk) 13:27, 23 March 2015 (UTC)
  10. Additional space for comments and suggestions for enhancing the accessibility and fluidity of the editing process on Wikipedia – The same criticisms for Item #9 could apply here. I would also question how this relates to the campaign and its goals, since improving Wikipedia as a site and the systems therein would be more of a general matter and not one specific to the gender gap currently present. Additionally, what do you mean by "accessibility" and "fluidity of the editing process"? Although these could be pretty clear by themselves, some appended examples could be useful.
    • Since we don't know what we'll find with these items, it's difficult to directly respond to the difficulties you identify. However, I'd hope that Wikipedians are interested in improving the site and its accessibility to all users. As a result, why not directly ask the users themselves what they would like to see? For example, I'd ask, "Do you have any suggestions for the Wikipedia community that might enhance your ability to edit on the site?" Or, "Do you have any suggestions for the Wikipedia community that might encourage you to more consistently edit on Wikipedia?" Cshanesimpson (talk) 13:27, 23 March 2015 (UTC)
    I suppose my main criticism was with how this pertains to the Inspire campaign, but I honestly don't see how it really could. This last item seems like a concluding catch-all for any final remarks by the participant relating to Wikipedia and how it works in general. I don't see a problem with that, and in fact support it. I was just concerned that this was not pertinent to the Inspire campaign, though I now recognize that this research is meant to function both as a response to the campaign and as a method of determining how Wikipedia could be improved overall. –Nøkkenbuer (talkcontribs) 17:03, 23 March 2015 (UTC)

Overall, a general pattern I notice is that all of these items, however useful they may be, are ultimately superfluous to Wikipedia and its editors. As I alluded to above, editors should be judged on the quality of their edits and contributions, not on their frequency, gender, age, ethnicity, employment level/degree, social skills, "autistic traits", motivation for civic engagement, or barriers or perceived barriers the editor may have experienced. Why should any of that matter? Would a contribution or edit of equal or even identical quality somehow be more meaningful because the editor is from a particular background, or is the inheritor of a particular genetic code, or is identifiable as opposed to anonymous? Sure, all these factors could contribute to the quality of the edit or contribution, but they should not determine it. Similarly, this information is inessential to judging the quality or integrity of any edit or contribution, and not only because the reigning lord around Wikipedia is the all-powerful [citation needed]. This data may be helpful for statistical analysis and compiling the demographics of Wikipedia, but I don't see how it could be used to imply or further anything other than knowledge about Wikipedians for its own sake—precisely because it shouldn't matter with respect to the edits or contributions made.

[As a note, I noticed that this paragraph is largely extraneous and could safely be ignored.]Nøkkenbuer (talkcontribs) 17:03, 23 March 2015 (UTC)[reply]

I likewise don't see how any of this information could be meaningful to the campaign, since the only item on the list which applies would be #2 ("Gender"), and perhaps #9 if the topic or answer is gender-related.

The Goals of the Idea[edit]

These are about as important as the methodology used and the solutions proposed. The goals are, in my opinion, very admirable and ones which I could support. However, they aren't really relevant or specific to the Inspire campaign, which aims at "solving" the "problem" of a gender gap on Wikipedia. As I usually do, I'll address them one by one:

  1. Who is editing on Wikipedia and at what volume are they editing? – I think this is certainly interesting information to possess and a topic to investigate, but I don't see how this could benefit Wikipedia outside of its demographic statistics. Perhaps it could help Wikipedia know which demographics to appeal to more, but I would consider this unneeded bias which would only artificially balance any statistics which could be gleaned from the community. With respect to this campaign, it's important but not when considered in the context of the data that you propose to collect. Additionally, specifying "Wikipedia" implies that only Wikipedia will be targeted during this research. Was this intended? Or did you mean to target the entire Wikimedia Foundation?
    • Good point about the “Wikipedia” specification. Yes, I’d planned to target only Wikipedia, but would be open to exploring the foundation as well. I’m not entirely sure what you mean by “unneeded bias,” but would like to explore that a bit with you. Cshanesimpson (talk) 13:27, 23 March 2015 (UTC)[reply]
    By "unneeded bias", I essentially meant that the answer to the question representing this goal could be used by Wikipedia to know and determine which demographics needs to be appealed to, and that is "unneeded bias which would only artificially balance any statistics which could be gleaned from the community". This is a reference to my overall criticisms of the Inspire campaign, however, and thus could be safely ignored. I probably should have left that out, actually, since it doesn't really contribute to my point. –Nøkkenbuer (talkcontribs) 17:03, 23 March 2015 (UTC)[reply]
  2. What are the personality characteristics of super-editors, active editors, and inactive editors on Wikipedia? – This, I suspect, is only important in conjunction with Goal #3, since this information by itself does not really convey anything meaningful outside of a broad and unsteady approximation of how certain users behave.
  3. How can the Wikipedia community support and further develop the inactive editors into super-editors? – This is the most intriguing goal and perhaps the single most revealing component of the entire Idea. It seems to me that the purpose of this Idea is to improve Wikipedia and its community as a whole. This will obviously have effects on the entire Foundation and the community which supports it, which in turn will change the dynamics of this entire campaign. Ultimately, however, this sounds like a general Idea on how to improve Wikipedia, and not one specific to the Inspire campaign.
    • I think that in order to accurately explore this goal, we must be able to better characterize our community. I don’t believe that we can effectively support our community without knowing who they are and the types of support they want/require. Cshanesimpson (talk) 13:27, 23 March 2015 (UTC)[reply]
    ...which can benefit the Inspire campaign because it could help Wikipedia understand how to best support females as a demographic as well as all the others, correct? If so, then I believe I understand what you mean, both in relation to Wikipedia as a whole and to the Inspire campaign, and I agree. –Nøkkenbuer (talkcontribs)

Support or Opposition?[edit]

Although I support these goals, I am reluctant to endorse this project and Idea because it is largely off-topic to the campaign and attempts to address an overarching issue which, although important in and of itself, has only a tangential chance of facilitating the attainment of the goals of this campaign. I support Goal #3 in particular, but this doesn't seem like an Idea suited for Inspire. This entire Idea, therefore, may be better suited to the general IdeaLab.


This Idea is certainly interesting and potentially beneficial to Wikipedia, but I see some structural, and potentially fatal, flaws in it which would need correcting before this Idea could take off. I certainly would love some more comprehensive and accurate research, if possible, on Wikipedia and its community; however, I don't see how it could benefit this campaign outside of either reconfirming or conflicting with current data.

Apologies for the lengthy response and for being so fault-finding therein, but I hope that this critique will help and spur an improvement of this Idea, if only so that more research could be done.


Finally, I should probably provide some constructive suggestions, since all I've done thus far is to write an entire dissertation and essay criticizing and deconstructing this Idea. Some possible improvements to this Idea that I think might help are as follows:

  1. First and foremost, you must determine whether the goals of this Idea are suitable to directly furthering the Inspire campaign, or whether they are better suited for another, or more general, IdeaLab which focuses on improving Wikipedia/Wikimedia as a whole. If the former, some major revisions need to be made in order to hone this Idea into something which could address the purported issue of the gender gap on Wikipedia and further the goals of the Inspire campaign. If the latter, it may be best to either move (or request a move, I'm not sure how this process works) this Idea to another IdeaLab.
  2. Regardless of which path you choose, this Idea needs more specification and clarification on the contents therein, as well as some revision. For example, the data proposed needs to be more specific, especially with respect to Items #1, #2, #4, #6, #7, #8, #9, and #10 (basically, not #3 and #5, though expansion on those could also be beneficial). The goals might also need revision if you choose to keep this Idea as one for the Inspire campaign, since they aren't specific to the campaign itself.
  3. Further elucidation on why and how this data could be meaningful or useful to either the Inspire campaign or Wikipedia as a whole may greatly improve the overall cogency of this Idea, since some may view this Idea as effectively useless and pointless to pursue.
  4. Some method of ensuring the data collected is accurate, or at least a caveat stating that the data may be inaccurate due to the relative anonymity of the participants, may be needed to establish the credibility and worth of the research.
  5. Preserving the anonymity of the participants, despite how this could detract from the research, should be paramount to respect the rights and identities of the participants. This could take the form of confidentiality between the participants and the information they provide during the study, or of not demanding that participants reveal themselves, so that they could remain anonymous, or even of ensuring that the information and proof they provide could not somehow be tied back to their actual identity (if that's even possible). If you make identification non-mandatory, you could of course give greater weight to those who do identify themselves and/or verify their information as compared to those who remain anonymous.
    • This study would only be conducted after an IRB approval from an academic institution. My prior experience with maintaining confidentiality of participants in research studies would absolutely apply in this regard. Participants’ user names and data would be de-identified by the research team and participants would be given a randomized ID code for analyses.Cshanesimpson (talk) 13:27, 23 March 2015 (UTC)[reply]
    Thanks for clarifying that. It's definitely appreciated. –Nøkkenbuer (talkcontribs) 17:03, 23 March 2015 (UTC)[reply]
  6. Some further elucidation on how this study would be conducted could also help solidify this Idea. Would it be in the form of a fill-out survey anyone and everyone could take? If so, how would abuse of this system be prevented? Would it be more exclusive, as in formal research studies, with a limited number of participants and/or a certain quota to fill before the actual data harvesting can be conducted? What would be the qualifications for this survey or study? Are there requirements, either as a person or as a community member, such as age requirements, minimum identification requirements, or minimum activity or previous edits/contributions? Any further details on how this could all work, even if only in theory, would be beneficial both to the readers and prospective endorsers, and to the grant providers if (and when?) you finally expand it into a grant proposal.
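On the de-identification approach mentioned in the reply to item 5 (usernames replaced with randomized ID codes, with the linking key kept separate and later destroyed), here is a minimal sketch of how such a mapping might work. The `deidentify` helper and the ID format are my own assumptions for illustration, not the researchers' actual protocol:

```python
import secrets

def deidentify(usernames):
    """Replace each username with a random ID code.

    Returns the de-identified ID list plus the username-to-ID key,
    which would be stored separately from survey responses and
    destroyed once de-identification is complete.
    """
    key = {}  # username -> random ID; never stored with the responses
    ids = []
    for name in usernames:
        if name not in key:
            # 4 random bytes -> 8 hex chars, prefixed for readability
            key[name] = f"P{secrets.token_hex(4)}"
        ids.append(key[name])
    return ids, key

# Hypothetical usage: repeated respondents map to the same code.
ids, key = deidentify(["Alice", "Bob", "Alice"])
```

The point of keeping the key separate is that analysts only ever see the `P…` codes; destroying the key severs the link to user pages entirely.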

If I think of more, I will edit them in.

Final Remarks[edit]

If you made it this far, congratulations! You have withstood my verbose rambling! Maybe all this helped, or maybe it was a waste of my time. Feel free to respond with your thoughts below. No response is necessary, though, since everything I wished to mention is pretty much already stated above. The purpose of this post was, like I said, to hopefully improve this Idea and not beat it into the digital dirt. This is more food for thought than it is kindling for a fiery discussion, though I'm open to discuss this further if needed.

Although I'm not prepared to endorse this Idea, and perhaps never will be, consider this my contribution to your project.

Thanks for contributing to Wikipedia!

Nøkkenbuer (talkcontribs) 11:11, 23 March 2015 (UTC)[reply]

Eligibility confirmed, Inspire Campaign[edit]

This Inspire Grant proposal is under review!

We've confirmed your proposal is eligible for the Inspire Campaign review. Please feel free to ask questions and make changes to this proposal as discussions continue during this community comments period.

The committee's formal review begins on 6 April 2015, and grants will be announced at the end of April. See the schedule for more details.

Questions? Contact us at grants(at)

feedback and comments from Thepwnco[edit]

@Cshanesimpson: hello and congrats on your grant proposal being confirmed as eligible for review! I was wondering if you could provide some more information about your participants. On the grant page you wrote that you'd consider ~1,000 participants for each group a good sample size but I wasn't quite sure what you meant by "each group." Could you clarify please? Also, I don't believe it says anywhere on your grant what Wikipedias (e.g. English Wikipedia, German Wikipedia, etc.) you are specifically interested in studying. Seeing as you have budgeted for translation, maybe it is your intention to study as many/all Wikipedias as possible? cheers. -Thepwnco (talk) 20:14, 5 April 2015 (UTC)[reply]

  • Thank you Thepwnco for your feedback and I hope I can clarify a bit! The group categorizations were Super-Editor, Active Editor, and Inactive Editor, with the actual labels only meant to differentiate among three levels of editing volume that editors can engage in. We're hoping to use the list of Wikipedian editors that ranks the community members by the number of edits they've provided on Wikipedia. From this list, we'd aim for a target sample size of 1,000 editors from each of these three "editing groups," which would likely result in us needing to post our survey link to 2,000 Talk Pages per group. Based on my prior experience with this sampling method and research design, I anticipate that we'll have a 50% attrition/no-response rate, leaving us with our total sample size of 3,000 (n=1,000 per editing group). Cshanesimpson (talk) 13:15, 6 April 2015 (UTC)[reply]
  • In regards to your translation question, you're correct that we had hoped to sample from the Wikipedias beyond the English Wikipedia. As an exploratory study, I'd prefer to prioritize capturing an accurate representation of the editing volume (in the context of other variables) without limiting the study to editors that engage in only one of the many Wikipedia communities. I also know many Wikipedians choose to edit on more than one Wikipedia and I would like to capture that in the study. The translation services budget is based on averages that I've worked with on other projects, representing the average cost of translation for more than one language in a fairly robust sample (like this one). Thank you again for your feedback so I can clarify these items in the actual grant proposal. Cshanesimpson (talk) 13:15, 6 April 2015 (UTC)[reply]

Questions from Superzerocool[edit]

Hi, thanks for your proposal. I have some questions:

  1. Would you look for "super editors" on other wikis (e.g. the Spanish Wikipedia)?
  2. Which system, page, or mechanism would be used to complete the survey (e.g. Google Docs, Survey Monkey, your own script)?
  3. Regarding the translation services, which languages would be contracted?
  4. Regarding the travel funding: Wikimania is clear, but why should WMF fund the other travel expenses?

Regards Superzerocool (talk) 17:25, 6 April 2015 (UTC)[reply]

  • Hello and thank you Superzerocool for your questions! Yes, I'd love to include the non-English Wikipedias such as the Spanish Wikipedia. The translation services were included in the budget specifically for this purpose and I'd like to provide the survey instrument to non-native English speakers. I anticipate that we'll encounter a few languages that require a significantly higher translation rate, but the budget numbers are relatively conservative for these services. I'd also planned to draw upon the resources that are available at my university (i.e. bilingual research assistants, university translation services). Cshanesimpson (talk) 20:00, 8 April 2015 (UTC)[reply]
  • In regards to your second question, my preference is towards Survey Monkey or Qualtrics as the survey provider since I've had reliability issues with Google Docs in the past. We would also prioritize the strict confidentiality of our participants in this project due to the nature of the questions. Both Survey Monkey and Qualtrics are fairly customizable and I've never experienced problems with confidentiality on these platforms. Cshanesimpson (talk) 20:00, 8 April 2015 (UTC)[reply]
  • I strongly believe that we should disseminate study results to a variety of communities in an effort to cast a larger net, especially given that potential Wikipedians and even current Wikipedians may not attend Wikimania. As a result of the APS Wikipedia Initiative launch, we would have an open window to Wikipedia-enthusiasts. I recently presented a project at the International APS conference and the attendees of the APS conferences often span a variety of disciplines and interest areas. As such, this conference attendance would provide us with an opportunity to disseminate our findings and engage in discussions about the results with an academic community outside of Wikimania. As a whole, I think that these conference-based discussions are often most productive for sharing results, moving projects further, and brainstorming about the "next steps" for projects (i.e. ideas about interventions). Cshanesimpson (talk) 20:00, 8 April 2015 (UTC)[reply]

Community notifications[edit]

Hey @Cshanesimpson:, congratulations on meeting the eligibility requirements for your grant application! One of the next steps is to make sure you have notified any community groups or on-wiki noticeboards relevant to your proposal. As your proposal involves research on a somewhat broad scale, there isn't necessarily one place that is best to notify. However, it looks like you will be involving English Wikipedia - a good place there would be the Village Pump (Miscellaneous). There is also WikiProject Research, which is pretty inactive, but a post there couldn't hurt. Some other spots could be found in Category:Wikipedia resources for research. A mailing list of interest is Wiki-research-l, if you aren't active there already. If you foresee your project working with any other-language Wikipedia, or some of the other projects, such as Commons or Wikisource, notifications to those projects would be good also. If you have any questions about this, let me know! Best, PEarley (WMF) (talk) 22:29, 8 April 2015 (UTC)[reply]

  • Thank you PEarley (WMF) for letting me know (and reminding me) about these communities! Another Wikipedian also recommended the listserv, so I've been following it, but haven't had a chance to jump into the conversation yet. I think this is the perfect opportunity to do so. Do you have any suggestions for reaching out to some of the other Wikipedia language communities in regards to bilingual posting? I hesitated to post in English on these sites, as English may not be users' primary language. Cshanesimpson (talk) 12:55, 12 April 2015 (UTC)[reply]

Feedback from Jmorgan (WMF)[edit]

Note: I am not a member of the Inspire committee, nor do I have deciding power over what gets funded. I was asked to comment on this and other research-related Inspire proposals by my grant officer colleagues.

Hi researchers. I appreciate the thought you've put into this proposal, I respect your expertise, and I thank you for your work on the Inspire Campaign. However, I have significant reservations about your grant proposal, and ultimately I do not think it should be funded by the committee. I provide my rationale (at great length, sorry) below.

Wikipedia (in English) is one of the most heavily studied online communities in history. People have been investigating who Wikipedians are, and what makes them 'tick', for over a decade. One area of research that has been particularly thoroughly covered is what motivates Wikipedians to edit Wikipedia. There have been dozens of studies of motivation on Wikipedia. Here are a few that I found through Google Scholar, just now:

There are plenty more here:

Most of the papers on motivation focus on the English Wikipedia, and many of them use surveys. Many other surveys have been used to collect demographic data about Wikipedia contributors. The Wikimedia Foundation itself is a major player in that game:

The number of surveys that have been deployed over the years has resulted in significant survey fatigue, especially on English Wikipedia (a good summary here).

That doesn't mean that there should never be another survey on Wikipedia, but researchers need to be extra respectful of editors' time, and the amount of (real or perceived) disruption caused by a survey should be proportional to its potential benefit. There are some good best practices for survey research listed here.

Unfortunately, there is currently no reliable mechanism for an editor to explicitly "opt out" of all surveys. This means that blanket surveys (ones delivered to an editor simply because they edit, or edit a lot, not because they chose to participate in a specific program, like the Inspire campaign, or an Edit-a-thon) are often viewed by editors as especially obtrusive. And yet these are precisely the surveys that researchers most want to run: everyone wants a representative sample; everyone wants to capture the "zeitgeist" of editing, or the lived experience of editors, etc.

The survey described in this proposal is big, even by blanket survey standards. 6000 participants! And among them 2000 of the most active Wikipedians--the people who are most likely to be burned out on surveys already. Again, I'm not morally opposed to another editor survey, but after reviewing your plan I don't believe that this one merits the amount of disruption it causes. Below I list some of the specific issues I have considered that lead me to this conclusion.

  • Hello Jmorgan (WMF) and thank you for all of your feedback! I also view this process as a dialogue and feel that our grant proposal ideas only gain strength through conversations with the community. As such, I'd love to engage in more dialogue about some of these items. I completely agree that Wikipedians have been over-studied in regards to certain variables – motivations being only one – and I’m glad that the rest of the community recognizes the potential for participant fatigue as a result of these numerous (and most often survey-based) data collection methods. I also concur that the costs/benefits of risk and associated fatigue need to be weighed prior to any study implemented on Wikipedia, as would be reviewed extensively in any academic IRB approval process (or at least at the university I’m affiliated with). All of this being considered, I believe the need to clarify the results from prior studies using standardized measures with greater variation compared with the traditionally-binary or categorical approaches would assist us in better understanding the Gender Gap and how other variables may come into play with this phenomenon. In order to address/clarify your concerns, I’ll respond directly in-text (see below). Cshanesimpson (talk) 14:35, 25 April 2015 (UTC)[reply]

This survey covers a lot of the same ground (esp. re: motivation and demographics) as myriad surveys, interviews, and ethnographic investigations that came before it. Yet the researchers have not cited any of the relevant previous research in their proposal. In my view a thorough, evidence-based discussion of how their intersectionality of identities approach makes a substantial new contribution to our scientific understanding of the editing community (or of online communities more generally), is vitally necessary here, considering the size of the funding request and the potential for disruption.

  • You’re spot-on that this survey would cover a lot of ground, but in a way that furthers our understanding of identity development in considerations of online behaviors. However, I’m happy to add significantly more of my offline literature review into the online grant proposal. This limited inclusion of prior literature to boost the rationale was based on my review of prior grants that had been funded. The basic limitation of the prior literature is that we’ve identified some of the motivating variables associated with editing behaviors (i.e. altruistic behaviors, web use), but gender (and other variables) have not been assessed in a way that effectively describes how individuals’ varying identities “play together” in potentially unique patterns that lead to specific types and volumes of editing. I struggle the most with recent studies that have inherently viewed gender as “male, female, or other,” neglecting to consider contemporary understandings of masculinity and femininity as they can, and often do, co-exist. These notions of gender have been explored on other platforms (i.e. see A. Manago’s work with MySpace and Facebook), but not yet with a community that prides itself on building the accessibility of knowledge through effective dissemination strategies. These understandings serve to drive this project and its intentions in highlighting our understanding of identity and enactment of online editing behaviors in the context of the potentially influential variables – civic engagement (potential mediator), gender, ethnicity, etc. Cshanesimpson (talk) 14:35, 25 April 2015 (UTC)[reply]
  • As a quick side note – participants in my surveys always have an “opt-out” option and I don’t believe in forced-questioning on surveys based on my ethics training. As such, it’s up to the editor if they’d like to take the survey and also up to them as to which questions they feel comfortable answering. There would be no penalties for failure to complete the survey. Cshanesimpson (talk) 14:35, 25 April 2015 (UTC)[reply]

This survey focuses solely on the English Wikipedia. English Wikipedia is 14 years old. It is a large and fascinating community, but it's only one Wikimedia project among hundreds. And it's not exactly a growth industry at this point: the core community has been shrinking for more than half a decade, and it has been extensively studied by many people across many dimensions. The Wikimedia movement, and Wikipedia scholarship, can't get away with continuing to study English Wikipedia while ignoring the hundreds of growing--sometimes thriving, often struggling--Wikimedia projects that aren't in English. Not to mention the ones that aren't even encyclopedias! We don't know nearly as much about these projects or the people who contribute to them, and we can't assume that these communities, or the experiences, motivations, and identities of the people who participate in them, are anything like those of English Wikipedians. The future of Wikimedia, and open collaboration in general, is multilingual, multicultural and polymorphous in ways we don't currently comprehend. Every time we devote resources to another English Wikipedia survey, we are explicitly sidelining these projects and these people. Our English-Wikipedia research focus is itself a potentially troubling source of, and contributor to, systemic bias.

  • I’m glad you mentioned the point about limiting the study to English Wikipedia, because this is one of the areas where my team and I have grown a bit. Based on feedback from other Wikipedians, the focus on English Wikipedia was chosen to ensure that we’re able to draw strong conclusions about this community, without false representation. I completely agree that we need more research into our other Wikipedia communities, but as an exploratory study, this project wouldn’t have the capacity and resources to venture into those other communities… yet. I see that as a future extension that would be based on the results we’d potentially find in the current, smaller study. I also feel that n=3,000 (target) would not be sufficient for a larger investigation into the other Wikipedia communities. I've been and still am very open to exploring the non-English Wikipedia communities, but that would require a great deal of assistance with translation, which (based on my prior experience) is often quite costly. I'd be happy to discuss this piece a bit further. Cshanesimpson (talk) 14:35, 25 April 2015 (UTC)[reply]

This survey is only loosely focused on the gender gap. I accept the basic premise of intersectionality, but ultimately this study attempts to address so many different things that I fear it will yield relatively less insight into the experiences and motivations of non-male editors. Especially considering that, given the gender gap, the vast majority of the 3000-6000 respondents will be men.

  • As a mixed methods researcher, I view instrument development as an iterative process that requires piloting to assess for item clarity, instrument length, and alignment between item prompt and responses (content, construct, and face validity as interrelated). As such, this proposal explicitly includes a piloting phase that I’ve always used in my prior research. In addition to drawing upon my access to a widely diverse academic community (composed of 20+ universities), I’d planned to solicit a limited number of Wikipedia volunteers to provide feedback on the survey prior to wider distribution. My greatest concern is the limitations that a smaller sample size might impose on the analytic plan (potentially nested modeling). With the large number of variables, we’ll need enough representation of the basic variations to effectively account for each variable in the models. I’ve been speaking with an experienced statistician from my university to brainstorm smaller-sample models that could be used if we did not reach our target numbers. Since participation in the survey is up to the user (completely voluntary), it always remains an unknown as to how our sample size will grow and where we might encounter limitations (i.e. too few women/strong femininity). This is why the research assistants and the rest of my team are a vital component of this project. Data monitoring will be a huge, labor-intensive piece that will require many eyes reviewing the consistently growing data to identify sample biases and potential vandalism. You’re correct that a gender-biased study would not yield much in the way of usable data. However, it’s my hope that we could post to Talk Pages in a series of intervals, in which we could effectively monitor the gender variations we’re receiving (along the scale). This would allow us some control in ensuring that we have a matched-gender sample, in addition to other variables that might be most pertinent such as ethnicity or age. Cshanesimpson (talk) 14:35, 25 April 2015 (UTC)[reply]

I also have some methodological critiques:

Your sampling strategy needs to be re-thought, for two reasons.

  1. You can't just use the list of Wikipedians with the most lifetime edits, because many of those people are likely to have left the site years ago. Others may have been very active contributors for years, but are now only minimally active. So you'd be getting really inconsistent perspectives. Better to find the people who have been most active over the last ~6 months or so. You can gather these data with a simple SQL query, using the Quarry tool.
  2. The 20%/20%/20% buckets you're proposing are also not the right thresholds to use. Wikipedia editing operates according to a power law, so the "top 20%" of Enwiki contributors (for, say, February 2015) contains a few thousand people who make hundreds or thousands of edits a month, and a whole lot more people who make 5-10 edits a month. Very different users, very different motivations and characteristics. The next 20% will contain a roughly even split between 5-10 edit/month editors and people who only ever edited once or twice (and are pretty unlikely to stick around). The third 20% contains almost entirely people who made a single edit and will never edit again--most of them probably won't even see your talk page message.
  • Thank you specifically for your suggestion about the Quarry tool! This is why I’ve added a budget line item to get some assistance with this recruitment piece. “Activity” on Wikipedia is incredibly difficult to measure, which is why I’d included brief items in the survey that would double-check where the editor has been editing (via self-report, of course) and whether they feel their “categorization” or these edits effectively represent how they edit. This allows editors to tell us if they took a month off from editing or how their editing has changed over time, which it likely has. I agree that the 20/20/20 split doesn’t make sense; that’s a recent revision I’ve discussed with my team but haven’t yet made in my grant proposal. I’ve modified this approach to keep the target sample sizes the same for each group, but the super-editor group would be sampled from the top “few thousand” that you identified, and then the many people who make 5-10 edits a month would be sampled only if matched gendered samples were not obtained. It would very much be a top-down, interval-based approach, since this is the group that I feel should be most strongly protected in the research – they very much drive the work that we do with their contributions. The Active Editor group would then consist of those that have edited once or twice, as self-reported editing would be a requirement of their participation in the survey. The Passive User group has changed substantially, as I’m strongly considering the use of Mechanical Turk to obtain a closely matched sample (based on gender first) as a comparison to the other two groups. These would be the users that passively use Wikipedia as an information source, without editing. Cshanesimpson (talk) 14:35, 25 April 2015 (UTC)[reply]
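To illustrate the power-law point above, here is a quick simulation with made-up numbers: a heavy-tailed Pareto draw stands in for per-editor monthly edit counts (an assumption for illustration only, not actual Wikipedia data), and we compare the shares of total edits held by fixed 20% buckets:

```python
import random

random.seed(42)

# Hypothetical edit-count distribution: heavy-tailed Pareto draws stand in
# for monthly edits per editor (illustration only, not Wikipedia data).
n = 10_000
edits = sorted((int(random.paretovariate(1.1)) for _ in range(n)), reverse=True)

top = edits[: n // 5]                  # "top 20%" bucket
middle = edits[n // 5 : 2 * n // 5]    # next 20%
third = edits[2 * n // 5 : 3 * n // 5] # third 20%

total = sum(edits)
# Under a power law the buckets are wildly unequal: the top quintile
# accounts for the large majority of all edits.
print(f"top 20% share:   {sum(top) / total:.0%}")
print(f"next 20% share:  {sum(middle) / total:.0%}")
print(f"third 20% share: {sum(third) / total:.0%}")
```

Under these assumptions the top quintile mixes a handful of very-high-volume accounts with many modest ones, which is exactly why fixed 20% buckets blur "very different users" together.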

You plan to store sensitive data indefinitely. You say that while the response data will be de-identified, "The core research team would keep the identifying information linking Wikipedia user name with survey responses." Keeping these sensitive data may conflict with the Foundation's own data retention guidelines. And even if it doesn't, it still worries me, given the comprehensive and sensitive nature of the data you propose to collect, that this PII will be retained in a non-anonymized fashion, by an external entity. Can you say more about why you would keep a copy of the survey response data that contains links to the usernames of respondents?

  • After reviewing my IRB protocols at my university, I should clarify the “plan to store data.” I typically retain my identified data as a result of IRB requirements. However, in this type of survey my core research team and I would de-identify the data and then destroy the links between user pages and demographic/general survey results. I no longer see a reason to keep this information on file once it’s been de-identified; this will be modified in the grant proposal as well. Cshanesimpson (talk) 14:35, 25 April 2015 (UTC)[reply]

This has been personally difficult for me to write, because I don't like to discourage well-intentioned and potentially impactful research. I'm a Wikipedia scholar myself, and I am deeply devoted to supporting Wikipedia scholarship, especially scholarship that sheds light on the gender gap and other systemic biases within our projects. And I trust that the researchers involved in this study proposal are similarly devoted to good scholarship and positive social change. But I don't believe that they yet have a handle on what question to ask, who to ask it to, or how to ask it. That said, I could potentially support this project for an Inspire Grant if:

  • it were focused more specifically on the gender gap (experiences of non-male-gendered editors),
  • it used a smaller and more targeted sampling strategy to minimize disruption,
  • the researchers could show how they build on previous research on editor motivation and demographics rather than duplicating it,
  • and (most importantly) the study focused on comparative analysis across languages of Wikipedia.

If I've misunderstood your proposal in any way, please correct me. I'm happy to answer questions about this lengthy critique at any time. I'm also happy to provide additional feedback, if you choose to revise your proposal. Sincerely, Jmorgan (WMF) (talk) 01:30, 23 April 2015 (UTC)[reply]

SRS-A, ethnicity and other issues[edit]

I agree with basically everything in the feedback from Jonathan above (and should likewise emphasize that I am not involved in the decision process for these grant proposals; I was invited to comment on this and other research-related Inspire proposals by my colleagues from the grant team).

For example, the proposal contains some rather absolute statements like "it remains unknown" / "we don’t know" that would have benefited from at least a cursory literature review.

Also, Aaron Shaw raised some excellent points on the Wikiresearch-l mailing list earlier.

A few additional issues:

  • Social skills, autistic traits: This topic seems to be closely related to the team members' areas of expertise and it seems worthwhile to explore (BTW, you might be interested in this planned Wikimania talk). According to the proposal, this part of the survey will be based on the standard "Social Responsiveness Scale- Adult (SRS-A)". However:
    • Per [1], the SRS-A contains 65 multiple-choice questions that take about 15-20 minutes to complete. Will this entire questionnaire be included in the survey? If not, does the literature about this instrument provide standardized ways to measure social skills or autistic traits based on only parts of the questionnaire?
    • What's more, the SRS-A questions (please correct me if the proposal is referring to a different version) are all in the third person (e.g. 32. "Has good personal hygiene." or 54. "Seems to react to people as if they are objects") and intended to be filled out by a parent, teacher or other caretaker, rather than by the subject themselves. I'm curious (and again, this is based on the assumption that this is indeed the instrument referred to in the proposal) how this would be adapted to this survey. Is it planned to simply change the third person into the second person, e.g. in these two examples ask the surveyed Wikipedian "Do you have good personal hygiene?" and "Do you seem to react to people as if they are objects?"
      • This was a minor (but could be major) typo on my part. We'd planned to use the SRS-2, adult version of the autistic traits measure, not the third-person SRS-A. In reviewing the length of the survey as it currently stands, I agree that the measure is much too lengthy to include in full. Since the SRS-2 has social and restricted interests and repetitive behaviors (RIRB) domains that have been validated in the prior literature, a stronger option seems to be to pull the most relevant social domain, which I believe to be social motivation, along with the RIRB domain. This would significantly decrease our survey items and keep the measure more concise. Cshanesimpson (talk) 13:35, 26 April 2015 (UTC)[reply]
  • Ethnicity: According to the proposal, "Ethnicity categorizations would be based on the previously-validated census data response categories." It is not specified which census this refers to, but presumably it is that of the United States of America, considering that at least some team members seem to be based in that country. However, this survey will need to be an international one (the planned sampling method can't be restricted to US-based respondents), and there is no international standard for such ethnicity questions at all. In fact, questions that are entirely standard in the US, in particular about "race" - Caucasian, Asian, Hispanic... - can be extremely offensive in several European countries. See e.g. this mailing list thread, including my own posting here with a quote from a working paper of the International Social Survey Programme (ISSP) which basically concluded that it is impossible to come up with a consistent ethnicity variable for international surveys.
    • This is a fair comment and I think you're correct that ethnicity is incredibly difficult given the nature of our Wikipedia communities. After speaking with my research team about this variable, it seems that a likely solution would be to ask participants to identify their geographic log-in location and to drop the ethnicity question. I've just modified this in the attached grant proposal. Cshanesimpson (talk) 13:35, 26 April 2015 (UTC)[reply]
  • Translations: The proposal says "see budget for translation services", but I can't find an item for translations there, or even just the anticipated number of languages that the survey will be available in. Previous experience with such surveys has shown that professional translations, even when coming from high-quality (and possibly pricey) translation companies, should be reviewed by community members or other people familiar with Wikimedia-specific concepts. Outside translators are likely to pick translations that do not match the terms established in the documentation and common lingo of the Wikipedia community in that language and are thus confusing for respondents.
  • Incentives/prizes: Yet another, smaller aspect that suggests the international nature of the survey population has not been considered sufficiently: the proposal devotes a significant part of the budget to iPads as raffle prizes, but seems to give no thought to the difficulties of shipping them internationally from the US, or of obtaining localized iPads from local vendors in arbitrary respondents' countries.
    • As a result of your comments and input from my team, I completely agree with this one. We'd need an electronic form of compensation. I've replaced the iPads (which served as a placeholder) with Amazon gift cards, which I've used in prior studies. However, I'm very open to other forms of incentives that might serve as stronger rewards. Cshanesimpson (talk) 13:35, 26 April 2015 (UTC)[reply]
  • Open-ended questions: The proposal emphasizes the use of open-ended questions several times, e.g. about editing frequency, ethnicity/gender, perceived barriers, suggested improvements etc. It sounds nice to give respondents that kind of freedom, but the downside is that open-ended questions are much harder to evaluate at scale than multiple-choice questions. What will you do with 3000 free-text responses in (say) seven languages to a single question that (as pointed out above) allows a wide range of topics in the answer? The proposal / budget doesn't say anything about (and might not be sufficient for) coding efforts.
    • Yes, the open-ended questions will be quite a daunting task, but I think the value of open-ended questions outweighs the work involved. As my team and I have run qualitative analyses based on open-ended responses from other large-scale online surveys (n=500+), I'm confident that we could complete the coding, provided the response data are all in English. This would be included in the research assistants' line items in the budget, as I recognize that coding is often an incredibly time-consuming component of the analyses. Cshanesimpson (talk) 13:35, 26 April 2015 (UTC)[reply]
  • On the other hand, the budget item of $2000 for the sole task of posting the survey on talk pages seems extremely high (assuming the sample selection has been done already); there are existing automated tools which could be used for that.
    • This was one budget item that I'd strongly welcome feedback on, as the use of an automated posting bot/system is not an area within my expertise. As a necessary component of this study, I'd be happy to adjust this based on the complexity of this service. Cshanesimpson (talk) 13:35, 26 April 2015 (UTC)[reply]
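On the automated posting mentioned above: existing frameworks such as Pywikibot can handle the actual saving of user talk pages, so the scriptable part is mostly generating the invitation wikitext for each sampled user. A minimal sketch follows; the usernames, section title, wording, and URL are all hypothetical placeholders, not the proposal's actual invitation.

```python
def build_invite(username, survey_url):
    """Render a survey invitation as wikitext for a user's talk page.

    The section title, wording, and URL below are placeholders; the
    trailing ~~~~ is the standard wikitext signature marker.
    """
    return (
        "== Invitation to an editor survey ==\n"
        f"Hello {username}! You have been randomly selected for a short "
        f"survey about editing Wikipedia. If you are interested, please "
        f"visit [{survey_url} the survey page]. Participation is voluntary. ~~~~"
    )

# Hypothetical sampled usernames, e.g. the output of the sampling step.
sampled = ["ExampleUserA", "ExampleUserB"]
messages = {u: build_invite(u, "https://example.org/survey") for u in sampled}
print(messages["ExampleUserA"].splitlines()[0])
```

Since message generation is this simple, the bulk of the $2000 line item would presumably go to sample selection and bot-approval overhead rather than the posting itself, which supports the point that the figure looks high.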

Regards, Tbayer (WMF) (talk) 05:31, 25 April 2015 (UTC)[reply]

Dropping English Wikipedia Sample Sizes and Adding the Swedish Wikipedia[edit]

In response to feedback from the community and suggestions to use a more cross-cultural approach (in addition to my personal cross-cultural interests), I'd love feedback on the use of a comparative Swedish Wikipedia sample. As Sweden is one of the most gender-egalitarian countries with regard to both cultural beliefs and current politics, this might expand the work to include comparisons of editors in both communities. In addition, Swedish Wikipedia has one of the largest article counts, which makes it a suitable comparison to data collected via the English Wikipedia. This inclusion would result in a decrease of sample sizes for the English Wikipedia (n=1,500 total), replaced by the Swedish sample (n=1,500 total). However, data would still need to be matched based on scaled-gender scores to ensure representation from both men and women. Cshanesimpson (talk) 13:50, 26 April 2015 (UTC)[reply]

Aggregated feedback from the committee for Characterization of Editors on Wikipedia[edit]

Scoring rubric
(A) Impact potential
  • Does it have the potential to increase gender diversity in Wikimedia projects, either in terms of content, contributors, or both?
  • Does it have the potential for online impact?
  • Can it be sustained, scaled, or adapted elsewhere after the grant ends?
(B) Community engagement
  • Does it have a specific target community and plan to engage it often?
  • Does it have community support?
(C) Ability to execute
  • Can the scope be accomplished in the proposed timeframe?
  • Is the budget realistic/efficient?
  • Do the participants have the necessary skills/experience?
(D) Measures of success
  • Are there both quantitative and qualitative measures of success?
  • Are they realistic?
  • Can they be measured?
Additional comments from the Committee:
  • Could increase understanding of a number of issues related to the identity of contributors (gender, age, education level, etc.); this approach in turn can broaden our understanding of the gender gap by taking a more holistic, intersectional approach to looking at the demographics of editors. Could help us understand more about contributors, identify barriers/challenges to contributing, and propose solutions
  • Appreciate that this grant addresses intersectionality and seeks to better understand the editing community across languages; less clear how this research will help improve the recruitment and retention of women though.
  • Not much community engagement outlined. Appreciate that the grantee acknowledges this and is receptive to / seeks suggestions for how to increase community participation. Essential that all of the survey results be shared with the community to further engagement, too.
  • The target audience seems large and vague, and there could be problems with translation if the group is not limited to a few languages. Would like to see more details about the type of people they plan to survey. Understand the desire to survey editors outside of just English wikis, but would then need to decide how one prioritizes which languages to translate to/from
  • Impressed by the thought the grantee has put into the proposal (best evidenced IMO by her conduct on the talk page) and happy to see Theredproject serving as the advisor.
  • Understanding the workings of the community before taking on such a large survey would be important. Suggest that the team needs more experienced Wikipedians (and wiki-researchers) directly involved.
  • The grantee needs to ensure recruitment/completion of the survey by female-identified editors and other gender minorities. This may be a challenge if, for example, using Wikipedia editor rankings to contact survey participants.
  • Would like to see more time spent developing the survey instrument and selecting the people to be surveyed. The time frame seems rushed.
  • Suggest removing cost of the conferences. Most advisors/universities pay for their students to attend top conferences in their fields.
  • Reduce the participant incentives - iPad minis may not make most sense.
  • Would like measures of success to be more precise.

Inspire funding decision[edit]

This project has not been selected for an Inspire Grant at this time.

We love that you took the chance to creatively improve the Wikimedia movement. The committee has reviewed this proposal and not recommended it for funding, but we hope you'll continue to engage in the program. Please drop by the IdeaLab to share and refine future ideas!

Comments regarding this decision:
Intersectionality is an important aspect to consider and we understand the rationale for a large sample size in order to be able to draw meaningful conclusions from this type of research. We also appreciate how proactively you've engaged in discussions so far. However, a survey of this magnitude will need extra careful vetting, planning and understanding of the Wikipedia community before proceeding. We encourage you to consider the feedback offered during this process and from other Wikipedia researchers you’ve been in touch with. As your thinking continues to develop, you’d be welcome to return in a future round with any adjustments or other ideas.

Next steps:

  1. Review the feedback provided on your proposal and ask for any clarifications you need using this talk page.
  2. Visit the IdeaLab to continue developing this idea and share any new ideas you may have.
  3. To reapply with this project in the future, please make updates based on the feedback provided in this round before resubmitting it for review in a new round.
  4. Check the Individual Engagement Grant schedule for the next open call to submit proposals or the Project and Event Grant pages if your idea is to support expenses for offline events - we look forward to helping you apply for a grant in the future.
Questions? Contact us at grants(_AT_)