RCom Review of "Understanding the Editor/Bot Relationship"

First, as for the aims of the study, I find them attractive from a scientific point of view and certainly important for a better understanding of Wikipedia's development and the way our editors work. At the time of writing these comments and suggestions, I am not aware of any similar studies. I encourage Randall Livingstone to provide links here to previous similar work, in case any can be found in the relevant journals, books, monographs or online.

Greetings Goran, and thank you very much for the review. Your comments and questions are helpful. In terms of previous studies on Wikipedia bot operators, I know that R. Stuart Geiger at UC Berkeley has done some work in this area, and his studies offer the foundation for mine. UOJComm 20:25, 27 September 2011 (UTC)

Research Ethics & Study Material

As this research project already complies with standard protocols in the social sciences (IRB approval was received from the University of Oregon, protocol #08262011.107, 8/30/2011, as Randall Livingstone already kindly informed us), I will assume that only Wikimedia-specific norms need to be addressed. From the project description I do not see any potential problems here, but I would advise the researcher to send us at least a preliminary version of the study questionnaire (if the study will take the form of a systematic, standardized questionnaire) or an interview guide (if the study will use a non-standardized interview approach).

Study Sample

The project description lists three convenience samples, but we are really talking about one study sample collected in three ways. As for the recruitment procedures and whether they yield an adequate sample for this study, if they are fine with the principal researcher, they are fine with me too.

I think we need to learn about the planned sample size. Since it seems we are talking about unstructured interviews here (not standardized questionnaires, and thus, I guess, no statistical estimation for any indicator), the sample size becomes a question of the researcher's own judgment. No prior advice can be given in this respect. Again, it would be good to know the planned sample size in order to compare this study with previous similar efforts (if any are available; I didn't have the time to look for similar studies, but I will do this very soon; if Randall finds some time for this, he could help me by pointing to similar past studies that could be relevant for this one).

I am not sure if previous studies have actually "talked" to editors and programmers about their work, so my sample size is based on what is logistically reasonable for me to conduct. Of course, as with any interviewing and social science research, data collection can be stopped once saturation is reached (i.e. interview answers stop presenting new information or points of view). UOJComm 20:25, 27 September 2011 (UTC)

Recruitment Method

As for the first convenience sample: "Editors may also be recruited through a limited number of posts on appropriate Wikipedia: Meetup Discussion pages."

I am fine with this; please note that the following advice is listed on Wikipedia:Meetup: "Don't manually post notices to all the editors who are listed in a geographical category".

Now, as for the second convenience sample: "... will be drawn from all Wikipedia editors involved in creating bots or assisted editing tools on the site (approximately 1,500), as well as past and present members of the Wikipedia: Bots Approval Group (approximately 50). Subjects will be solicited through a message on appropriate Discussion pages and a one-time invitation on their user talk page or via email if enabled on the user space (if an email is sent, only a "you've got mail" template will be posted to the user talk page). Ideally, interviews will be conducted with 5-10% of this population."

5-10% of a population of 1,500 is between 75 and 150 interviewees. But this is the planned, final sample size for this subsample, I guess, which means that many more than 150 e-mails will be sent to those who have e-mail enabled on their talk pages. I am not sure whether the recruitment here should go for so many e-mails; maybe leaving one-time invitations on user talk pages would do the work. *** RCom members, please comment here - I am not really sure about the conclusion I am reaching here ***

In my previous research, I used the method of both emailing editors and then leaving a post on their talk pages. That study was for a smaller population, though (~350), so just leaving a one-time invitation on user talk pages for editors and programmers would likely produce a large enough sample. I would be happy to make that change to the recruitment process. UOJComm 20:25, 27 September 2011 (UTC)
It would be very useful if you somehow systematized the lessons you have drawn from the recruitment methods (even just as notes) and added them to meta: Research (other RCom members, is there a specific place for that?), so that others can have better expectations based on your experience. --Lilaroja 12:12, 28 September 2011 (UTC)
Yes, great idea Lilaroja. This is my second project working with the RC on recruitment, so I could definitely share some of my experiences (on whatever page is most appropriate). UOJComm 02:59, 30 September 2011 (UTC)


"The study will involve both online and in-person interviews. Online interviews will be conducted using Skype video conferencing, Google Chat instant messaging, or an email client. Data will be collected, encrypted, and saved to my personal computer. Data from in-person interviews will be recorded using a digital audio recorder, then transferred, encrypted, and saved to my personal computer."

OK, as for privacy concerns, the only question that comes to mind is: how will the private data collected be anonymized so that it can be released publicly? I want to note again that it would be important for us to get insight into at least the preliminary questionnaire/interview guide when it is ready.

As this is a qualitative, social science study, I explain to potential interviewees in my invitation and consent procedures that the work is for public display in the form of publications, conference presentations, or reports for the Wikipedia community. I suggest that editors use their username, but I also offer the option of creating a pseudonym for the study. If the RC or others have additional ideas on how I should protect privacy, I welcome the suggestions.
My questions guide is available (I just need to figure out how to upload it here). UOJComm 20:25, 27 September 2011 (UTC)
There are Wikipedians who do not use Skype. Perhaps you might also consider IRC or phone lines. Cheers! --Lilaroja 12:10, 28 September 2011 (UTC)
Another good suggestion. Really, I am willing to conduct the interviews using whatever technology the subject prefers...even in person! UOJComm 03:01, 30 September 2011 (UTC)

Hi Randall, thank you for sending your questionnaires. I have reviewed all questions and I see no problem there; truly, I did not expect any since the research is already approved by the relevant university's IRB (University of Oregon, protocol #08262011.107, 8/30/2011). I will ask other RCom members to take a quick look so that you can be ready to start recruiting your subjects at the upcoming WikiSym 2011. Good luck with your research! Goran S. Milovanovic 08:57, 28 September 2011 (UTC)


