Experiment and control groups
The experiment and control groups are defined in New editor retention:
- Control A: 192 new editors who met the requirements to be included on the Teahouse invitee report but whose usernames were not included on that report, who were not invited through any other forum, and who did not find their way to the Teahouse on their own.
- Control B: 200 new editors who were invited through the feedback dashboard, the Teahouse invitee reports and the new editor contribs filter, but did not ask a question on the Teahouse Q&A page or create an introduction box on the Teahouse/Guests page.
- Experimental: 190 new editors who were invited through the feedback dashboard, the Teahouse invitee reports and the new editor contribs filter and who did edit the Teahouse/Questions and/or Teahouse/Guests pages.
It seems Control B were the ones that were invited but chose not to participate in the Teahouse activity.
Is Control A randomly selected? I see those who found the Teahouse on their own (and probably those who were invited outside the experiment proper) are excluded. How many were these? What does the "Teahouse invitee report" stand for? Was it a list from which hosts chose people to invite, or a post facto list?
- (reply) Control A was a randomly selected sample from a larger set of editors whose usernames were intentionally excluded from the invitee report. As for people who found the Teahouse on their own, or were invited by editors, during the study period, the best answer I can give without digging deep into the data again is "not very many". The Teahouse Invitee report is a daily list of new accounts that meet a minimum number of edits in their first few days. More info here. Hosts chose which editors to invite. Jmorgan (WMF) (talk) 23:34, 16 October 2013 (UTC)
Do Experimental and Control B include all editors who were (randomly?) selected for the "Teahouse invitee report", or is there a dropout of some sort? Were people invited later (through the activity) included in the experimental group?
- (reply) Control B is a randomly-selected subsample of a larger set of people who were invited, but didn't show up. Experimental is the full set of people who were invited through the above channels and did show up--although since invites were manually tracked by editors for this analysis, this sample is probably under-reported. Jmorgan (WMF) (talk) 23:34, 16 October 2013 (UTC)
How did randomization of invitations through the new editor contribs filter work?
- (reply) A MySQL query of the form ORDER BY RAND() LIMIT [number]. Jmorgan (WMF) (talk) 23:34, 16 October 2013 (UTC)
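For readers unfamiliar with this kind of query, here is a minimal sketch of a random draw done by shuffling rows and keeping the first few. It is not the query used in the study. The table name `new_editors`, the column `user_name`, and the sample size are hypothetical, and it runs against an in-memory SQLite database, where the function is spelled `RANDOM()`; in MySQL the equivalent clause would be `ORDER BY RAND() LIMIT n`.

```python
import sqlite3


def sample_usernames(conn, n):
    """Return n usernames drawn at random, without replacement."""
    # Each row gets a random sort key; keeping the first n rows of the
    # shuffled result yields a random subsample of the table.
    rows = conn.execute(
        "SELECT user_name FROM new_editors ORDER BY RANDOM() LIMIT ?",
        (n,),
    ).fetchall()
    return [row[0] for row in rows]


# Hypothetical stand-in for the list of eligible new accounts.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE new_editors (user_name TEXT PRIMARY KEY)")
conn.executemany(
    "INSERT INTO new_editors VALUES (?)",
    [(f"Editor{i}",) for i in range(1000)],
)

sample = sample_usernames(conn, 200)
print(len(sample), len(set(sample)))
```

Because the shuffle happens before the limit is applied, every eligible account has the same chance of being drawn, and no account can appear twice.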
I am sorry if I missed the discussion of these three groups and how their selection influences the results.
- (reply) I'm happy you're asking! This was actually a preliminary analysis and the results are not certain. I did a more rigorous analysis during Phase 2 of the pilot, which was written up for publication here. The conference paper is probably the clearest presentation of the experimental design and results. Jmorgan (WMF) (talk) 23:34, 16 October 2013 (UTC)