CivilServant Initial Data Analysis For Community Outreach

As part of CivilServant's work with Wikipedians in languages other than English to test machine-learning-based support for newcomers and gratitude systems, we are starting with conversations with Wikipedians who can introduce us to the culture and needs of different language Wikipedias. Because we knew that not every language Wikipedia would be large enough for the kind of A/B tests we are able to support, Julia, Max, and Nathan did some early analysis to estimate the chance that we would be able to work with a given language. This page documents our process and our results as of July 2018.

Requirements for a Language Wikipedia to Be Part of a Study

As CivilServant reaches out to language Wikipedias about possible research, we want to respect everyone's time by talking primarily to communities that meet the basic conditions for conducting an experiment. At a minimum, this means:

Newcomer study:
- The ORES machine learning system is available for that language
- The community has enough newcomer editors per month for a newcomer-welcoming experiment to be viable
- ORES identifies enough damaging and goodfaith edits from newcomers to be useful
Gratitude study:
- Thanks, Love, or both are available for that language
- The community has enough editors who haven't yet sent or received thanks & love

Collecting an Initial List of Wikipedias

To check these requirements, we reviewed language Wikipedias for the presence of the needed features. Here is what we compiled in late May 2018.

Initial Set of Language Wikipedias We Considered
Code	Language	ORES goodfaith	Thanks	Love
es	Spanish	yes	yes	yes
pt	Portuguese	yes	yes	yes
fa	Persian	yes	yes	yes
he	Hebrew	yes	yes	yes
ar	Arabic	yes	yes	yes
sv	Swedish	yes	yes	yes
hu	Hungarian	yes	yes	yes
tr	Turkish	yes	yes	yes
fr	French	yes	yes	no
ru	Russian	yes	yes	no
pl	Polish	yes	yes	no
nl	Dutch	yes	yes	no
cs	Czech	yes	yes	no
fi	Finnish	yes	yes	no
ro	Romanian	yes	yes	no
et	Estonian	yes	yes	no
uk	Ukrainian	no	yes	yes
ko	Korean	no	yes	yes
vi	Vietnamese	no	yes	yes
no	Norwegian	no	yes	yes
bn	Bengali	no	yes	yes
de	German	no	yes	no
it	Italian	no	yes	no
id	Indonesian	no	yes	no
el	Greek	no	yes	no
hr	Croatian	no	yes	no
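
As a rough illustration of how the table above maps onto the two sets of requirements, here is a minimal Python sketch. The feature flags are copied from the table; the names `FEATURES`, `newcomer_candidates`, and `gratitude_candidates` are ours for illustration and do not come from the CivilServant codebase. It covers only feature availability, not the editor-count requirements.

```python
# Feature availability copied from the table above:
# (language code, ORES goodfaith, Thanks, Love)
FEATURES = [
    ("es", True, True, True), ("pt", True, True, True),
    ("fa", True, True, True), ("he", True, True, True),
    ("ar", True, True, True), ("sv", True, True, True),
    ("hu", True, True, True), ("tr", True, True, True),
    ("fr", True, True, False), ("ru", True, True, False),
    ("pl", True, True, False), ("nl", True, True, False),
    ("cs", True, True, False), ("fi", True, True, False),
    ("ro", True, True, False), ("et", True, True, False),
    ("uk", False, True, True), ("ko", False, True, True),
    ("vi", False, True, True), ("no", False, True, True),
    ("bn", False, True, True), ("de", False, True, False),
    ("it", False, True, False), ("id", False, True, False),
    ("el", False, True, False), ("hr", False, True, False),
]


def newcomer_candidates(rows):
    """Languages that meet the first newcomer-study requirement (ORES available)."""
    return [code for code, ores, thanks, love in rows if ores]


def gratitude_candidates(rows):
    """Languages where Thanks, Love, or both are available."""
    return [code for code, ores, thanks, love in rows if thanks or love]


if __name__ == "__main__":
    print("Newcomer study candidates:", newcomer_candidates(FEATURES))
    print("Gratitude study candidates:", gratitude_candidates(FEATURES))
```

Because every language in this initial list has Thanks, the gratitude filter keeps all of them; the ORES filter is what narrows the list for the newcomer study.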

Evaluating Statistical Power for Experiments with ORES, Thanks, and Love

Working from the subset of languages that included support for ORES goodfaith scores, we collected data about newcomers in November 2017 and the subsequent six months. We used that data for power calculations to estimate, however imperfectly, the chance of observing a statistically significant result from possible experiments over a six- to eight-month period this coming year. We based our power analyses on prior research involving welcoming newcomers^[1] and rewarding editors.^[2]^[3] Because we expect that communities will be co-developing experiment designs with us, we see these estimates as an indication of whether communities will have the flexibility they need to imagine something that works for them.
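
To give a sense of what a power calculation of this kind involves, here is a minimal Python sketch using the normal approximation for a two-sided comparison of two proportions (for example, the share of newcomers retained in a control arm versus a treatment arm). This is not the code we used: our own analyses are in the R scripts and the EGAP calculator listed below. The function name and all input values are assumptions chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_power(p_control, p_treatment, n_per_arm, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two proportions.

    Uses the normal approximation with an unpooled standard error and
    assumes equal-sized arms. Illustrative only.
    """
    norm = NormalDist()
    # Standard error of the difference in proportions
    se = sqrt(
        p_control * (1 - p_control) / n_per_arm
        + p_treatment * (1 - p_treatment) / n_per_arm
    )
    z_crit = norm.inv_cdf(1 - alpha / 2)          # two-sided critical value
    z_effect = abs(p_treatment - p_control) / se  # standardized true effect
    # Probability that the test statistic lands in either rejection region
    return norm.cdf(z_effect - z_crit) + norm.cdf(-z_effect - z_crit)


if __name__ == "__main__":
    # Hypothetical inputs, NOT taken from our data: a 10% baseline rate,
    # a 12% rate under treatment, and several possible arm sizes.
    for n in (500, 2000, 5000):
        print(n, round(two_proportion_power(0.10, 0.12, n), 3))
```

The number of newcomers a language Wikipedia receives over the study period bounds `n_per_arm`, which is why smaller communities are less likely to reach adequate power for the same expected effect.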

We're still consolidating our codebases, so you can see our code in the following places for now:

Data collection & preparation:
- ORES:
  - Querying Newcomers from Wikipedia Replicas (python)
  - Merging Newcomer Dataframes with ORES Scores (python)
- Gratitude:
  - Creating historical dataframes of thanks & love from multiple language Wikipedias (python)
Analysis:
- ORES:
  - Conduct prototype power analysis for Snuggle Outreach, per language (R)
- Gratitude:
  - We manually conducted power analyses using Alexander Coppock's power calculator at EGAP

Initial Language Wikipedias for Outreach

Based on this early data analysis, we developed a simple heuristic for our confidence that a study of a given type with a given language would have enough statistical power, rating each as high, medium, low, or unlikely.

Results of Power Analyses
Code	Language	ORES Confidence	Gratitude Confidence	Notes
es	Spanish	High	High
fr	French	High	High	Does not have Love currently
ru	Russian	High	High	Does not have Love currently
pt	Portuguese	Medium	High
pl	Polish	Medium	High	Does not have Love currently
ar	Arabic	Low	Low
fa	Persian	Low	Low
nl	Dutch	Unlikely	Low	Does not have Love currently
cs	Czech	Unlikely	Low	Does not have Love currently
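
This page does not record the cut-offs behind the four ratings. As a sketch of how a heuristic of this shape could be expressed, the Python below maps an estimated power value to a label. The thresholds (0.8, 0.6, 0.3) and the function name are hypothetical placeholders, not the values we used.

```python
# Hypothetical cut-offs for illustration only; the actual heuristic's
# thresholds are not documented on this page.
DEFAULT_CUTOFFS = ((0.8, "High"), (0.6, "Medium"), (0.3, "Low"))


def confidence_label(power, cutoffs=DEFAULT_CUTOFFS):
    """Map an estimated statistical power (0 to 1) to a confidence rating."""
    for threshold, label in cutoffs:
        if power >= threshold:
            return label
    return "Unlikely"


if __name__ == "__main__":
    for estimate in (0.92, 0.65, 0.40, 0.10):
        print(estimate, confidence_label(estimate))
```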

If Your Language Is Not on This List

If your language Wikipedia is not on this list, it may still be possible to work with us if enough people in your community are eager to try community-led experiments. If your language receives ORES support in the next few months, we may be able to include you. Also, if a smaller language Wikipedia is planning a substantial campaign or expects a large influx of newcomers for a predictable reason in late 2018 or early 2019, we may be able to include you. Just drop us a note!

References

[1] Morgan, J., & Halfaker, A. (2018). Evaluating the Impact of the Wikipedia Teahouse on Newcomer Retention. (preprint)
[2] Restivo, M., & van de Rijt, A. (2012). Experimental study of informal rewards in peer production. PLoS ONE, 7(3), e34358.
[3] Restivo, M., & van de Rijt, A. (2014). No praise without effort: experimental evidence on how rewards affect Wikipedia's contributor community. Information, Communication & Society, 17(4), 451-462.
