Research:Ideas/First edits for male and female newcomers

From Meta, a Wikimedia project coordination wiki


This page documents a proposed research project.
Information may be incomplete and may change before the project starts.

Past work by Lam et al.[1] suggests that

  • male and female editors focus on different content areas (based on 8 broad areas, women are more likely to edit People and Arts while men are more likely to edit Geography and Science)
  • women are more likely to edit contentious articles (based on edit-protection of the article) than men
  • early edits by women are more likely to be reverted than early edits by men
  • women and men are equally likely to react to early reversion by leaving Wikipedia

Lam et al. concluded that the higher rate at which women leave Wikipedia after their early edits is linked to the higher likelihood that those edits are reverted, and not to women reacting more strongly than men to being reverted.

In this study, we would explore more deeply the characteristics of articles that men and women first edit to better understand why women's early edits are more likely to be reverted, as this appears to be closely linked to their leaving Wikipedia.

Hypothesis: Women's early edits are more likely to involve articles subject to a higher level of scrutiny

Not all articles on Wikipedia are subject to the same level of scrutiny. Articles such as biographies of living people, high-quality articles (e.g., featured/good articles), and articles with more active or watching editors are likely to carry higher expectations of edit quality, which may be difficult for a new editor to meet. As a concrete example at the intersection of People and Arts, “celebrity” magazines and TV shows are known to be popular with women, which may lead some women to attempt their first edits on “celebrity” Wikipedia articles. Those articles are subject to the stricter en:WP:BLP policies, so edits to them are more likely to be reverted.

Methodology

There are many article characteristics that could be studied. Given Lam's evidence that people are more likely to be a subject of interest to women, an initial experiment might compare the early edit histories of male/female/don't-know (DK) editors on articles directly or indirectly in en:Category:Living People against their early edits on articles outside that category. This would test whether women are disproportionately attracted to Living People articles, and the extent to which stricter BLP policies result in reversion of edits by men/women/DK. Ideally, the category test should be made on the version of the article at the time of the early edit under study. A quick-and-dirty experiment based on the category of the current version of the article (or of a particular snapshot in a dump) might suffice for a preliminary test. It would suffer from the limitation that articles can transition from BLP to non-BLP (e.g., when the subject dies) but not vice versa, so some articles that were BLPs at the time of the edit would be counted as non-BLP.
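The comparison described above could be tabulated roughly as follows. This is only a minimal sketch: the record layout `(gender, is_blp, reverted)`, the function names, and the toy rows are all assumptions made for illustration, not data or code from the project, and real records would have to be gathered from the database first.

```python
from collections import defaultdict

# Toy records, invented purely for illustration (NOT real data).
# Each tuple is (gender, is_blp, reverted) for one early edit.
TOY_EDITS = [
    ("female", True, True),
    ("female", True, False),
    ("female", False, False),
    ("male", True, True),
    ("male", False, False),
    ("male", False, False),
    ("male", False, True),
    ("unknown", False, False),
]


def revert_rates(edits):
    """Return {(gender, is_blp): (n_edits, n_reverted, revert_rate)}."""
    counts = defaultdict(lambda: [0, 0])
    for gender, is_blp, reverted in edits:
        cell = counts[(gender, is_blp)]
        cell[0] += 1              # number of early edits in this cell
        cell[1] += int(reverted)  # number of those edits that were reverted
    return {key: (n, r, r / n) for key, (n, r) in counts.items()}


def blp_share(edits):
    """Return {gender: fraction of that group's early edits made to BLP articles}."""
    totals = defaultdict(lambda: [0, 0])
    for gender, is_blp, _reverted in edits:
        totals[gender][0] += 1
        totals[gender][1] += int(is_blp)
    return {gender: blp / n for gender, (n, blp) in totals.items()}


if __name__ == "__main__":
    for key, (n, r, rate) in sorted(revert_rates(TOY_EDITS).items()):
        print(key, n, r, round(rate, 2))
    print(blp_share(TOY_EDITS))
```

Keeping the two tables separate mirrors the two questions in the proposal: `blp_share` addresses whether one group is disproportionately drawn to Living People articles, while `revert_rates` addresses whether reversion differs by group within BLP and non-BLP articles. A real analysis would also need a significance test, which this sketch omits.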

Limitation: Identifying the gender of newcomers

The difficulty with studying early edits is that new editors are less likely to self-identify as male/female at that time, so studies of the edit histories of self-identifying male/female editors may be distorted by survivorship bias.

Indicators of gender

We could get something like gender without relying on self-report; here are some things we could use instead.

  • Sorts of articles that editors edit
  • Grammar that editors use in edits
  • Genders listed on other web pages that seem to belong to the editor, such as Facebook profiles of identical handle
  • First name of the editor, if it's available somewhere, combined with some information about gender distributions by name.

To save time, I think we could ignore the actual gender and just use some statistic based on these instead.

But if we want to associate it with human-determined genders, we could compare ("validate") these indicators with gold-standard gender reports for a small pool of editors for which we are confident of the gender. For each user, we could thus come up with probabilities that ze is of any particular gender (a "probability distribution").

Tlevine (talk) 16:19, 2 September 2014 (UTC)
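
One way the per-editor "probability distribution" suggested above might be computed is to combine the indicators with Bayes' rule, treating them as independent given gender (a naive Bayes model). That independence assumption, the function name `gender_posterior`, the table layout, and every number below are illustrative assumptions rather than part of the proposal. In practice the likelihood tables would be estimated from the gold-standard pool of editors.

```python
def gender_posterior(prior, likelihoods, observed):
    """Combine indicators into a probability distribution over genders.

    prior:       {gender: P(gender)}
    likelihoods: {indicator: {value: {gender: P(value | gender)}}}
    observed:    {indicator: value} for one editor

    Assumes indicators are conditionally independent given gender.
    """
    scores = dict(prior)
    for indicator, value in observed.items():
        table = likelihoods[indicator][value]
        for gender in scores:
            scores[gender] *= table[gender]
    total = sum(scores.values())
    if total == 0:
        raise ValueError("observed indicators are impossible under every gender")
    # Normalise so the scores sum to 1.
    return {gender: score / total for gender, score in scores.items()}


# Made-up numbers for illustration only (NOT estimates from real editors).
TOY_PRIOR = {"female": 0.5, "male": 0.5}
TOY_LIKELIHOODS = {
    "first_name": {"alex": {"female": 0.4, "male": 0.6}},
    "topic": {"arts": {"female": 0.6, "male": 0.4}},
}

if __name__ == "__main__":
    print(gender_posterior(TOY_PRIOR, TOY_LIKELIHOODS, {"first_name": "alex"}))
```

Editors with no observed indicators simply keep the prior, which makes the uncertainty explicit instead of forcing a male/female label.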

Support needed

  • Someone to gather data from Labs database




References

  1. Lam, S. T. K., Uduwage, A., Dong, Z., Sen, S., Musicant, D. R., Terveen, L., & Riedl, J. (2011, October). WP:Clubhouse? An exploration of Wikipedia's gender imbalance. In Proceedings of the 7th International Symposium on Wikis and Open Collaboration (pp. 1-10). ACM.