Research talk:Editor Lifecycles

From Meta, a Wikimedia project coordination wiki

Some comments

First of all, let me say that I find these results quite interesting. Congrats, and keep up the good work. This is an important topic of study for which we need additional insights.

Now, some questions/comments to (hopefully) improve the analysis and the report:

  • New users and first years: I included a preliminary study of user retention in my PhD thesis. One comment I received at the time was that the results could be biased to some extent by very long idle periods between consecutive edits for "casual" users. A closer examination later revealed that, for the largest languages, a significant subpopulation of sporadic contributors comes back after 2 or more years of inactivity. Further tests of that time-to-event analysis showed that the bias introduced is not alarming, but I still think it would be sensible to take this into account, at least for the youngest cohorts (from 2011).
  • IMO, Figure 2 would be more meaningful if it displayed proportions (%) of the total number of users in each year, rather than absolute totals. Right now, the years of steady rise (2006 or 2007) clearly outnumber the previous years and could hide interesting patterns.
  • Model construction: is there any special motivation behind the choice of 750 seconds as an upper truncation threshold in your model?
  • In the same section: the model is quite interesting but, due to its inherent design, we still lack information about the time spent on edits in the Main namespace (since the model explains all other namespaces relative to it). I can also see some link between the interpretation of the model and the original question about user retention and lifecycle, but it may not be apparent to other readers in its current form.
  • New Users and First Months: I'm not 100% sure, but I would say the graph is plotted on a linear rather than logarithmic scale. In that case, the apparently flat shape of the years 2001-2004 (inclusive) may just be an effect of the much larger values in subsequent years. I suspect that these 4 years also show a clearly steady growth rate, though still at a preliminary stage.
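
To make the Figure 2 suggestion concrete, here is a minimal sketch of the normalization I have in mind: each year's counts are divided by that year's total, so small early cohorts and large later ones become comparable. The function name, the category labels ("retained", "left") and the numbers are all hypothetical, chosen only for illustration; they are not taken from the report.

```python
def to_proportions(counts_by_year):
    """Convert per-year category counts into percentages of that year's total."""
    result = {}
    for year, counts in counts_by_year.items():
        total = sum(counts.values())
        if total == 0:
            # Avoid division by zero for an empty year.
            result[year] = {category: 0.0 for category in counts}
        else:
            result[year] = {
                category: 100.0 * value / total
                for category, value in counts.items()
            }
    return result


# Hypothetical counts, for illustration only (not data from the report).
example = {
    2004: {"retained": 30, "left": 70},
    2007: {"retained": 2000, "left": 8000},
}
shares = to_proportions(example)
```

With absolute counts, 2007 would dwarf 2004 in the plot; as percentages, the two years sit on the same 0-100 axis and any difference in pattern becomes visible.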

HTH.--GlimmerPhoenix (talk) 17:47, 26 July 2012 (UTC)