Are you including discussions on talk pages other than those of the user? My experience is that some users have their first discussions on other editors' talk pages, article talk pages and other talk spaces. Remember that some "new" editors are experienced with MediaWiki software or even Wikimedia projects, either through IP editing, other Wikimedia projects or other wikis like Wikia. WereSpielChequers 07:35, 9 June 2011 (UTC)

Thank you for your suggestion! That is a good point, which I had not thought about. I will include in my analysis the messages sent by new editors as well as those sent to new editors. --whym 17:40, 9 June 2011 (UTC)

## Regression Analysis

Hi Yusuke,

I've done a very simple linear regression analysis where I estimate the number of edits you will make one year from entering the community (edits_1yl). The predictor variables are:

• the duration between the first edit and the first message, in days (delta)
• the year in which you joined the community (year)
• the number of edits you made in the year you joined (num_edits)

I am running the model without a constant.

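```
      Source |         SS     df          MS       Number of obs =    4796
-------------+--------------------------------     F(  3,  4793) =  430.23
       Model |  1501876.76      3  500625.587      Prob > F      =  0.0000
    Residual |  5577276.24   4793  1163.62951      R-squared     =  0.2122
-------------+--------------------------------
       Total |     7079153   4796  1476.05359      Root MSE      =  34.112

------------------------------------------------------------------------------
   edits_1yl |      Coef.   Std. Err.       t    P>|t|    [95% Conf. Interval]
-------------+----------------------------------------------------------------
       delta |   .0094013   .0018819     5.00    0.000    .0057118    .0130908
        year |  -.0000553   .0002686    -0.21    0.837   -.0005819    .0004712
   num_edits |   .0638777   .0018391    34.73    0.000    .0602723    .0674831
------------------------------------------------------------------------------
```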
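For readers who want to try a model of the same shape, here is a minimal sketch of ordinary least squares without a constant, using only the Python standard library. It is not the script that produced the table above. The function names (`solve_linear_system`, `fit_no_constant`) and the synthetic data are assumptions made purely for illustration. R-squared is computed against the uncentered total sum of squares, which matches the table, where the total degrees of freedom equal the number of observations.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_no_constant(rows, y):
    """OLS without an intercept: solve the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    rss = sum(
        (yi - sum(b * v for b, v in zip(beta, r))) ** 2 for r, yi in zip(rows, y)
    )
    tss = sum(yi ** 2 for yi in y)  # uncentered, because there is no constant
    return beta, 1.0 - rss / tss


# Synthetic data for illustration only; columns are (delta, year, num_edits).
random.seed(0)
true_beta = [0.01, 0.0, 0.06]
rows = [
    [random.uniform(0, 365), float(random.randint(2004, 2010)),
     float(random.randint(0, 200))]
    for _ in range(500)
]
y = [sum(b * v for b, v in zip(true_beta, r)) + random.gauss(0, 1) for r in rows]

beta, r_squared = fit_no_constant(rows, y)
print(beta, r_squared)
```

On the synthetic data the estimated coefficients should land close to `true_beta`; with real data a statistics package would also report the standard errors and confidence intervals shown in the table.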

Interpretation: Past productivity (num_edits) is by far the strongest predictor of your future productivity. For every 16 edits that you make in year t (the year you joined), you will make about 1 edit in the subsequent year. The year in which you joined does not matter for your productivity, and there is actually a positive relationship between your future productivity and the time between your first edit and first message.
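The "16 edits" figure can be checked directly from the coefficient table. The short sketch below only restates the reported coefficients as arithmetic; the variable names are chosen for illustration, and the 30-day figure is simply the delta coefficient scaled up, not an additional result.

```python
# Coefficients and confidence bounds copied from the regression table.
coef_num_edits = 0.0638777
ci_num_edits = (0.0602723, 0.0674831)
coef_delta = 0.0094013

# First-year edits associated with one predicted edit in the following year.
edits_per_future_edit = 1 / coef_num_edits

# The same ratio at the two ends of the 95% confidence interval.
ratio_low = 1 / ci_num_edits[1]
ratio_high = 1 / ci_num_edits[0]

# Predicted extra edits for 30 more days between first edit and first message.
extra_edits_per_30_days = 30 * coef_delta

print(round(edits_per_future_edit, 2), round(extra_edits_per_30_days, 3))
```

So the point estimate is a little under 16 first-year edits per predicted edit, and the effect of delta, while statistically significant, is small in absolute terms.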

Caveats: This is a first quick analysis, it is based only on a sample, and many important predictor variables are missing.

Hi, I think this is interesting. However, while reviewing my dataset and scripts, I found a mistake in calculating the `num_edits` (edits in the first year) column. The column was intended to hold the number of edits in the first 12 months after the editor started editing, but I mistakenly summed the number of edits made between January and December of the year in which the editor started editing. I'll update the dataset with the correct values and hand it to you via the WSOR's Dropbox. I can also provide the full dataset (more than 40 MB) through some other channel, if you are interested. --whym 00:33, 15 June 2011 (UTC)
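To make the difference between the two definitions concrete, here is a small sketch. It is not the actual script used for the dataset; the function names and the sample edit dates are hypothetical, and it assumes each editor's edits are available as a non-empty list of dates.

```python
from datetime import date


def add_one_year(d):
    """Return the same calendar day one year later (29 Feb maps to 28 Feb)."""
    try:
        return d.replace(year=d.year + 1)
    except ValueError:
        return d.replace(year=d.year + 1, day=28)


def edits_in_first_12_months(edit_dates):
    """Intended definition: edits in the 12 months starting at the first edit."""
    first = min(edit_dates)
    end = add_one_year(first)
    return sum(1 for d in edit_dates if first <= d < end)


def edits_in_calendar_year_of_first_edit(edit_dates):
    """Mistaken definition: edits between January and December of the first year."""
    first = min(edit_dates)
    return sum(1 for d in edit_dates if d.year == first.year)


# Hypothetical editor who started editing late in 2008.
edits = [
    date(2008, 11, 3), date(2008, 12, 20), date(2009, 2, 14),
    date(2009, 10, 30), date(2009, 11, 3), date(2010, 1, 5),
]

print(edits_in_first_12_months(edits))              # 4
print(edits_in_calendar_year_of_first_edit(edits))  # 2
```

The calendar-year count undercounts editors who join late in the year, which is why the two columns can differ substantially for the same editor.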