Research talk:Ignored period and retention

From Meta, a Wikimedia project coordination wiki

Are you including discussions on talk pages other than the user's own? My experience is that some users have their first discussions on other editors' talk pages, article talk pages and other talk spaces. Remember that some "new" editors are experienced with MediaWiki software or even Wikimedia projects, whether through IP editing, other Wikimedia projects or other wikis like Wikia. WereSpielChequers 07:35, 9 June 2011 (UTC)

Thank you for your suggestion! That is a good point, which I had not thought about. I will include in my analysis the messages sent by new editors as well as those sent to new editors. --whym 17:40, 9 June 2011 (UTC)

Regression Analysis

Hi Yusuke,

I've done a very simple linear regression analysis where I estimate the number of edits you will make one year from entering the community (edits_1yl). The predictor variables are:

  • the duration between the first edit and first message (in days)
  • which year you joined the community
  • the number of edits you made in your first year

I am running the model without a constant.

Source              SS     df          MS      Number of obs =   4796
                                               F(3, 4793)    = 430.23
Model       1501876.76      3  500625.587      Prob > F      = 0.0000
Residual    5577276.24   4793  1163.62951      R-squared     = 0.2122
                                               Adj R-squared = 0.2117
Total          7079153   4796  1476.05359      Root MSE      = 34.112

edits_1yl       Coef.  Std. Err.      t   P>|t|   [95% Conf. Interval]

delta        .0094013   .0018819   5.00   0.000    .0057118   .0130908
year        -.0000553   .0002686  -0.21   0.837   -.0005819   .0004712
num_edits    .0638777   .0018391  34.73   0.000    .0602723   .0674831

Interpretation: The majority of your future productivity is explained by past productivity (num_edits). For every 16 edits that you make in t (the year you entered), you will make 1 edit in the subsequent year. Which year you entered does not matter for your productivity (the year coefficient is not significant, p = 0.837), and there is actually a positive relationship between your future productivity and the time between your first edit and first message.

Caveats: This is a first quick analysis, it's only a sample, and many important predictor variables are missing.
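The interpretation above can be checked with a little arithmetic on the reported numbers. The following is only a minimal sketch: the coefficients and standard errors are copied from the output, while the function names and the example editor are hypothetical, and it assumes `year` is coded as the calendar year number (e.g. 2008), which the thread does not state.

```python
# Coefficients and standard errors copied from the regression output above.
COEF = {"delta": 0.0094013, "year": -0.0000553, "num_edits": 0.0638777}
STD_ERR = {"delta": 0.0018819, "year": 0.0002686, "num_edits": 0.0018391}


def predict_edits_1yl(delta_days, year, num_edits):
    """Fitted value of the no-constant linear model (no intercept term)."""
    return (COEF["delta"] * delta_days
            + COEF["year"] * year          # assumption: year coded as e.g. 2008
            + COEF["num_edits"] * num_edits)


def t_statistic(name):
    """t = coefficient / standard error, as in the 't' column."""
    return COEF[name] / STD_ERR[name]


# "For every 16 edits ... 1 edit in the subsequent year" is 1 / coefficient.
edits_per_future_edit = 1 / COEF["num_edits"]

# Hypothetical editor: first message after 30 days, joined 2008, 100 edits.
example = predict_edits_1yl(delta_days=30, year=2008, num_edits=100)

print(round(edits_per_future_edit, 1))
print(round(example, 2))
for name in COEF:
    print(name, round(t_statistic(name), 2))
```

The ratio comes out a little under 16, which is where the "every 16 edits" figure comes from, and the recomputed t values agree with the table to within rounding.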

Hi, I think this is interesting. However, while reviewing my dataset and scripts, I found a mistake in calculating the num_edits (edits in the first year) column. The column was intended to hold the number of edits in the first 12 months after the editor started editing, but I mistakenly summed the number of edits made between January and December of the year in which the editor started editing. I'll update the dataset with the correct values and hand it to you via the WSOR's Dropbox. I'll be able to provide the full dataset (more than 40 MB) through some other channel, if you are interested. --whym 00:33, 15 June 2011 (UTC)
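To make the difference between the two definitions of num_edits concrete, here is a minimal sketch. It is not the script used for the dataset; the function names and the example timestamps are hypothetical.

```python
from datetime import datetime


def add_one_year(ts):
    """Return the same date one year later (29 Feb maps to 28 Feb)."""
    try:
        return ts.replace(year=ts.year + 1)
    except ValueError:
        return ts.replace(year=ts.year + 1, day=28)


def edits_in_calendar_year(timestamps):
    """The mistaken count: edits from January to December of the joining year."""
    first = min(timestamps)
    return sum(1 for t in timestamps if t.year == first.year)


def edits_in_first_12_months(timestamps):
    """The intended count: edits in the 12 months starting at the first edit."""
    first = min(timestamps)
    end = add_one_year(first)
    return sum(1 for t in timestamps if first <= t < end)


# Hypothetical editor who starts editing late in 2008.
edits = [
    datetime(2008, 11, 20),
    datetime(2008, 12, 5),
    datetime(2009, 3, 1),
    datetime(2009, 10, 30),
    datetime(2009, 12, 1),
]

print(edits_in_calendar_year(edits))    # 2
print(edits_in_first_12_months(edits))  # 4
```

The calendar-year count undercounts editors who join late in the year, so the size of the error depends on the month in which an editor made their first edit.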

bad v good edits

In July and August 2007 Cluebot and CorenSearchBot went live. Cluebot reverts vandalism and may merely have automated and slightly sped up what was happening already (though it has botlike efficiency at warning the vandals). But CorenSearchBot automates the finding of copyright violations, and before it went live I doubt we were anything like as quick at spotting them. So what we might be measuring is partly that we have become more efficient at templating problematic editors. Whether we are more efficient at retraining them and getting them to edit productively is an altogether different question. Just to complicate things further, remember that the ratio of good-faith to bad-faith newbies may be altering over time. Theoretically, levels of vandalism may be increasing in proportion to readership, spam may well also be rising in line with perceptions of Wikipedia's audience reach, and as total editing is stable or slightly declining, good-faith editing by newbies is almost certainly on the wane. WereSpielChequers 21:57, 15 June 2011 (UTC)

About your second point: in our (highly unscientific) preliminary study, Steven and I found that the percentage of good faith newbies has remained fairly consistent over the years. And, in terms of absolute numbers (due to the huge increase in new users overall since 2004), the number of good faith newbies in years when vandalism was a huge problem (~2007-now) was much greater than when vandalism was virtually nonexistent (2004-5). Buickmackane 23:01, 15 June 2011 (UTC)
Thanks, I don't read that graph quite the same way. I think it shows spam as very small but steadily growing, vandalism and sockpuppetry growing until a couple of years ago and then dropping back somewhat, and good faith starting at 100% in 2004, dropping below 70% in 2009 and then recovering to 75% in the present day. I think the decline in vandalism correlates with the arrival of improved edit filters, bot reversion of vandalism and possibly more school blocks, either by us or by the schools. WereSpielChequers 15:13, 20 June 2011 (UTC)

Single User Login

I think you might get a very different pattern according to whether this is an editor's home wiki. This is particularly an issue for EN wiki, which has substantial overlaps with editors on most if not all other projects. An active editor on French, Tamil or Afrikaans Wikipedia might visit EN wiki occasionally, not least because some things which should have migrated to Meta, such as the spam filter and the bot requests page, are still on EN wiki. Such editors might not consider themselves ignored, especially if their first edit was to create a user page saying who they are, which project they are visiting from, and even encouraging correspondents to leave messages on their talk page on their home wiki rather than EN wiki. WereSpielChequers 11:08, 30 June 2011 (UTC)

Negative ignored period.

10% of editors get a message on their talk page before the earliest edit of theirs that has not yet been deleted. Rather than exclude such editors, I would suggest extending the survey to deleted edits. I suspect the vast majority of these editors will be newbies whose first edit was to create a page that has subsequently been deleted. Another group will be alternate accounts and doppelgangers, where an experienced editor creates accounts in various slight variants of their name in order to prevent someone creating such accounts and using them to impersonate a known editor. Typically the first edit there will be from the main account saying "this is me when editing from insecure venues, or this is my account for bot editing". WereSpielChequers 11:41, 30 June 2011 (UTC)
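A minimal sketch of how a negative ignored period can arise and how including deleted edits changes it, assuming the ignored period is measured as the time from the first edit to the first talk page message, in days. The function name and the example timestamps are hypothetical, not taken from the study's scripts.

```python
from datetime import datetime


def ignored_period_days(edit_times, message_times):
    """Days from the earliest edit to the earliest talk page message.

    Returns None if the editor has no edits or no messages. A negative
    value means the first message predates the earliest edit supplied.
    """
    if not edit_times or not message_times:
        return None
    gap = min(message_times) - min(edit_times)
    return gap.total_seconds() / 86400


# Hypothetical newbie whose first edit created a page that was later deleted.
deleted_edits = [datetime(2011, 3, 1)]
surviving_edits = [datetime(2011, 3, 10)]
messages = [datetime(2011, 3, 2)]  # e.g. a deletion notice

# Surviving edits only: the message appears to arrive before any edit.
print(ignored_period_days(surviving_edits, messages))                  # -8.0

# Deleted edits included: the period becomes positive again.
print(ignored_period_days(surviving_edits + deleted_edits, messages))  # 1.0
```

Alternate accounts whose first edit comes from the main account would still need separate handling, since their talk page message does not depend on any edit made by the account itself.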