Research talk:Teahouse long term new editor retention/Work log/2015-10-20
Tuesday, October 20, 2015
Today I'm extending my analysis to include multivariate regressions. I'll be using a logistic regression to look for differences in the proportion of surviving newcomers. I'll be including the pre-invite statistics I worked on last time to control for random variation around the effect of invitation. I expect the effect of invitation to become more prominent after controlling for this variation. I also expect to see some interactions between the invitation condition and either initial investment (# of edits pre-invite) or negative feedback. A positive interaction would suggest that newcomers who are highly invested or who get negative feedback gain more "survivalness" from the invite.
Checking for bucketing bias
The following list of Wilcoxon and Chi^2 tests checks for significant differences in the pre-invite predictors between conditions. Scalars are summarized by their quantiles (0%, 25%, 50%, 75% and 100%), logicals by their proportion.
- edits control=5-6-7-11-215 invited=5-6-7-12-241 W=18158354.5 p=0.597
- main_edits control=0-4-6-9-212 invited=0-4-6-9-241 W=18327891 p=0.181
- talk_edits control=0-0-0-0-35 invited=0-0-0-0-37 W=17914856.5 p=0.188
- user_edits control=0-0-0-1-74 invited=0-0-0-1-232 W=17823485.5 p=0.163
- user_talk_edits control=0-0-0-0-24 invited=0-0-0-0-90 W=17958249 p=0.444
- wp_edits control=0-0-0-0-19 invited=0-0-0-0-79 W=18091557 p=0.581
- other_edits control=0-0-0-0-96 invited=0-0-0-0-125 W=18104004.5 p=0.552
- vandal_warning control=0.121 invited=0.123 X-squared=0.153 p=0.696
- spam_warning control=0.027 invited=0.026 X-squared=0.081 p=0.776
- copyright_warning control=0.003 invited=0.003 X-squared=0.304 p=0.582
- general_warning control=0.221 invited=0.214 X-squared=0.57 p=0.45
- block control=0.001 invited=0.002 X-squared=0.729 p=0.393
- welcome control=0 invited=0 X-squared=0 p=1
- csd control=0.035 invited=0.031 X-squared=1.234 p=0.267
- deletion control=0.051 invited=0.047 X-squared=0.696 p=0.404
- afc control=0 invited=0 X-squared=NaN p=NaN
- teahouse control=0 invited=0 X-squared=NaN p=NaN
No significant differences here.
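For reference, here is a minimal sketch of how a Chi^2 row like the ones above can be computed from raw counts, using only the Python standard library. It assumes a 2x2 test with Yates' continuity correction (the default for 2x2 tables in R's chisq.test; whether that default was used here is an assumption). The function name `chi2_2x2` and the example counts are hypothetical and are not taken from this dataset.

```python
import math

def chi2_2x2(hits_a, n_a, hits_b, n_b, yates=True):
    """Chi^2 test (1 df) comparing the proportions hits_a/n_a and hits_b/n_b.

    Returns (statistic, p_value). Returns (nan, nan) when a row or column
    total is zero, which is how a predictor that is FALSE for everyone
    (like 'afc' or 'teahouse' above) ends up as NaN.
    """
    a, b = hits_a, n_a - hits_a   # group A: TRUE / FALSE
    c, d = hits_b, n_b - hits_b   # group B: TRUE / FALSE
    n = n_a + n_b
    denom = n_a * n_b * (a + c) * (b + d)
    if denom == 0:
        return float("nan"), float("nan")
    diff = abs(a * d - b * c)
    if yates:
        # Continuity correction, floored at zero.
        diff = max(0.0, diff - n / 2)
    stat = n * diff ** 2 / denom
    # Survival function of a chi-square with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Hypothetical counts: 30 of 100 flagged in one bucket, 10 of 100 in the other.
stat, p = chi2_2x2(30, 100, 10, 100)
print(round(stat, 3), round(p, 5))
```

When the two proportions are identical the statistic is 0 and p is 1; small differences on large buckets land in the non-significant range reported above.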
Predicting 1+ edits
Now to build some logistic models that account for these pre-invite predictors.
- 3 to 4 weeks
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -3.40157 0.28915 -11.764 < 2e-16 ***
grpinvited -0.45260 0.31556 -1.434 0.151491
log(edits + 1) 0.36207 0.12951 2.796 0.005181 **
log(main_edits + 1) 0.09796 0.04883 2.006 0.044832 *
log(talk_edits + 1) 0.19515 0.07423 2.629 0.008563 **
log(user_edits + 1) 0.02817 0.04907 0.574 0.565918
log(user_talk_edits + 1) 0.22418 0.06657 3.368 0.000758 ***
log(wp_edits + 1) 0.13143 0.09053 1.452 0.146561
general_warningTRUE -0.52963 0.19096 -2.773 0.005547 **
csdTRUE -1.52286 0.72156 -2.111 0.034813 *
deletionTRUE -0.41958 0.37541 -1.118 0.263717
grpinvited:log(edits + 1) 0.23479 0.12669 1.853 0.063855 .
grpinvited:general_warningTRUE 0.04394 0.21149 0.208 0.835434
grpinvited:csdTRUE 1.09061 0.75507 1.444 0.148633
grpinvited:deletionTRUE -0.21884 0.42406 -0.516 0.605816
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 8869.9 on 14765 degrees of freedom
Residual deviance: 8564.5 on 14751 degrees of freedom
AIC: 8594.5
Number of Fisher Scoring iterations: 6
First, the obvious effects. We see the usual suspects here. The more edits you do -- overall, but especially talking -- the more likely you are to be retained. We also see some substantially negative effects of warning messages and CSD notifications.
We see a negative effect of invitation here, but it looks like the grpinvited:log(edits + 1) interaction counteracts it for editors who had saved about 6 edits or more (log(x+1)≈2, x≈6) when the invite was posted. For any editor who saved more than 6 edits (highly motivated), it looks like the invite might be substantially improving retention, scaling with how much editing they are doing. But the interaction remains non-significant (marginal at p = 0.064).
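To make the break-even point concrete, here is a small worked calculation in Python. It is only a sketch: it plugs in the grpinvited and grpinvited:log(edits + 1) estimates from the table above, assumes natural logs (R's default for log()), and ignores the warning/csd/deletion interactions. The names `B_INVITED`, `B_INVITED_X_EDITS` and `invite_log_odds_shift` are mine.

```python
import math

# Estimates copied from the 3 to 4 weeks (1+ edits) model above.
B_INVITED = -0.45260          # grpinvited
B_INVITED_X_EDITS = 0.23479   # grpinvited:log(edits + 1)

def invite_log_odds_shift(edits):
    """Net change in the log-odds of survival from being invited,
    for a newcomer with `edits` pre-invite edits and no warnings."""
    return B_INVITED + B_INVITED_X_EDITS * math.log(edits + 1)

# The shift is zero where log(edits + 1) = -B_INVITED / B_INVITED_X_EDITS.
break_even_log = -B_INVITED / B_INVITED_X_EDITS
break_even_edits = math.exp(break_even_log) - 1

print(f"break-even at log(edits + 1) = {break_even_log:.2f}, "
      f"edits = {break_even_edits:.1f}")

# Net shift at a few pre-invite edit counts (5 is the sample minimum).
for edits in (5, 7, 12, 50):
    print(edits, round(invite_log_odds_shift(edits), 3))
```

This puts the break-even at log(edits + 1) of about 1.93, or just under 6 edits. Since the minimum pre-invite edit count in both buckets is 5, the negative grpinvited main effect describes a point (zero edits) that lies outside the observed data.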
Counter to my suspicions, I don't think we're seeing solid evidence of an interaction between being invited to the teahouse and surviving despite negative feedback (csd & warning). It could be that this is due to too few observations.
Just for the sake of making sure that my previous analysis wasn't totally off, let's try the model with just the invite as a predictor.
- 3 to 4 weeks (single predictor)
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -2.44393 0.06633 -36.843 <2e-16 ***
grpinvited 0.14830 0.07369 2.012 0.0442 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 8869.9 on 14765 degrees of freedom
Residual deviance: 8865.7 on 14764 degrees of freedom
AIC: 8869.7
Number of Fisher Scoring iterations: 5
Sure enough. Getting the invite seems to look significant on its own. OK! Now to try the long-term retention outcomes.
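For a sense of scale, the single-predictor coefficients can be converted back into survival proportions with the inverse logit. This is a minimal sketch using only the two estimates from the table above; the variable names are mine.

```python
import math

def inv_logit(x):
    """Map a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

# Estimates copied from the single-predictor 3 to 4 weeks model above.
INTERCEPT = -2.44393   # log-odds of survival in the control bucket
B_INVITED = 0.14830    # change in log-odds for the invited bucket

p_control = inv_logit(INTERCEPT)
p_invited = inv_logit(INTERCEPT + B_INVITED)
odds_ratio = math.exp(B_INVITED)

print(f"control: {p_control:.3f}  invited: {p_invited:.3f}  "
      f"odds ratio: {odds_ratio:.2f}")
```

That works out to roughly 8.0% survival in the control bucket versus 9.1% in the invited bucket, an odds ratio of about 1.16.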
- 1 to 2 months
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -3.38622 0.24595 -13.768 < 2e-16 ***
grpinvited 0.06769 0.27003 0.251 0.802051
log(edits + 1) 0.50057 0.11032 4.537 5.7e-06 ***
log(main_edits + 1) 0.14552 0.04364 3.335 0.000854 ***
log(talk_edits + 1) 0.17592 0.06710 2.622 0.008744 **
log(user_edits + 1) 0.05333 0.04358 1.224 0.220999
log(user_talk_edits + 1) 0.11024 0.06179 1.784 0.074370 .
log(wp_edits + 1) 0.07868 0.08374 0.940 0.347399
general_warningTRUE -0.50829 0.15991 -3.179 0.001480 **
csdTRUE -0.95765 0.46844 -2.044 0.040918 *
deletionTRUE -0.76776 0.35457 -2.165 0.030360 *
grpinvited:log(edits + 1) 0.01426 0.10815 0.132 0.895065
grpinvited:general_warningTRUE -0.10018 0.17865 -0.561 0.574961
grpinvited:csdTRUE 0.57811 0.50508 1.145 0.252384
grpinvited:deletionTRUE 0.23804 0.39010 0.610 0.541725
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 11229 on 14765 degrees of freedom
Residual deviance: 10845 on 14751 degrees of freedom
AIC: 10875
Number of Fisher Scoring iterations: 5
- 2 to 6 months
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -3.19216 0.23718 -13.459 < 2e-16 ***
grpinvited 0.08069 0.26070 0.310 0.75694
log(edits + 1) 0.43819 0.10755 4.074 4.61e-05 ***
log(main_edits + 1) 0.18349 0.04316 4.252 2.12e-05 ***
log(talk_edits + 1) 0.21654 0.06491 3.336 0.00085 ***
log(user_edits + 1) 0.03663 0.04303 0.851 0.39460
log(user_talk_edits + 1) 0.05446 0.06177 0.882 0.37797
log(wp_edits + 1) 0.11562 0.08142 1.420 0.15563
general_warningTRUE -0.53228 0.15243 -3.492 0.00048 ***
csdTRUE -0.55369 0.37929 -1.460 0.14435
deletionTRUE -0.69683 0.32396 -2.151 0.03148 *
grpinvited:log(edits + 1) 0.01059 0.10513 0.101 0.91977
grpinvited:general_warningTRUE -0.10878 0.17073 -0.637 0.52403
grpinvited:csdTRUE 0.04731 0.42470 0.111 0.91130
grpinvited:deletionTRUE 0.14502 0.36032 0.402 0.68732
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 11889 on 14765 degrees of freedom
Residual deviance: 11496 on 14751 degrees of freedom
AIC: 11526
Number of Fisher Scoring iterations: 5
Similar story here, but the effect of the invite isn't even marginally significant. On to the 5+ measures.
Predicting 5+ edits
Same story as above, except that survival only counts when there are 5+ edits in the survival period.
- 3 to 4 weeks
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -4.65842 0.39320 -11.847 < 2e-16 ***
grpinvited -0.61702 0.42874 -1.439 0.15011
log(edits + 1) 0.44489 0.17306 2.571 0.01015 *
log(main_edits + 1) 0.20667 0.06855 3.015 0.00257 **
log(talk_edits + 1) 0.16150 0.10187 1.585 0.11289
log(user_edits + 1) 0.17094 0.06630 2.578 0.00993 **
log(user_talk_edits + 1) 0.21420 0.08913 2.403 0.01625 *
log(wp_edits + 1) 0.19019 0.11490 1.655 0.09787 .
general_warningTRUE -0.78168 0.30218 -2.587 0.00969 **
csdTRUE -1.42153 1.01750 -1.397 0.16239
deletionTRUE -0.29195 0.52487 -0.556 0.57806
grpinvited:log(edits + 1) 0.26699 0.16513 1.617 0.10591
grpinvited:general_warningTRUE 0.22127 0.33150 0.667 0.50447
grpinvited:csdTRUE 1.26795 1.05635 1.200 0.23001
grpinvited:deletionTRUE -0.48490 0.60548 -0.801 0.42322
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 5101.0 on 14765 degrees of freedom
Residual deviance: 4814.7 on 14751 degrees of freedom
AIC: 4844.7
Number of Fisher Scoring iterations: 7
Again, we see a lack of significant independent effect for the invitation. Again, we also see a positive interaction with log(edits + 1), though here it falls just short of marginal significance (p = 0.106). It still hints that the invitation might be more effective for newcomers who save a lot of edits before getting the invitation.
On to the long-term outcomes:
- 1 to 2 months
Regression with multicollinearity problem
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -4.07573 0.32035 -12.723 < 2e-16 ***
grpinvited -0.40777 0.34891 -1.169 0.24253
log(edits + 1) 0.39092 0.14264 2.741 0.00613 **
log(main_edits + 1) 0.24877 0.05703 4.362 1.29e-05 ***
log(talk_edits + 1) 0.20831 0.08312 2.506 0.01221 *
log(user_edits + 1) 0.16667 0.05522 3.018 0.00254 **
log(user_talk_edits + 1) 0.18157 0.07557 2.403 0.01627 *
log(wp_edits + 1) 0.15650 0.09935 1.575 0.11521
general_warningTRUE -0.55499 0.22216 -2.498 0.01249 *
csdTRUE -12.75334 135.48352 -0.094 0.92500
deletionTRUE -1.16388 0.59382 -1.960 0.05000 *
grpinvited:log(edits + 1) 0.21278 0.13657 1.558 0.11923
grpinvited:general_warningTRUE -0.07856 0.24671 -0.318 0.75015
grpinvited:csdTRUE 12.36522 135.48374 0.091 0.92728
grpinvited:deletionTRUE 0.43928 0.63691 0.690 0.49038
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 7482.1 on 14765 degrees of freedom
Residual deviance: 7085.4 on 14751 degrees of freedom
AIC: 7115.4
Number of Fisher Scoring iterations: 14
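Yikes! Here, the csdTRUE and grpinvited:csdTRUE estimates blow up (about -12.8 and +12.4, with standard errors around 135, and 14 Fisher scoring iterations). It looks like we're seeing too much correlation between getting a 'csd' message and being invited. Going to need to drop the predictor.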
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -4.07257 0.31851 -12.786 < 2e-16 ***
grpinvited -0.41388 0.34713 -1.192 0.23315
log(edits + 1) 0.39039 0.14205 2.748 0.00599 **
log(main_edits + 1) 0.24097 0.05671 4.249 2.14e-05 ***
log(talk_edits + 1) 0.18318 0.08288 2.210 0.02710 *
log(user_edits + 1) 0.15665 0.05482 2.858 0.00427 **
log(user_talk_edits + 1) 0.17481 0.07540 2.318 0.02043 *
log(wp_edits + 1) 0.15835 0.09922 1.596 0.11051
general_warningTRUE -0.62899 0.22163 -2.838 0.00454 **
deletionTRUE -1.28957 0.59291 -2.175 0.02963 *
grpinvited:log(edits + 1) 0.22144 0.13593 1.629 0.10329
grpinvited:general_warningTRUE -0.02041 0.24601 -0.083 0.93387
grpinvited:deletionTRUE 0.52833 0.63556 0.831 0.40581
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 7482.1 on 14765 degrees of freedom
Residual deviance: 7100.9 on 14753 degrees of freedom
AIC: 7126.9
Number of Fisher Scoring iterations: 6
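As a rough check on what dropping 'csd' costs, the two 1 to 2 months fits can be compared with a likelihood-ratio test on their residual deviances. This is only a sketch from the rounded numbers printed above. It assumes the models are nested (they differ by csdTRUE and grpinvited:csdTRUE, i.e. 2 degrees of freedom) and uses the closed form exp(-x/2) for the upper tail of a chi-square with 2 df. The variable names are mine.

```python
import math

# Residual deviance and residual df copied from the two tables above.
DEV_WITH_CSD, DF_WITH_CSD = 7085.4, 14751
DEV_WITHOUT_CSD, DF_WITHOUT_CSD = 7100.9, 14753

lr_stat = DEV_WITHOUT_CSD - DEV_WITH_CSD   # increase in deviance
lr_df = DF_WITHOUT_CSD - DF_WITH_CSD       # number of dropped terms

# Upper tail of a chi-square with exactly 2 df is exp(-x / 2).
assert lr_df == 2
p_value = math.exp(-lr_stat / 2)

# Consistency check: for 0/1 outcomes, AIC = residual deviance + 2 * k,
# where k counts the fitted coefficients (15 with csd, 13 without).
aic_with = DEV_WITH_CSD + 2 * 15
aic_without = DEV_WITHOUT_CSD + 2 * 13

print(f"LR stat = {lr_stat:.1f} on {lr_df} df, p = {p_value:.5f}")
print(f"AIC with csd = {aic_with:.1f}, without = {aic_without:.1f}")
```

The AIC values match the R output (7115.4 and 7126.9). The deviance rises by 15.5 on 2 df, so the 'csd' terms do seem to carry signal even though their Wald z values are uninformative when the estimates blow up like this. Treat the p-value as approximate for the same reason.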
- 2 to 6 months
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -4.313717 0.293284 -14.708 < 2e-16 ***
grpinvited 0.363524 0.320604 1.134 0.256848
log(edits + 1) 0.672387 0.128262 5.242 1.59e-07 ***
log(main_edits + 1) 0.160990 0.051039 3.154 0.001609 **
log(talk_edits + 1) 0.249009 0.074702 3.333 0.000858 ***
log(user_edits + 1) -0.002232 0.051036 -0.044 0.965120
log(user_talk_edits + 1) 0.047366 0.073726 0.642 0.520577
log(wp_edits + 1) 0.166464 0.092988 1.790 0.073429 .
general_warningTRUE -0.731064 0.209830 -3.484 0.000494 ***
deletionTRUE -0.977154 0.467432 -2.090 0.036575 *
grpinvited:log(edits + 1) -0.089354 0.124897 -0.715 0.474347
grpinvited:general_warningTRUE -0.033336 0.232662 -0.143 0.886070
grpinvited:deletionTRUE 0.286314 0.510743 0.561 0.575081
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 8502.0 on 14765 degrees of freedom
Residual deviance: 8129.1 on 14753 degrees of freedom
AIC: 8155.1
Number of Fisher Scoring iterations: 6
Well, the direction and scale of the coefs don't change. We don't see independent significance in the effect of the invitation or its interaction with previous activity.
Again, just to check my sanity, let's try the 2 to 6 month regression with the bucket as the single predictor.
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -2.52590 0.06867 -36.78 <2e-16 ***
grpinvited 0.16681 0.07617 2.19 0.0285 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 8502 on 14765 degrees of freedom
Residual deviance: 8497 on 14764 degrees of freedom
AIC: 8501
Number of Fisher Scoring iterations: 5
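The z value and p-value in the table above can be reproduced by hand from the estimate and its standard error, and the same two numbers give an approximate 95% interval for the odds ratio. A minimal standard-library sketch, using a Wald (normal-approximation) interval and my own variable names:

```python
import math

# grpinvited row copied from the single-predictor 2 to 6 months model above.
ESTIMATE = 0.16681
STD_ERROR = 0.07617

# Wald z statistic and two-sided p-value under a standard normal.
z = ESTIMATE / STD_ERROR
p_two_sided = math.erfc(abs(z) / math.sqrt(2))

# Approximate 95% Wald interval, exponentiated to the odds-ratio scale.
Z_95 = 1.959964
ci_low = math.exp(ESTIMATE - Z_95 * STD_ERROR)
ci_high = math.exp(ESTIMATE + Z_95 * STD_ERROR)

print(f"z = {z:.2f}, p = {p_two_sided:.4f}")
print(f"odds ratio = {math.exp(ESTIMATE):.2f} "
      f"(95% CI {ci_low:.2f} to {ci_high:.2f})")
```

The interval runs from about 1.02 to 1.37, so it excludes an odds ratio of 1 but only barely, which fits a p-value just under 0.05.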
Sure enough, there's the significant effect I saw in the simple Chi^2 test. --Halfak (WMF) (talk) 18:54, 20 October 2015 (UTC)