Research talk:Teahouse long term new editor retention/Work log/2015-10-20

Tuesday, October 20, 2015

Today I'm extending my analysis to include multivariate regressions. I'll be using a logistic regression to look for differences in the proportion of surviving newcomers. I'll be including the pre-invite statistics I worked on last time to control for random variation around the effect of invitation. I expect the effect of invitation to become more prominent after controlling for this variation. I also expect to see some interactions between the invitation condition and either initial investment (number of pre-invite edits) or negative feedback. A positive relationship would suggest that newcomers who are highly invested and get negative feedback gain more "survivalness" from the invite.

Checking for bucketing bias

The following Wilcoxon and Chi^2 tests check for significant differences in the pre-invite predictors between conditions. Scalars are summarized by their quantiles (0%, 25%, 50%, 75% and 100%), logicals by their proportion.

  • edits control=5-6-7-11-215 invited=5-6-7-12-241 W=18158354.5 p=0.597
  • main_edits control=0-4-6-9-212 invited=0-4-6-9-241 W=18327891 p=0.181
  • talk_edits control=0-0-0-0-35 invited=0-0-0-0-37 W=17914856.5 p=0.188
  • user_edits control=0-0-0-1-74 invited=0-0-0-1-232 W=17823485.5 p=0.163
  • user_talk_edits control=0-0-0-0-24 invited=0-0-0-0-90 W=17958249 p=0.444
  • wp_edits control=0-0-0-0-19 invited=0-0-0-0-79 W=18091557 p=0.581
  • other_edits control=0-0-0-0-96 invited=0-0-0-0-125 W=18104004.5 p=0.552
  • vandal_warning control=0.121 invited=0.123 X-squared=0.153 p=0.696
  • spam_warning control=0.027 invited=0.026 X-squared=0.081 p=0.776
  • copyright_warning control=0.003 invited=0.003 X-squared=0.304 p=0.582
  • general_warning control=0.221 invited=0.214 X-squared=0.57 p=0.45
  • block control=0.001 invited=0.002 X-squared=0.729 p=0.393
  • welcome control=0 invited=0 X-squared=0 p=1
  • csd control=0.035 invited=0.031 X-squared=1.234 p=0.267
  • deletion control=0.051 invited=0.047 X-squared=0.696 p=0.404
  • afc control=0 invited=0 X-squared=NaN p=NaN
  • teahouse control=0 invited=0 X-squared=NaN p=NaN

No significant differences here.
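
For reference, here is a minimal sketch of the kind of 2x2 Chi^2 test used for the logical predictors above. The counts are hypothetical (the raw counts aren't in this log), the function name `chi2_2x2` is mine, and I'm assuming a Yates continuity correction, which may or may not match the test that produced the numbers above.

```python
import math

def chi2_2x2(a, b, c, d, yates=True):
    """Pearson chi-squared test (1 df) for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        # An empty row or column (e.g. nobody got the message) has no defined test.
        return float("nan"), float("nan")
    diff = abs(a * d - b * c)
    if yates:
        # Continuity correction, floored at zero.
        diff = max(0.0, diff - n / 2.0)
    stat = n * diff ** 2 / denom
    # With 1 degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    p = math.erfc(math.sqrt(stat / 2.0))
    return stat, p

# Hypothetical counts, NOT the study data.
# Rows: control, invited. Columns: warned, not warned.
stat, p = chi2_2x2(121, 879, 123, 877)
print(f"X-squared={stat:.3f} p={p:.3f}")
```

An all-zero column returns NaN here, which is the same shape of result as the afc and teahouse rows above.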

Predicting 1+ edits

Now to build some logistic models that account for these pre-invite predictors.

3 to 4 weeks
Coefficients:
                               Estimate Std. Error z value Pr(>|z|)    
(Intercept)                    -3.40157    0.28915 -11.764  < 2e-16 ***
grpinvited                     -0.45260    0.31556  -1.434 0.151491    
log(edits + 1)                  0.36207    0.12951   2.796 0.005181 ** 
log(main_edits + 1)             0.09796    0.04883   2.006 0.044832 *  
log(talk_edits + 1)             0.19515    0.07423   2.629 0.008563 ** 
log(user_edits + 1)             0.02817    0.04907   0.574 0.565918    
log(user_talk_edits + 1)        0.22418    0.06657   3.368 0.000758 ***
log(wp_edits + 1)               0.13143    0.09053   1.452 0.146561    
general_warningTRUE            -0.52963    0.19096  -2.773 0.005547 ** 
csdTRUE                        -1.52286    0.72156  -2.111 0.034813 *  
deletionTRUE                   -0.41958    0.37541  -1.118 0.263717    
grpinvited:log(edits + 1)       0.23479    0.12669   1.853 0.063855 .  
grpinvited:general_warningTRUE  0.04394    0.21149   0.208 0.835434    
grpinvited:csdTRUE              1.09061    0.75507   1.444 0.148633    
grpinvited:deletionTRUE        -0.21884    0.42406  -0.516 0.605816    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 8869.9  on 14765  degrees of freedom
Residual deviance: 8564.5  on 14751  degrees of freedom
AIC: 8594.5

Number of Fisher Scoring iterations: 6

First, the obvious effects. We see the usual suspects here. The more edits you do -- overall, but especially talking -- the more likely you are to be retained. We also see some substantially negative effects of warning messages and CSD notifications.

The main effect of invitation is negative here (though not significant), but the grpinvited:log(edits + 1) interaction counteracts it for editors who had saved about 6 edits or more (log(x+1)=2, x=6) when the invite was posted. For any editor who saved more than 6 edits (highly motivated), it looks like the invite might be improving retention in proportion to how much editing they are doing. But the interaction is only marginally significant (p = 0.064).
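
The break-even point can be worked out directly from the two coefficients in the table above. This is a small sketch that assumes an editor with no warning, CSD or deletion messages, so only the grpinvited main effect and the grpinvited:log(edits + 1) interaction apply. The variable and function names are mine.

```python
import math

# Coefficients copied from the 3 to 4 week, 1+ edit model above.
b_invited = -0.45260      # grpinvited
b_interaction = 0.23479   # grpinvited:log(edits + 1)

def invite_log_odds_shift(edits):
    """Change in log-odds of survival attributed to the invite at a given edit count."""
    return b_invited + b_interaction * math.log(edits + 1)

# Solve b_invited + b_interaction * log(x + 1) = 0 for x.
break_even_edits = math.exp(-b_invited / b_interaction) - 1

print(f"break-even at about {break_even_edits:.2f} pre-invite edits")
for edits in (5, 7, 12):
    print(edits, round(invite_log_odds_shift(edits), 3))
```

The shift is slightly negative at 5 edits and slightly positive at 7, which is consistent with the threshold of roughly 6 edits. These are point estimates only; the standard errors on both coefficients are large.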

Counter to my suspicions, I don't think we're seeing solid evidence of an interaction between being invited to the teahouse and surviving despite negative feedback (csd & warning). It could be that this is due to too few observations.

Just for the sake of making sure that my previous analysis wasn't totally off, let's try the model with just the invite as a predictor.

3 to 4 weeks (single predictor)
Coefficients:
            Estimate Std. Error z value Pr(>|z|)    
(Intercept) -2.44393    0.06633 -36.843   <2e-16 ***
grpinvited   0.14830    0.07369   2.012   0.0442 *  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 8869.9  on 14765  degrees of freedom
Residual deviance: 8865.7  on 14764  degrees of freedom
AIC: 8869.7

Number of Fisher Scoring iterations: 5

Sure enough, getting the invite looks significant on its own. OK! Now to try the long-term retention outcomes.
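
To put the single-predictor coefficients on a more readable scale, the log-odds can be converted to survival proportions with the inverse logit. This is a minimal sketch using only the two estimates from the table above; the helper name `inv_logit` is mine.

```python
import math

def inv_logit(x):
    """Convert log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

# Estimates copied from the 3 to 4 week single-predictor model above.
intercept = -2.44393   # log-odds of survival in the control group
b_invited = 0.14830    # added log-odds for the invited group

p_control = inv_logit(intercept)
p_invited = inv_logit(intercept + b_invited)
odds_ratio = math.exp(b_invited)

print(f"control={p_control:.3f} invited={p_invited:.3f} odds ratio={odds_ratio:.2f}")
```

That works out to roughly 8.0% survival in the control group versus roughly 9.1% in the invited group, an odds ratio of about 1.16.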

1 to 2 months
Coefficients:
                               Estimate Std. Error z value Pr(>|z|)    
(Intercept)                    -3.38622    0.24595 -13.768  < 2e-16 ***
grpinvited                      0.06769    0.27003   0.251 0.802051    
log(edits + 1)                  0.50057    0.11032   4.537  5.7e-06 ***
log(main_edits + 1)             0.14552    0.04364   3.335 0.000854 ***
log(talk_edits + 1)             0.17592    0.06710   2.622 0.008744 ** 
log(user_edits + 1)             0.05333    0.04358   1.224 0.220999    
log(user_talk_edits + 1)        0.11024    0.06179   1.784 0.074370 .  
log(wp_edits + 1)               0.07868    0.08374   0.940 0.347399    
general_warningTRUE            -0.50829    0.15991  -3.179 0.001480 ** 
csdTRUE                        -0.95765    0.46844  -2.044 0.040918 *  
deletionTRUE                   -0.76776    0.35457  -2.165 0.030360 *  
grpinvited:log(edits + 1)       0.01426    0.10815   0.132 0.895065    
grpinvited:general_warningTRUE -0.10018    0.17865  -0.561 0.574961    
grpinvited:csdTRUE              0.57811    0.50508   1.145 0.252384    
grpinvited:deletionTRUE         0.23804    0.39010   0.610 0.541725    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 11229  on 14765  degrees of freedom
Residual deviance: 10845  on 14751  degrees of freedom
AIC: 10875

Number of Fisher Scoring iterations: 5

2 to 6 months
Coefficients:
                               Estimate Std. Error z value Pr(>|z|)    
(Intercept)                    -3.19216    0.23718 -13.459  < 2e-16 ***
grpinvited                      0.08069    0.26070   0.310  0.75694    
log(edits + 1)                  0.43819    0.10755   4.074 4.61e-05 ***
log(main_edits + 1)             0.18349    0.04316   4.252 2.12e-05 ***
log(talk_edits + 1)             0.21654    0.06491   3.336  0.00085 ***
log(user_edits + 1)             0.03663    0.04303   0.851  0.39460    
log(user_talk_edits + 1)        0.05446    0.06177   0.882  0.37797    
log(wp_edits + 1)               0.11562    0.08142   1.420  0.15563    
general_warningTRUE            -0.53228    0.15243  -3.492  0.00048 ***
csdTRUE                        -0.55369    0.37929  -1.460  0.14435    
deletionTRUE                   -0.69683    0.32396  -2.151  0.03148 *  
grpinvited:log(edits + 1)       0.01059    0.10513   0.101  0.91977    
grpinvited:general_warningTRUE -0.10878    0.17073  -0.637  0.52403    
grpinvited:csdTRUE              0.04731    0.42470   0.111  0.91130    
grpinvited:deletionTRUE         0.14502    0.36032   0.402  0.68732    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 11889  on 14765  degrees of freedom
Residual deviance: 11496  on 14751  degrees of freedom
AIC: 11526

Number of Fisher Scoring iterations: 5

Similar story here, but the effect of the invite isn't even marginally significant. On to the 5+ measures.

Predicting 5+ edits

Same story as above, except survival only counts when there are 5+ edits in the survival period.

3 to 4 weeks
Coefficients:
                               Estimate Std. Error z value Pr(>|z|)    
(Intercept)                    -4.65842    0.39320 -11.847  < 2e-16 ***
grpinvited                     -0.61702    0.42874  -1.439  0.15011    
log(edits + 1)                  0.44489    0.17306   2.571  0.01015 *  
log(main_edits + 1)             0.20667    0.06855   3.015  0.00257 ** 
log(talk_edits + 1)             0.16150    0.10187   1.585  0.11289    
log(user_edits + 1)             0.17094    0.06630   2.578  0.00993 ** 
log(user_talk_edits + 1)        0.21420    0.08913   2.403  0.01625 *  
log(wp_edits + 1)               0.19019    0.11490   1.655  0.09787 .  
general_warningTRUE            -0.78168    0.30218  -2.587  0.00969 ** 
csdTRUE                        -1.42153    1.01750  -1.397  0.16239    
deletionTRUE                   -0.29195    0.52487  -0.556  0.57806    
grpinvited:log(edits + 1)       0.26699    0.16513   1.617  0.10591    
grpinvited:general_warningTRUE  0.22127    0.33150   0.667  0.50447    
grpinvited:csdTRUE              1.26795    1.05635   1.200  0.23001    
grpinvited:deletionTRUE        -0.48490    0.60548  -0.801  0.42322    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 5101.0  on 14765  degrees of freedom
Residual deviance: 4814.7  on 14751  degrees of freedom
AIC: 4844.7

Number of Fisher Scoring iterations: 7

Again, we see no significant independent effect of the invitation. Again, we also see a positive interaction with log(edits + 1), though at p = 0.106 it falls just short of marginal significance. It suggests that the invitation might be more effective for newcomers who save a lot of edits before getting the invitation.
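
The z value and p value for the interaction can be re-derived from the estimate and standard error in the table above. This is a minimal sketch of a two-sided Wald test using the normal approximation; the function name `wald_test` is mine.

```python
import math

def wald_test(estimate, std_error):
    """Two-sided Wald z test for a single regression coefficient."""
    z = estimate / std_error
    # Two-sided normal tail probability: P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p

# grpinvited:log(edits + 1) in the 3 to 4 week, 5+ edit model above.
z, p = wald_test(0.26699, 0.16513)
print(f"z={z:.3f} p={p:.4f}")
```

This reproduces the z value of about 1.617 and the p value of about 0.106 reported in the table.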

On to the long-term outcomes:

1 to 2 months
Regression with multicollinearity problem
Coefficients:
                                Estimate Std. Error z value Pr(>|z|)    
(Intercept)                     -4.07573    0.32035 -12.723  < 2e-16 ***
grpinvited                      -0.40777    0.34891  -1.169  0.24253    
log(edits + 1)                   0.39092    0.14264   2.741  0.00613 ** 
log(main_edits + 1)              0.24877    0.05703   4.362 1.29e-05 ***
log(talk_edits + 1)              0.20831    0.08312   2.506  0.01221 *  
log(user_edits + 1)              0.16667    0.05522   3.018  0.00254 ** 
log(user_talk_edits + 1)         0.18157    0.07557   2.403  0.01627 *  
log(wp_edits + 1)                0.15650    0.09935   1.575  0.11521    
general_warningTRUE             -0.55499    0.22216  -2.498  0.01249 *  
csdTRUE                        -12.75334  135.48352  -0.094  0.92500    
deletionTRUE                    -1.16388    0.59382  -1.960  0.05000 *  
grpinvited:log(edits + 1)        0.21278    0.13657   1.558  0.11923    
grpinvited:general_warningTRUE  -0.07856    0.24671  -0.318  0.75015    
grpinvited:csdTRUE              12.36522  135.48374   0.091  0.92728    
grpinvited:deletionTRUE          0.43928    0.63691   0.690  0.49038    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 7482.1  on 14765  degrees of freedom
Residual deviance: 7085.4  on 14751  degrees of freedom
AIC: 7115.4

Number of Fisher Scoring iterations: 14

Yikes! Here, we're seeing too much correlation between getting a 'csd' message and being invited. I'm going to need to drop the predictor.
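
A csd estimate near -12.75 with a standard error near 135, along with 14 Fisher scoring iterations, is also the pattern a logistic fit tends to produce when some cell of the data contains only one outcome (for example, if no control newcomers with a CSD notice made 5+ edits in the period). The output above doesn't show whether that happened here. Below is a sketch of how one could check, run on hypothetical records rather than the study data; the function names and record layout are my own assumptions.

```python
from collections import Counter

def outcome_counts(rows):
    """Count records per (group, csd, survived) combination."""
    counts = Counter()
    for group, csd, survived in rows:
        counts[(group, csd, survived)] += 1
    return counts

def degenerate_cells(rows):
    """Return the (group, csd) cells in which only one outcome value is observed."""
    counts = outcome_counts(rows)
    cells = sorted({(g, c) for g, c, _ in counts})
    return [cell for cell in cells
            if counts[cell + (True,)] == 0 or counts[cell + (False,)] == 0]

# Hypothetical records: (group, got_csd, survived). NOT the study data.
rows = (
    [("control", True, False)] * 30
    + [("control", False, True)] * 20 + [("control", False, False)] * 200
    + [("invited", True, True)] * 2 + [("invited", True, False)] * 28
    + [("invited", False, True)] * 25 + [("invited", False, False)] * 195
)

flagged = degenerate_cells(rows)
print(flagged)
```

If a cell is flagged, dropping the predictor (as done below) is one way out. Merging it with a related predictor would be another option.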

1 to 2 months (csd dropped)
Coefficients:
                               Estimate Std. Error z value Pr(>|z|)    
(Intercept)                    -4.07257    0.31851 -12.786  < 2e-16 ***
grpinvited                     -0.41388    0.34713  -1.192  0.23315    
log(edits + 1)                  0.39039    0.14205   2.748  0.00599 ** 
log(main_edits + 1)             0.24097    0.05671   4.249 2.14e-05 ***
log(talk_edits + 1)             0.18318    0.08288   2.210  0.02710 *  
log(user_edits + 1)             0.15665    0.05482   2.858  0.00427 ** 
log(user_talk_edits + 1)        0.17481    0.07540   2.318  0.02043 *  
log(wp_edits + 1)               0.15835    0.09922   1.596  0.11051    
general_warningTRUE            -0.62899    0.22163  -2.838  0.00454 ** 
deletionTRUE                   -1.28957    0.59291  -2.175  0.02963 *  
grpinvited:log(edits + 1)       0.22144    0.13593   1.629  0.10329    
grpinvited:general_warningTRUE -0.02041    0.24601  -0.083  0.93387    
grpinvited:deletionTRUE         0.52833    0.63556   0.831  0.40581    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 7482.1  on 14765  degrees of freedom
Residual deviance: 7100.9  on 14753  degrees of freedom
AIC: 7126.9

Number of Fisher Scoring iterations: 6

2 to 6 months
Coefficients:
                                Estimate Std. Error z value Pr(>|z|)    
(Intercept)                    -4.313717   0.293284 -14.708  < 2e-16 ***
grpinvited                      0.363524   0.320604   1.134 0.256848    
log(edits + 1)                  0.672387   0.128262   5.242 1.59e-07 ***
log(main_edits + 1)             0.160990   0.051039   3.154 0.001609 ** 
log(talk_edits + 1)             0.249009   0.074702   3.333 0.000858 ***
log(user_edits + 1)            -0.002232   0.051036  -0.044 0.965120    
log(user_talk_edits + 1)        0.047366   0.073726   0.642 0.520577    
log(wp_edits + 1)               0.166464   0.092988   1.790 0.073429 .  
general_warningTRUE            -0.731064   0.209830  -3.484 0.000494 ***
deletionTRUE                   -0.977154   0.467432  -2.090 0.036575 *  
grpinvited:log(edits + 1)      -0.089354   0.124897  -0.715 0.474347    
grpinvited:general_warningTRUE -0.033336   0.232662  -0.143 0.886070    
grpinvited:deletionTRUE         0.286314   0.510743   0.561 0.575081    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 8502.0  on 14765  degrees of freedom
Residual deviance: 8129.1  on 14753  degrees of freedom
AIC: 8155.1

Number of Fisher Scoring iterations: 6

Well, the direction and scale of the coefs don't change. We don't see independent significance in the effect of the invitation or its interaction with previous activity.

Again, just to check my sanity, let's try the 2 to 6 month regression with the bucket as the single predictor.

2 to 6 months (single predictor)
Coefficients:
            Estimate Std. Error z value Pr(>|z|)    
(Intercept) -2.52590    0.06867  -36.78   <2e-16 ***
grpinvited   0.16681    0.07617    2.19   0.0285 *  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 8502  on 14765  degrees of freedom
Residual deviance: 8497  on 14764  degrees of freedom
AIC: 8501

Number of Fisher Scoring iterations: 5

Sure enough, there's the significant effect I saw in the simple Chi^2 test. --Halfak (WMF) (talk) 18:54, 20 October 2015 (UTC)
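
As a rough cross-check of the two 2 to 6 month, 5+ edit models, the reported AIC values can be re-derived from the residual deviances. For a logistic regression on individual 0/1 outcomes the deviance equals -2 times the log-likelihood, so AIC is the deviance plus twice the number of coefficients. This is a minimal sketch; the function name is mine, and the coefficient counts (13 and 2) are read off the tables above.

```python
def binary_glm_aic(residual_deviance, n_coefficients):
    """AIC for a logistic regression fit to ungrouped 0/1 outcomes."""
    return residual_deviance + 2 * n_coefficients

# 2 to 6 months, 5+ edits: full model (csd dropped) vs. bucket-only model.
aic_full = binary_glm_aic(8129.1, 13)
aic_bucket_only = binary_glm_aic(8497, 2)

print(round(aic_full, 1), aic_bucket_only, round(aic_bucket_only - aic_full, 1))
```

Both values match the AIC lines in the output (8155.1 and 8501). The full model's much lower AIC suggests that the pre-invite predictors explain far more of the variation in survival than the bucket does on its own.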