
Research talk:Reading time/Work log/2018-11-21

From Meta, a Wikimedia project coordination wiki

Wednesday, November 21, 2018


Here we have results from regressions that use a "desktop" variable instead of a "mobile" variable (this was requested by User:Tbayer_(WMF)). The results are, of course, equivalent to those from the models with "mobile". We now observe that, accounting for the other factors in the model, dwell times are actually longer on mobile (at least in the global north) for page views that are not the last page view in a session. On the other hand, when we consider the last page view in a session, dwell times are much longer on desktop than on mobile; we also see a substantive gap between the global north and the global south on desktop (less so on mobile).

These observations are consistent with findings from the wikimotifs paper, which suggested that readers in global south countries are most likely to engage in more intensive information seeking tasks. Our results suggest that (1) compared to other views in a session, the last view in a session is the most likely to involve in-depth information consumption through longer reading, and that (2) readers in the global south spend more time reading than readers in the global north. This difference can mostly be accounted for by the last view in a session, when we expect the most information to be consumed, and by the non-mobile site, where we expect (mainly based on prior theory) that people will choose to perform their deeper information seeking tasks if they have the choice. This pattern is supported not only by the regression models, but also by a non-parametric analysis comparing median dwell times by device, global north versus global south, and last-in-session.

Statistical models
model 2 model 3
Intercept 8.1783 (0.0084)*** 8.2388 (0.0084)***
mobile 0.0962 (0.0015)*** 0.0006 (0.0023)
Human Development Index -0.1007 (0.0009)*** -0.1613 (0.0014)***
mobile : HDI (not in model 2) 0.1059 (0.0019)***
Revision length (bytes) 0.1752 (0.0004)*** 0.1752 (0.0004)***
time to first paint -0.0164 (0.0006)*** -0.0163 (0.0006)***
time to dom interactive 0.0023 (0.0009)** 0.0023 (0.0009)**
sessionlength -0.0001 (0.0000)*** -0.0001 (0.0000)***
lastinsessionTRUE 0.9281 (0.0015)*** 0.9232 (0.0015)***
nthinsession 0.0002 (0.0000)*** 0.0002 (0.0000)***
dayofweekMon 0.0940 (0.0020)*** 0.0940 (0.0020)***
dayofweekSat 0.0189 (0.0020)*** 0.0171 (0.0020)***
dayofweekSun 0.0336 (0.0020)*** 0.0324 (0.0020)***
dayofweekThu 0.0563 (0.0019)*** 0.0563 (0.0019)***
dayofweekTue 0.0350 (0.0020)*** 0.0353 (0.0020)***
dayofweekWed 0.0760 (0.0019)*** 0.0758 (0.0019)***
usermonth4 0.0095 (0.0096) 0.0096 (0.0096)
usermonth5 0.0113 (0.0095) 0.0111 (0.0095)
usermonth6 -0.0097 (0.0097) -0.0100 (0.0097)
usermonth7 -0.0487 (0.0097)*** -0.0494 (0.0097)***
usermonth8 -0.0112 (0.0097) -0.0118 (0.0097)
usermonth9 0.0377 (0.0076)*** 0.0383 (0.0076)***
usermonth10 0.0002 (0.0075) 0.0000 (0.0075)
mobileTRUE:lastinsessionTRUE -0.6508 (0.0021)*** -0.6442 (0.0021)***
R2 0.0717 0.0719
Adj. R2 0.0717 0.0719
Num. obs. 9873641 9873641
RMSE 14.2360 14.2338
***p < 0.001, **p < 0.01, *p < 0.05
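
As a rough aid to reading the table, the sketch below converts a few model 2 coefficients into multiplicative effects on dwell time. It assumes the regression outcome is the natural log of dwell time, which the `logvislen` column name in the raw-data table suggests but the regression output does not state. The helper name `ratio` is made up for illustration.

```python
import math

# Coefficients copied from the "model 2" column of the table above.
MOBILE = 0.0962
LAST_IN_SESSION = 0.9281
MOBILE_X_LAST = -0.6508


def ratio(log_difference):
    """Turn a difference on the log scale into a multiplicative ratio.

    Assumption: the regression outcome is the natural log of dwell time.
    """
    return math.exp(log_difference)


# Mobile relative to desktop, holding the other covariates fixed.
mobile_vs_desktop_not_last = ratio(MOBILE)
mobile_vs_desktop_last = ratio(MOBILE + MOBILE_X_LAST)

# Last view in a session relative to earlier views in the session.
last_vs_earlier_desktop = ratio(LAST_IN_SESSION)
last_vs_earlier_mobile = ratio(LAST_IN_SESSION + MOBILE_X_LAST)

print(f"mobile / desktop, not last view: {mobile_vs_desktop_not_last:.2f}")
print(f"mobile / desktop, last view:     {mobile_vs_desktop_last:.2f}")
print(f"last / earlier view, desktop:    {last_vs_earlier_desktop:.2f}")
print(f"last / earlier view, mobile:     {last_vs_earlier_mobile:.2f}")
```

Under that assumption, mobile views that are not last in a session run about 10% longer than comparable desktop views, while last views on mobile are a little over half as long as last views on desktop.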


Marginal effects plot for model 3 by globalsouth and lastinsession. This chart shows how reading times predicted by the regression model change with the development level of the country. Readers from the global south read for longer than those in the global north, especially on the last view in a session. The difference between mobile and desktop is mainly a difference in last-in-session behavior.

The qualitative conclusions of the regression table do not depend on the parametric assumptions. The table below shows that the geometric means (exp(avg(logvislen))) and medians of the raw data are quite close to the model's predictions for all cells of the table.

LastInSession economic_region Desktop exp(avg(logvislen)) med_visiblelength
0 False Global North False 20162.721149 20109
1 False Global North True 17239.418437 16185
2 False Global South False 21574.202414 21554
3 False Global South True 22958.013098 21804
4 True Global North False 26669.747291 28178
5 True Global North True 46793.402929 39840
6 True Global South False 27948.535150 28684
7 True Global South True 50520.718672 43630
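
The comparison in this table can be made explicit with a few ratios. The sketch below copies the exp(avg(logvislen)) column and computes the desktop-to-mobile ratio in each cell; the dictionary layout and the function name `desktop_to_mobile_ratio` are illustrative choices, not part of the original analysis.

```python
# Geometric-mean dwell times copied from the exp(avg(logvislen)) column above,
# keyed by (last_in_session, region, desktop).
GEOMETRIC_MEAN = {
    (False, "Global North", False): 20162.721149,
    (False, "Global North", True): 17239.418437,
    (False, "Global South", False): 21574.202414,
    (False, "Global South", True): 22958.013098,
    (True, "Global North", False): 26669.747291,
    (True, "Global North", True): 46793.402929,
    (True, "Global South", False): 27948.535150,
    (True, "Global South", True): 50520.718672,
}


def desktop_to_mobile_ratio(last_in_session, region):
    """Ratio of the desktop geometric mean to the mobile one for a cell."""
    desktop = GEOMETRIC_MEAN[(last_in_session, region, True)]
    mobile = GEOMETRIC_MEAN[(last_in_session, region, False)]
    return desktop / mobile


ratios = {
    (last, region): desktop_to_mobile_ratio(last, region)
    for last in (False, True)
    for region in ("Global North", "Global South")
}

for (last, region), value in ratios.items():
    label = "last view" if last else "earlier view"
    print(f"{region:12s} {label:12s} desktop / mobile = {value:.3f}")
```

The last-view ratios (about 1.75 and 1.81) sit close to the value of roughly 1.74 implied by the model 2 coefficients, 1 / exp(0.0962 - 0.6508), if the model outcome is the natural log of dwell time.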

Interpretation for page length

Marginal effects of page length. This plot shows how the model predicts reading time will change as pages get longer. The difference between very long and very short pages is associated with up to a 40 second increase in expected reading time for the last view in a session and about 15 seconds for views that are not the last in a session. For typical pages, a doubling of the length of the page is associated with an increase of about 4 seconds for non-last-in-session views and about 7 seconds for last-in-session views.
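
The doubling arithmetic can be sketched directly from the revision-length coefficient. This is a back-of-the-envelope check, not the procedure used to draw the plot. It assumes that both dwell time and revision length enter the model in logs (so the coefficient acts as an elasticity) and that the raw dwell times are in milliseconds. Neither is stated in the log, and the baselines are simply the Global North desktop geometric means from the raw-data table.

```python
# "Revision length (bytes)" coefficient, identical in models 2 and 3.
REVISION_LENGTH_COEF = 0.1752


def doubling_factor(elasticity):
    """Multiplicative change in dwell time when page length doubles.

    Assumption: log dwell time is regressed on log revision length.
    """
    return 2.0 ** elasticity


factor = doubling_factor(REVISION_LENGTH_COEF)

# Illustrative baselines: Global North desktop geometric means from the
# raw-data table, assumed to be in milliseconds.
BASELINE_MS = {
    "not last in session": 17239.418437,
    "last in session": 46793.402929,
}

extra_seconds = {
    label: baseline * (factor - 1.0) / 1000.0
    for label, baseline in BASELINE_MS.items()
}

print(f"doubling factor: {factor:.3f}")
for label, seconds in extra_seconds.items():
    print(f"{label}: about {seconds:.1f} extra seconds per doubling")
```

With these assumptions a doubling of page length raises predicted dwell time by roughly 13%, which is a few seconds at these baselines. That is the same order of magnitude as the figures read off the plot, though the exact values depend on the baseline chosen.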