Research talk:Reading time

From Meta, a Wikimedia project coordination wiki

Work log



Lab Notebook

Interpreting Exponentiated Weibull Models

Now I'm going to work on interpreting Exponentiated Weibull models and I'm going to tabulate the frequency of qualitatively different distributions by wiki.

The Exponentiated Weibull has 3 parameters. Writing its CDF as F(x) = [1 − exp(−(x/λ)^k)]^α, two are shape parameters (α and k) and one is a scale parameter (λ). The major qualitative distinctions in interpreting the model are in terms of the shape parameters.

According to this analysis of the Exponentiated Weibull:

  • If α = 1 and k = 1 then we have an exponential distribution with scale parameter λ.
  • If α = 1 we have a Weibull distribution.
  • In this case the failure rate is always increasing (positive ageing) if k > 1 and always decreasing (negative ageing) if k < 1.
  • If k = 1 then we have an exponentiated exponential distribution and the failure rate may not be monotonic.
  • In this case, and if then the failure rate increases when .
  • On the other hand if then the failure rate decreases when .
  • If both shape parameters are greater than 1 (α > 1 and k > 1) then we have positive ageing (the failure rate is increasing).
  • If both shape parameters are less than 1 (α < 1 and k < 1) then we have negative ageing (the failure rate is decreasing).
  • If the two shape parameters lie on opposite sides of 1 then interpreting the model may require closer inspection of hazard and/or survival curves.
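
The cases in this list can be read off the hazard (failure rate) function. As a sketch, assuming the parameterization F(x) = [1 − exp(−(x/λ)^k)]^α with exponent shape α, Weibull shape k and scale λ (these symbols are an assumption here and may differ from those in the cited analysis):

```latex
h(x) = \frac{f(x)}{1 - F(x)}
     = \frac{\alpha \, \frac{k}{\lambda} \left(\frac{x}{\lambda}\right)^{k-1}
             e^{-(x/\lambda)^{k}} \left[1 - e^{-(x/\lambda)^{k}}\right]^{\alpha - 1}}
            {1 - \left[1 - e^{-(x/\lambda)^{k}}\right]^{\alpha}}
```

Setting α = 1 gives h(x) = (k/λ)(x/λ)^(k−1), which is increasing for k > 1 and decreasing for k < 1. Setting k = 1 as well gives the constant hazard 1/λ of the exponential distribution.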

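Inconveniently, it looks like almost all of the time we have α ≥ 1 and k < 1 (a >= 1 and c < 1 in the code's names), so the shape parameters fall on opposite sides of 1.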

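import pandas as pd

# Keep only the exponentiated Weibull fits; `table` is built earlier in the notebook.
expweib_tab = table[table.model == 'exponweib'].copy().reset_index()
# params holds the fitted parameter tuple: positions 0, 1 and 3 are read as
# a, c and scale (position 2 is presumably loc, as in scipy's fit output).
expweib_tab['a'] = expweib_tab.params.apply(lambda r: r[0])
expweib_tab['c'] = expweib_tab.params.apply(lambda r: r[1])
expweib_tab['scale'] = expweib_tab.params.apply(lambda r: r[3])
# Flag which side of 1 each shape parameter falls on.
expweib_tab['a_ge_1'] = expweib_tab.a >= 1
expweib_tab['c_ge_1'] = expweib_tab.c >= 1

# Drop leftover index columns (keyword form; the positional axis argument is deprecated).
expweib_tab = expweib_tab.drop(columns=['level_0', 'index'])
pd.crosstab(expweib_tab['a_ge_1'], expweib_tab['c_ge_1'])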

Note that in the code a = α (the exponent shape parameter) and c = k (the Weibull shape parameter).

c_ge_1  False  True
a_ge_1
False       0      1
True      241      0

So, inconveniently, I don't know what we can say qualitatively about reading times just from looking at the parameter estimates.

Next I'm going to plot hazard functions for a handful of wikis to see if there is anything we can say in general about these parameters from the distributions.
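
As a starting point for that inspection, here is a minimal sketch of how the hazard could be evaluated and labelled on a grid. It assumes loc = 0 and uses the closed-form density and survival function in plain Python rather than the fitted scipy objects. The helper names (log1mexp, exponweib_hazard, hazard_shape) and the parameter values in the loop are hypothetical illustrations, not estimates from the reading-time data.

```python
import math

def log1mexp(z):
    """Numerically stable log(1 - exp(-z)) for z > 0."""
    if z > math.log(2.0):
        return math.log1p(-math.exp(-z))
    return math.log(-math.expm1(-z))

def exponweib_hazard(t, a, c, scale=1.0):
    """Hazard h(t) = pdf(t) / sf(t) for t > 0, assuming loc = 0.

    a is the exponent shape, c the Weibull shape (the code's names).
    """
    x = t / scale
    z = x ** c
    log_base = log1mexp(z)  # log(1 - exp(-x**c))
    pdf = (a * c / scale) * x ** (c - 1) * math.exp(-z + (a - 1) * log_base)
    sf = -math.expm1(a * log_base)  # 1 - (1 - exp(-x**c))**a
    return pdf / sf

def hazard_shape(a, c, scale=1.0, grid=None):
    """Label the hazard on a finite grid. A numerical check, not a proof."""
    if grid is None:
        # 41 log-spaced points from 0.001 * scale to 10 * scale
        grid = [scale * 10 ** (e / 10) for e in range(-30, 11)]
    h = [exponweib_hazard(t, a, c, scale) for t in grid]
    diffs = [later - earlier for earlier, later in zip(h, h[1:])]
    tol = 1e-9 * max(h)  # ignore floating-point noise
    if all(abs(d) <= tol for d in diffs):
        return "constant"
    if all(d >= -tol for d in diffs):
        return "increasing"
    if all(d <= tol for d in diffs):
        return "decreasing"
    return "non-monotonic"

# Hypothetical (a, c) pairs, chosen only to exercise each regime.
for a, c in [(1.0, 1.0), (1.0, 2.0), (1.0, 0.5), (5.0, 0.5)]:
    print(a, c, hazard_shape(a, c))
```

The label only describes the grid that was evaluated, so a turning point outside that range would be missed. The list of hazard values can be passed to any plotting library to draw the curves themselves.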