Research talk:Notifications

From Meta, a Wikimedia project coordination wiki

RQ: What is the conversion rate for different types of notification?

What's meant by "conversion rate"? --EpochFail (talk) 21:29, 22 May 2013 (UTC)

Instrumenting Echo via EventLogging will allow us to measure two types of funnel metrics: (1) the proportion of notifications seen among all notifications delivered to the user (with the exception of email notifications, where impressions cannot be logged due to privacy restrictions); (2) the proportion of notifications clicked through among all notifications seen by the user. Delivery is captured by server-side instrumentation (Schema:Echo, Schema:EchoMail); seen and clicked-through events are captured client-side (Schema:EchoInteraction). The reason for instrumenting clicks and impressions individually is that we can't measure conversions in aggregate, due not just to different user behavior but also to the different volume of notifications received by users. --DarTar (talk) 20:51, 28 May 2013 (UTC)
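The per-user funnel described above can be sketched as follows. This is an illustrative computation only: the event tuples and labels ("delivered", "seen", "clicked") are assumptions standing in for the actual EventLogging schema fields, not the real Schema:Echo/Schema:EchoInteraction record format.

```python
from collections import defaultdict

def funnel_rates(events):
    """Compute per-user funnel rates from (user_id, event) pairs.

    Rates are computed per user rather than in aggregate, so users who
    receive many notifications do not dominate the overall ratio.
    Returns {user_id: (seen/delivered, clicked/seen)}, with None where
    the denominator is zero.
    """
    counts = defaultdict(lambda: {"delivered": 0, "seen": 0, "clicked": 0})
    for user_id, event in events:
        counts[user_id][event] += 1
    rates = {}
    for user_id, c in counts.items():
        seen_rate = c["seen"] / c["delivered"] if c["delivered"] else None
        click_through = c["clicked"] / c["seen"] if c["seen"] else None
        rates[user_id] = (seen_rate, click_through)
    return rates
```

An aggregate could then be taken over the per-user rates (e.g. a median), which is the point of instrumenting clicks and impressions individually.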

RQ: How does the use of private notifications compare to public ones (e.g. talk pages)?

What do you mean by "private notifications"? --EpochFail (talk) 21:36, 22 May 2013 (UTC)

All Echo notifications are private by design, to match user expectations and best practices across top web sites. Fabrice Florin (WMF) (talk) 01:53, 24 May 2013 (UTC)
I understand that the fact of a notification appearing is private, but the event for which the notification was generated is often not private (e.g. talk page post, page review action and revert). Is this the distinction? --EpochFail (talk) 15:01, 24 May 2013 (UTC)
That's correct: there is a subset of notifications for which not only is the notification itself private, but the notification-triggering event is also not publicly accessible (like notifications in the system category). --DarTar (talk) 20:54, 28 May 2013 (UTC)

RQ: How do notifications change user behavior?

I think this is our key research question and the one that I ought to spend the majority of my effort exploring. In order to do this, it would be nice to get a sense for where our open design questions are.

For example, it seems that a decision was made to not turn on notifications about reverts for new users by default. Some of my preliminary data suggests that a substantial proportion of new users opted to turn that feature on. It would be nice to know whether the feature generally encourages bad behavior or not.

I'd also like to see what effect taking away some types of notifications has. Here, I think page review is a good candidate. It seems likely that notifications about page review -- especially bringing a newcomer page creator back to participate in their own deletion discussion -- would improve the experience of being new to Wikipedia.

To test these effects, I'd like to run an experiment (or two) where we change the default preference settings for a random sample of newcomers and measure their behavior. Is this possible? --EpochFail (talk) 22:45, 22 May 2013 (UTC)
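The random-sample experiment proposed above could assign newcomers to arms along these lines. This is only a sketch of the general technique (deterministic hash-based bucketing); the salt, arm names, and treatment share are hypothetical and not part of any existing WMF tooling.

```python
import hashlib

def bucket(user_id, salt="echo-newcomer-test", treatment_share=0.5):
    """Deterministically assign a user to 'treatment' or 'control'.

    Hashing the user id with a fixed salt makes the assignment stable
    and reproducible across sessions, without storing extra state.
    """
    digest = hashlib.sha256(f"{salt}:{user_id}".encode()).hexdigest()
    # Map the first 8 hex digits to a uniform fraction in [0, 1].
    fraction = int(digest[:8], 16) / 0xFFFFFFFF
    return "treatment" if fraction < treatment_share else "control"
```

The treatment arm would then get the changed default preference settings (e.g. revert notifications off), and the two arms' subsequent behavior would be compared.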

Hi EpochFail, thanks for starting this discussion. I look forward to working with you on this research project. From a product standpoint, we would like to start with these first research questions, as outlined in our metrics plan. We're making good progress towards getting the first five questions answered, through dashboards like this one -- and these upcoming dashboards. Any help you can provide in completing these dashboards would be appreciated, so we can track these metrics on an ongoing basis.
Once we have answers for these first five questions, our next big priority for Echo research is question 6: "Do notifications help people become more productive?" More specifically, do new users who receive notifications tend to edit pages more often than users who do not? Are their edits successful? We hope to answer this question about user productivity through this cohort analysis, which we would like to run in the first week of June, before the Visual Editor research starts impacting our ability to bucket users into two cohorts.
A related question that is important to us from a product standpoint is almost the same as question 6, but focused on new users who receive 'thanks notifications'. Specifically, does getting a thanks notification improve the productivity of the users who receive it? Do they tend to edit more as a result, and do their edits survive? Because of our time limitations, our hope is that we can answer that question by using the same data collected to answer question 6 -- or by re-using the same instrumentation as part of a second cohort study right after the first one.
The other research questions you propose above also seem interesting to explore once we've answered the first questions outlined above. But we have limited resources on this project, so we would like to stick to our current priorities for now, if that's OK with you. I hope this is helpful and I look forward to our next steps together. Fabrice Florin (WMF) (talk) 01:53, 24 May 2013 (UTC)
The current plan is to run two types of controlled tests on new users. The first experiment tests the overall effect of notifications, i.e. what is the overall impact on newcomers of entirely disabling notifications (both email and on-site notifications). The second type of test focuses on individual notification types that we believe may positively or negatively affect the productivity and retention of new users. The obvious candidates are pagetriage-add-deletion-tag and reverted among notifications with an expected negative impact, and edit-thank among notifications that we expect to sustain participation. We will not focus at this stage on testing notifications for which we cannot make direct product decisions, even though we know that they are likely to affect participation (like mentions).
As for what to prioritize, we have a maximum of two 1-week cycles to run tests in June, and we need to decide what we want to test vs. what we can defer to a later stage or deprioritize completely. We should also consider that we're unlikely to capture significant effects in a week for notification types that are likely to produce small samples or for which we expect to see small effects. For example, we can make a good estimate of how many revert and page-deletion notifications will be delivered to newcomers, but it's hard to predict the volume of newcomer recipients of thanks notifications, which have never been tested on the English Wikipedia. I think our best bet is to focus now on the effects of "negative" notifications and on disabling notifications entirely, and to defer testing the "positive" ones to a later stage. --DarTar (talk) 21:14, 28 May 2013 (UTC)
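The small-sample concern above can be made concrete with a standard two-proportion sample-size calculation. This is only an illustrative sketch of the statistical reasoning; the baseline retention figures and effect sizes below are made-up placeholders, not measured Echo values.

```python
import math

def sample_size_two_proportions(p1, p2, z_alpha=1.96, z_beta=0.84):
    """Per-arm sample size for a two-proportion z-test.

    Uses the normal approximation with default critical values for a
    two-sided test at alpha = 0.05 and 80% power (z_alpha = 1.96,
    z_beta = 0.84).
    """
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)

# A small expected effect (2 percentage points on a hypothetical 20%
# baseline) requires a far larger cohort per arm than a large effect
# (10 points), which is why a one-week cycle may be too short for
# low-volume notification types like thanks.
n_small_effect = sample_size_two_proportions(0.20, 0.22)
n_large_effect = sample_size_two_proportions(0.20, 0.30)
```

If the expected weekly volume of newcomer recipients falls well below the required per-arm size, the test is better deferred or deprioritized, as suggested above.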