This Growth team project will research the effectiveness of task suggestions directed at Wikipedians.
- 1 User interface components
- 2 Research questions
- 2.1 RQ 0: How will personalized recommendations impact editor productivity and retention?
- 2.2 RQ 1: How does personalization modulate the effect of recommendations on user behavior?
- 2.3 RQ 2: How does the delivery mechanism modulate the effect of recommendations on user behavior?
- 2.4 RQ 3: How does the number of recommendations presented modulate the effect on user behavior?
- 3 Studies
- 4 Results
- 5 References
User interface components
RQ 0: How will personalized recommendations impact editor productivity and retention?
In previous experiments, we have seen small but significant increases in the rate at which new editors complete at least one edit to an article when we provide them with limited recommendations. We presume that the availability of these recommendations is what primarily drives the increased activity and that the means of delivering them is of minor importance. Therefore, whether personalized or not, delivering task recommendations should increase new editor activation rates.
- Hypothesis 0.1: Delivering task recommendations will increase new editor activation rates.
We believe that recommendations will give users a reason to return to Wikipedia for repeated editing sessions. By providing recommendations on demand, we give users a tool that offers them something engaging to do whenever they return. In addition, we may provide email recommendations, which will invite users to return to a recommendations-driven editing session.
- Hypothesis 0.2: Delivering task recommendations will increase new editor retention rates.
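Hypotheses 0.1 and 0.2 both compare a rate (activation or retention) between editors who receive recommendations and editors who do not. The source does not state which statistical test will be used; one common choice for comparing two rates is a two-proportion z-test, sketched below. The function name and all counts are hypothetical and serve only as an illustration.

```python
import math


def two_proportion_z_test(success_a, total_a, success_b, total_b):
    """Return (z, two-sided p-value) for the difference between two proportions."""
    p_a = success_a / total_a
    p_b = success_b / total_b
    # Pooled proportion under the null hypothesis of no difference.
    pooled = (success_a + success_b) / (total_a + total_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    z = (p_b - p_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts, for illustration only: activated editors / new editors.
control_activated, control_total = 300, 1000
treatment_activated, treatment_total = 345, 1000

z, p = two_proportion_z_test(control_activated, control_total,
                             treatment_activated, treatment_total)
print(f"z = {z:.2f}, p = {p:.3f}")
```

A positive z here means the treatment group has the higher rate. The same function could be applied to retention counts for Hypothesis 0.2.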
RQ 1: How does personalization modulate the effect of recommendations on user behavior?
We are curious about the potential for personalization in task recommendation. Presumably, new editors are more interested in certain topics than others, and their interest areas can be matched by observing their previous work. This is roughly the assumption behind User:SuggestBot, and it seems to work pretty well both in Wikipedia's topical areas and in geographical areas. Editors should be more likely to find an acceptable next edit via tasks that are filtered and ranked based on their interests, and they will therefore be more likely to make more edits.
- Hypothesis 1.1: Delivering personalized task recommendations will increase new editor activation rates more than non-personalized recommendations.
In past experiments, we did not see significant improvements in the retention of new editors when delivering non-personalized task recommendations. However, by providing personalized task recommendations, we may be giving users general hints about what work there is to do in Wikipedia. If the previous hypotheses hold, then editors who receive personalized recommendations will know that (1) there are articles to work on in Wikipedia that are relevant to their interests and (2) they will be directed towards them when they return to Wikipedia. We suspect that this will make editors more likely to return to Wikipedia to make future edits.
- Hypothesis 1.2: Delivering personalized task recommendations will increase new editor retention rates more than non-personalized recommendations.
RQ 2: How does the delivery mechanism modulate the effect of recommendations on user behavior?
Delivering recommendations automatically after a user saves an edit, in an immediately visible modal, puts next steps front and center for the user. This will increase the likelihood that users accept a recommendation, but it will also increase the likelihood that users are clicking in a more exploratory manner. Also, a single recommendation is less likely to present something extremely interesting to the user. Consequently, the post-edit recommendation will have more visibility but a lower edit rate.
- Hypothesis 2.1a: Recommendations delivered via a post-edit modal will increase the overall number of recommendations that are accepted, but reduce the rate of acceptance per recommendation set.
Research exploring how humans perform "activities" has challenged the notion of discrete tasks and advocated that activities are more of a flow of consciousness that exists in context and suffers from being interrupted. Therefore, interrupting a user in the middle of a complex activity flow might reduce their productive work and lead to both reduced productivity and lower task acceptance.
- Hypothesis 2.1b: Delivering recommendations in a post-edit modal will result in lower overall productivity than delivering recommendations on demand.
Making task suggestions a resource that can be drawn on as the user sees fit (via a flyout) decreases the visibility (and apparent availability) of recommendations overall, since an editor will need to identify the source of recommendations and perform an action (a click) to retrieve them. This is likely to result in both lower recommendation view rates and lower overall acceptance of recommendations. However, since viewing recommendations requires the user to engage with the recommender system, they should be more likely to be interested in identifying a new task at that time and therefore more likely to accept an appropriate task suggestion.
- Hypothesis 2.2: Delivering recommendations via a flyout will decrease the number of recommendations viewed and accepted overall, but will result in a higher rate of acceptance per recommendation set.
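Hypotheses 2.1a and 2.2 rely on two different measures: the overall number of recommendations accepted, and the rate of acceptance per recommendation set. The sketch below shows how the two can move in opposite directions. The record layout, the function name, and every number are hypothetical; they are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class RecommendationSet:
    """One set of recommendations offered to a user (hypothetical log record)."""
    viewed: bool    # did the user actually see the set?
    accepted: bool  # did the user accept a recommendation from the set?


def summarize(sets):
    """Return (sets viewed, sets accepted, acceptance rate per viewed set)."""
    viewed = [s for s in sets if s.viewed]
    accepted = sum(1 for s in viewed if s.accepted)
    rate = accepted / len(viewed) if viewed else 0.0
    return len(viewed), accepted, rate


# Hypothetical data: the modal is seen after every edit, the flyout only on demand.
modal = [RecommendationSet(True, i < 20) for i in range(100)]
flyout = [RecommendationSet(i < 30, i < 12) for i in range(100)]

for name, sets in (("post-edit modal", modal), ("flyout", flyout)):
    n_viewed, n_accepted, rate = summarize(sets)
    print(f"{name}: viewed={n_viewed} accepted={n_accepted} rate={rate:.2f}")
```

In this made-up data the modal yields more accepted recommendations overall, while the flyout yields a higher acceptance rate per viewed set, which is the pattern the two hypotheses predict.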
RQ 3: How does the number of recommendations presented modulate the effect on user behavior?
We expect there to be a trade-off between the number of recommendations presented to a user and users' ability to choose which task to accept. While we suspect that a large, high-quality recommendation set may increase users' ability to find an acceptable task, psychological research on choice overload suggests that choosing an item from recommendation sets containing many similar, attractive items can be a very difficult task. In other words, while including many recommendations in a set may increase the probability of an acceptable recommendation being present, it might also increase decision difficulty and reduce satisfaction.
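One side of this trade-off can be made concrete with a small calculation. As a simplifying assumption that is not part of the source, suppose each recommendation is independently acceptable to the user with the same probability. The chance that a set of n recommendations contains at least one acceptable item is then 1 - (1 - p)^n. The function name and the value of p below are hypothetical.

```python
def p_at_least_one_acceptable(p_single, set_size):
    """Probability that a set of `set_size` items contains an acceptable one,
    assuming each item is independently acceptable with probability `p_single`."""
    return 1 - (1 - p_single) ** set_size


# Hypothetical value: each recommendation is acceptable with probability 0.2.
for n in (1, 3, 5, 10):
    print(n, round(p_at_least_one_acceptable(0.2, n), 3))
```

This models only the presence of an acceptable recommendation. It says nothing about decision difficulty or satisfaction, which is the other side of the trade-off and the reason the two hypotheses below point in opposite directions.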
- Hypothesis 3.1a: Delivering many recommendations will increase acceptance and the rate of successful edits on recommended articles more than delivering one recommendation.
- Hypothesis 3.1b: Delivering a single recommendation will increase acceptance and the rate of successful edits on recommended articles more than delivering many recommendations.
- Additional potential research questions include:
- Does a content-based recommender system produce results that are relevant enough for users, or will an alternative methodology (like collaborative filtering) be necessary to produce significant improvement in activation and retention?
- Which topics are easiest to generate high-quality recommendations for? Which are the most difficult?
We will run a series of empirical studies designed both to address these hypotheses and to construct principles around which a task recommendations feature will be designed.
Qualitative evaluation of morelike
We plan to use the CirrusSearch morelike feature to personalize article recommendations for newcomers based on their most recent edit. In this study, we manually evaluated the system's ability to identify topically related articles based on the articles most commonly edited by English Wikipedia newcomers. We found that morelike's ability to identify topically similar articles was highly consistent for the top 10 results and usable for up to the top 50 results.
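For readers unfamiliar with morelike, the following is a minimal sketch of how such a query can be expressed through the MediaWiki search API, using the `morelike:` search keyword. The study does not describe how the queries were actually issued, so the endpoint, the helper name, the namespace restriction, and the example title are all assumptions made for illustration. The code only builds the request URL and makes no network call.

```python
from urllib.parse import urlencode

# Assumed endpoint: the English Wikipedia API.
API_URL = "https://en.wikipedia.org/w/api.php"


def morelike_query_url(title, limit=10):
    """Build a search API URL asking for articles similar to `title`."""
    params = {
        "action": "query",
        "list": "search",
        "srsearch": f"morelike:{title}",  # CirrusSearch similarity keyword
        "srlimit": limit,                 # number of results to request
        "srnamespace": 0,                 # assumption: article namespace only
        "format": "json",
    }
    return f"{API_URL}?{urlencode(params)}"


# Hypothetical example: the newcomer's most recent edit was to "Alan Turing".
print(morelike_query_url("Alan Turing"))
```

Requesting the top 10 results by default mirrors the range in which the evaluation found morelike most consistent; a larger `limit` could be passed to explore results further down the ranking.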
Our first experiment will address our most basic hypotheses about how personalized recommendations will increase editor activation and retention rates.
- Cosley, D., Frankowski, D., Terveen, L., & Riedl, J. (2007, January). SuggestBot: using intelligent task routing to help people find work in wikipedia. In Proceedings of the 12th international conference on Intelligent user interfaces (pp. 32-41). ACM.
- Priedhorsky, R., Masli, M., & Terveen, L. (2010, February). Eliciting and focusing geographic volunteer work. In Proceedings of the 2010 ACM conference on Computer supported cooperative work (pp. 61-70). ACM.
- Nardi, B. A. (Ed.). (1996). Context and consciousness: activity theory and human-computer interaction. MIT Press.
- Haynes, G. (2009). Testing the boundaries of the choice overload phenomenon: The effect of number of options and time pressure on decision difficulty and satisfaction. Psychology and Marketing, 26(3), 204-212.
- Bollen, D., Knijnenburg, B. P., Willemsen, M. C., & Graus, M. (2010, September). Understanding choice overload in recommender systems. In Proceedings of the fourth ACM conference on Recommender systems (pp. 63-70). ACM.