Research:WikiProject Recommendation

From Meta, a Wikimedia project coordination wiki
Aaron Halfaker
Jonathan Morgan
Duration:  2017-June – 2018-May
This page documents a completed research project.

The primary objective of this study is to build a recommender system for Wikipedia that helps WikiProjects identify and recruit new members, and that helps socialize new editors in Wikipedia. We will evaluate the effectiveness of different recommendation algorithms.


Online production communities, such as Wikipedia and GitHub, have become a popular means of knowledge creation. To help users collaborate more efficiently around their specific needs, many subgroups form within these large communities. WikiProjects are a prominent example: editors organize themselves to work together on specific topics in Wikipedia. However, many groups still struggle to find dedicated and productive contributors. At the same time, many editors, especially newcomers, have difficulty getting involved in the community.

Recommendation algorithms originated in commercial contexts, and in recent years many have been developed for communities or groups in domains such as TV programs and travel, to help users quickly identify what they might like among a large number of items. We apply the idea of recommendation algorithms to this group-user matching problem. More specifically, we will develop algorithms to help subgroups within larger communities identify and recruit relevant users who are likely to succeed in the group by contributing more and staying longer.

Research Question

  • Understand the effectiveness of recommendation algorithms on recruiting new editors for WikiProjects.
  • Evaluate the effectiveness of this intervention on engaging and retaining Wikipedia newcomers.



In this study, we will develop a recommendation system to help WikiProjects recruit new members. More specifically, the system will recommend Wikipedia editors to the organizers of WikiProjects, and the organizers will approach and recruit the recommended editors for their project. We will evaluate the effectiveness of the recommendations through the actions of project organizers and the responses of editors.

Candidate Editors to be Recommended to WikiProjects

First, we need to decide which editors are candidates for recommendation to WikiProjects. Two distinct groups of editors are particularly interesting and worth investigating: experienced editors and newcomers in Wikipedia. Each group leads to a different emphasis for the study.

1. Experienced Editors

One pool of candidate editors is the experienced editors in Wikipedia, for instance, editors who have made more than 100 edits in Wikipedia (according to the Research Term of Wikimedia Foundation). These editors know the basics of Wikipedia and are generally dedicated. If they can be recruited successfully, they are expected to have a direct impact on the projects. In addition, because these editors have made enough edits to provide sufficient data points for analysis, we can develop more sophisticated algorithms to increase recommendation accuracy and produce a satisfying match for both the project and the editor. The population of experienced editors, however, is much smaller than that of Wikipedia newcomers, so recruiting them might have less impact on the community as a whole than targeting newcomers. The goal of targeting this cohort is to understand the effectiveness of different algorithms.

2. Newcomers

In Wikipedia, only 10% of editors stay and continue to make edits after their first edit. How to effectively socialize and retain newcomers has been an enduring problem for the community. Consider a scenario in which a project organizer reaches out to newcomers after their first couple of edits and welcomes them to participate in a project they might like. Guided by the project's members, they learn more about how to edit in Wikipedia, continue to contribute to the project and the community, and finally become dedicated Wikipedians. Wouldn't this be great? If this intervention works, it will be a huge contribution to the community because of the large population of Wikipedia newcomers. Targeting Wikipedia newcomers is therefore an interesting problem, even though many new editors will not continue to edit, which may waste the effort of the project organizers who do the recruiting. In addition, because we do not have sufficient data points for newcomers who have made only a couple of edits, the recommendation algorithms will be very simple and possibly inaccurate; for instance, a newcomer could be recommended to the projects of the articles they edited. In summary, the goal of targeting this cohort is to understand the effect of recommendations as an intervention.

Targeted WikiProjects and Project Organizers

1. Targeted WikiProjects

We will conduct our study in about 10 WikiProjects. Each project should be relatively active and well developed so that recruited editors can be warmly socialized and sufficiently guided. We will choose the most active WikiProjects according to the 2016 Wikipedia database report (or a more recent list, if we can obtain one).

Here is the list of candidate projects: WikiProject Film, WikiProject Biography, WikiProject Women in Red, WikiProject U.S. Roads, WikiProject Oregon, WikiProject Opera, WikiProject Politics, WikiProject Ethnic Groups, WikiProject Novels, WikiProject Plants, etc.

2. Project organizers who recruit editors

We will recruit about 20 project organizers (more if needed) who are experienced in their projects to participate in our study, because we want them to act and recruit on behalf of the project.

To recruit the project organizers, we will post a message on the talk page of each targeted project to introduce our study and ask for 2-3 volunteer participants from each project.

Recommendation Algorithms

We propose four recommendation algorithms, each with a different rationale.

1. Topic-based Recommendation

Rank editors by the alignment between the editor's topic interests and the project's topic coverage. This approach is similar to classic content-based recommendation.
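
As an illustration only, topic alignment could be scored with cosine similarity between sparse topic vectors. The following is a minimal sketch under stated assumptions: the vector representation (topic to edit count), the function names, and the sample data are all hypothetical and are not taken from the study's system.

```python
import math

def cosine(a, b):
    """Cosine similarity between two sparse vectors (dict: topic -> weight)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def rank_by_topic(editor_topics, project_topics):
    """Return editor names, best topic match first (ties broken by name)."""
    scores = {e: cosine(v, project_topics) for e, v in editor_topics.items()}
    return sorted(scores, key=lambda e: (-scores[e], e))

# Hypothetical data: edit counts per topic for three editors and one project.
editors = {
    "EditorA": {"film": 8, "biography": 2},
    "EditorB": {"plants": 9, "film": 1},
    "EditorC": {"film": 5, "opera": 5},
}
project = {"film": 10, "biography": 1}

print(rank_by_topic(editors, project))
```

Cosine similarity ignores how many edits an editor has made overall, so a small but focused editing history can still rank highly.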

2. Member-based Recommendation

Rank editors by their communication with project members. This approach is based on the editor's social relationships in Wikipedia.
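
One simple way to picture this ranking is to count talk-page messages exchanged between a candidate and current project members. The sketch below assumes messages are available as (sender, recipient) pairs; that data layout, the function name, and the sample values are hypothetical, and the study may measure communication differently.

```python
from collections import Counter

def rank_by_member_contact(messages, members):
    """Count messages between non-members and project members.

    messages: iterable of (sender, recipient) talk-page posts.
    members:  set of current project member names.
    Returns (editor, message_count) pairs, most contact first.
    """
    contact = Counter()
    for sender, recipient in messages:
        if sender in members and recipient not in members:
            contact[recipient] += 1
        elif recipient in members and sender not in members:
            contact[sender] += 1
    return sorted(contact.items(), key=lambda kv: (-kv[1], kv[0]))

# Hypothetical message log.
members = {"Member1", "Member2"}
messages = [
    ("Member1", "EditorA"),
    ("EditorA", "Member2"),
    ("EditorB", "Member1"),
    ("EditorC", "EditorA"),   # no member involved, ignored
    ("Member1", "Member2"),   # member to member, ignored
]

print(rank_by_member_contact(messages, members))
```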

3. Rule-based Recommendation

Rank editors by their edits of articles claimed within the scope of the project.
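
A minimal sketch of this rule, assuming edits are available as (editor, article) pairs and the project's scope as a set of article titles. The names and sample data are hypothetical, and excluding existing members is an added assumption.

```python
from collections import Counter

def rank_by_in_scope_edits(edits, project_articles, members=frozenset()):
    """Count each non-member's edits to articles within the project's scope.

    edits:            iterable of (editor, article_title) pairs.
    project_articles: set of article titles claimed by the project.
    Returns (editor, edit_count) pairs, most in-scope edits first.
    """
    counts = Counter(
        editor
        for editor, article in edits
        if article in project_articles and editor not in members
    )
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))

# Hypothetical edit log and project scope.
scope = {"Vertigo (film)", "Casablanca (film)"}
edits = [
    ("EditorA", "Vertigo (film)"),
    ("EditorA", "Casablanca (film)"),
    ("EditorB", "Vertigo (film)"),
    ("EditorB", "Oak"),                  # outside the project's scope
    ("Member1", "Casablanca (film)"),    # already a member
]

print(rank_by_in_scope_edits(edits, scope, members={"Member1"}))
```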

4. User-user Collaborative Filtering Recommendation

Rank editors by the similarity of their article editing to that of the project members. This approach is derived from classic collaborative filtering recommendation.
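
To make the idea concrete, the sketch below scores each candidate by the average Jaccard similarity between the set of articles they edited and the set edited by each project member. Both the similarity measure and the averaging are assumptions chosen for brevity; the study does not specify them, and all names and data are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def rank_by_member_similarity(candidate_articles, member_articles):
    """Rank candidates by mean Jaccard similarity to the project members.

    Both arguments map an editor name to the set of articles they edited.
    """
    scores = {}
    for editor, articles in candidate_articles.items():
        sims = [jaccard(articles, m) for m in member_articles.values()]
        scores[editor] = sum(sims) / len(sims) if sims else 0.0
    return sorted(scores, key=lambda e: (-scores[e], e))

# Hypothetical editing histories.
member_articles = {
    "Member1": {"Casablanca", "Vertigo", "Psycho"},
    "Member2": {"Vertigo", "Rear Window"},
}
candidate_articles = {
    "EditorA": {"Vertigo", "Psycho"},
    "EditorB": {"Oak", "Maple"},
    "EditorC": {"Rear Window", "Oak"},
}

print(rank_by_member_similarity(candidate_articles, member_articles))
```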

Experimental Design

1. Within-subject Design

We will conduct a within-subject experiment to understand the effectiveness of different recommendation algorithms. More specifically, each project organizer will see recommendations from all the algorithms.

2. Multiple Rounds of Recommendations

To simulate a real-world situation, the system will deliver 3-4 batches of recommendations to the project organizers who participate in our study, with 12-15 recommended editors in each batch. We will leave a gap of 1-2 weeks between batches to give project organizers time to absorb and act on our recommendations.
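
One possible way to assemble such batches, so that every organizer sees recommendations from all algorithms, is a round-robin merge of the per-algorithm rankings followed by slicing into fixed-size batches. This is a sketch under stated assumptions: the merging strategy, the function names, and the algorithm labels are hypothetical and not a description of the study's system.

```python
from itertools import zip_longest

def interleave(ranked_lists):
    """Round-robin merge of per-algorithm rankings.

    ranked_lists: dict mapping algorithm name -> ranked list of editor names.
    Returns (editor, algorithm) pairs; an editor recommended by several
    algorithms is kept only at their first occurrence.
    """
    names = list(ranked_lists)
    seen = set()
    merged = []
    for row in zip_longest(*ranked_lists.values()):
        for name, editor in zip(names, row):
            if editor is not None and editor not in seen:
                seen.add(editor)
                merged.append((editor, name))
    return merged

def make_batches(merged, batch_size=12, max_batches=4):
    """Split the merged list into at most max_batches batches of batch_size."""
    batches = [merged[i:i + batch_size]
               for i in range(0, len(merged), batch_size)]
    return batches[:max_batches]

# Hypothetical rankings: 15 editors from each of the four algorithms.
ranked = {
    algo: [f"{algo}-editor{i}" for i in range(15)]
    for algo in ("topic", "member", "rule", "cf")
}

batches = make_batches(interleave(ranked))
print(len(batches), [len(b) for b in batches])
```

With four algorithms and a batch size of 12, each batch in this example contains three editors from each algorithm.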

3. Recruiting Tips

To help project organizers recruit the recommended editors, we will provide some general recruiting tips, for instance, encouraging project organizers to write personalized welcome messages, make specific task requests, or provide resources for getting started. In addition, we will provide explanations of our recommendations to help project organizers decide whether to recruit each editor.

Evaluation Plans

To evaluate the effectiveness of the algorithmic recommendations as an intervention, we will evaluate each algorithm from the perspectives of the project organizers and the editors:

1. Project Organizers

1.1 Survey for project organizers on each recommended editor:

  • Q1: Do you think this editor would be a good fit for the project? [5-point Likert scale]
  • Q2: When would you like to recruit this editor? [single choice] - Now / Later / Never
  • Q3: Why would you like this recommended editor, or why not? [open ended]

1.2 Reach-out Rate (within 3 days): whether the project organizer contacts the recommended editor (i.e., posts a recruiting message on the editor's user talk page)

2. Editors

2.1 Reactions on Recruitment

  • Response Rate (within one/two weeks): whether the editor responds to the recruiting message from the project organizer
  • Participation Rate (within one/two weeks): whether the editor participates in the project (i.e., edits project pages or related articles)
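
The reach-out, response, and participation rates could be computed from timestamped records along the following lines. This is a minimal sketch: the record fields, the function names, and the choice to use contacted editors as the denominator for the response and participation rates are assumptions, not details given in the plan.

```python
from datetime import datetime, timedelta

def within(start, event, days):
    """True if `event` happened no more than `days` days after `start`."""
    if start is None or event is None:
        return False
    return timedelta(0) <= event - start <= timedelta(days=days)

def recruitment_rates(records, reach_days=3, follow_up_days=14):
    """Compute the three rates from one record per recommended editor.

    Each record is a dict with datetime (or None) values under the keys
    'recommended', 'contacted', 'responded', and 'participated'.
    """
    contacted = [r for r in records
                 if within(r["recommended"], r["contacted"], reach_days)]
    responded = [r for r in contacted
                 if within(r["contacted"], r["responded"], follow_up_days)]
    participated = [r for r in contacted
                    if within(r["contacted"], r["participated"], follow_up_days)]

    def share(part, whole):
        return len(part) / len(whole) if whole else 0.0

    return {
        "reach_out_rate": share(contacted, records),
        "response_rate": share(responded, contacted),
        "participation_rate": share(participated, contacted),
    }

# Hypothetical records for four recommended editors.
day0 = datetime(2017, 7, 1)
def d(n):
    return day0 + timedelta(days=n)

records = [
    {"recommended": day0, "contacted": d(1), "responded": d(3), "participated": d(5)},
    {"recommended": day0, "contacted": d(2), "responded": None, "participated": d(20)},
    {"recommended": day0, "contacted": d(5), "responded": d(6), "participated": None},
    {"recommended": day0, "contacted": None, "responded": None, "participated": None},
]

print(recruitment_rates(records))
```

Passing `follow_up_days=7` gives the one-week variant of the response and participation rates.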

2.2 Performance Change

  • Activity changes before and after being recruited by the project organizers (i.e., contribution and retention in Wikipedia)
  • Activity differences compared to similar editors who were not recruited (i.e., contribution and retention in Wikipedia)


  • June 2017: Collect more opinions and feedback from the Wikipedia community, and develop the system.
  • July - August 2017: Develop the system and conduct our study.
  • August - September 2017: Improve the system and collect data.

Policy, Ethics and Human Subjects Research

The WikiProject Recommendation study is being evaluated for exemption by the IRB at the University of Minnesota.