Grants:Project/MSIG/Welcoming Newcomers Research Initiatives: Analysis, Tools, and Implementation

From Meta, a Wikimedia project coordination wiki
Welcoming Newcomers
Research high-impact ideas to support editor retention. Develop and verify hypotheses, create tools and prototypes based on them, and engage with Wikimedia affiliates to obtain feedback and advance towards their large-scale implementation.
target: Newcomers of the Movement Research community
start date: 23 June 2022
start year: 2022
end date: 31 January 2023
end year: 2022
budget (USD): $24,800
grant type: research / individual
marcmiquel (_AT_)

Applications are not required to be in English. Please complete the application in your preferred language.

Project Goal

What will be the outputs of your project and how will those outputs contribute to advancing a specific Movement Strategy Initiative?

This project aims to develop research and tools for designing more effective initiatives to welcome newcomers. The ultimate objective is to validate these initiatives and advance their implementation.

To that end, this project will research areas that have the potential for high impact and are supported by previous research, but are currently overlooked in the Movement. Specifically, the project “Welcoming Newcomers” proposes studying these two research areas:

  1. Analyzing the policy & guidelines pages according to different criteria to generate valuable information to help editors keep them up-to-date or delete them.
  2. Understanding the impact of homophily in peer interactions on editor retention to improve the effectiveness of current mentorship features, programs, or events.

Additionally, we will open a discussion space to collect ideas on how to empower all Wikimedians to improve the User Experience, because we understand that changes will only occur if community members, technical contributors, and WMF product team contributors are aligned. We will first collect the opinions of different types of stakeholders (from community editors and affiliate members to technical contributors). Then, based on their points of view, we will generate one practical proposal (a framework) to bring the stakeholders together and streamline the implementation of changes. This framework will be presented in the form of a document (paper) / Meta-wiki page, and a request for support from stakeholders.

Editor retention is the product of how Wikipedia welcomes newcomers. This relates to bureaucratic and social aspects (research areas 1 and 2) and the usability and UX of the user interface/tooling (discussion space). This project will add to the existing initiatives like those carried out by the Growth Team at the Wikimedia Foundation.


What specific Movement Strategy Initiative does your project focus on and why? Please select one of the initiatives described here

The project aims at supporting the growth of the Wikimedia communities by researching and designing solutions for more effective ways to welcome newcomers.

Scope and methodology

Even though a subset of languages will be preferred for the initial analyses (Catalan, Spanish, Italian, Polish, and Romanian, chosen for the team’s access to the communities and languages), the expectation is to be able to generate insights or tools taking into account all the languages (as language-agnostic as possible). Furthermore, the problem of editor retention is common to all Wikipedia language editions. Therefore, we believe that the study results should not be limited to a group of them.

The central methodology of the project will be computational data analysis. However, to explore the use of specific aspects to measure and validate the final results (applicable insights and a tool), other methods such as interviews and focus groups will be employed.

This research project will be carried out in collaboration between Wikimedia researchers, Wikimedia affiliates’ staff, and community members to make a scientific contribution that addresses the community needs and produces functional outcomes.

Wikimedia Strategy 2030

This project relates to several Wikimedia Strategy 2030 recommendations and initiatives[1].

Recommendation 2. Improve User Experience

           9. Methodology to improve the Wikimedia platform UX research, design, testing, and community engagement.

           9. Community engagement around product design and UX.

           12. Peer-to-peer spaces

Recommendation 5. Coordinate Across Stakeholders

           29. Enhance communication and collaboration capacity with partners and collaborators

           30. Technology Council (for improved communication, coordination, and support)

Recommendation 7. Manage Internal Knowledge

           34. Facilitate a culture of documentation

Recommendation 10. Evaluate, Iterate, and Adapt

           42. Monitoring, evaluation, and learning at all levels with support and mutual accountability

Project Background

When do you intend to begin this project and when will it be completed?

Start date: May 2022
End date: November 2022

Where will your project activities be happening?


What specific challenge will your project be aiming to solve? And what opportunities do you plan to take advantage of to solve the problem?

The challenges that this project aims to address:

Editor retention decline. Even though there have been several attempts to improve editor retention, it has not been possible to reverse the decline in any Wikipedia language edition. In addition, the complexity of this issue and its dependence on multiple factors such as user interface, policies & help pages, and social interactions have made it challenging to tackle all at once.

Improving each of these factors requires research, design, and coordination to execute and implement changes. And it is precisely in coordination where there is an extra challenge, given the need for awareness, common understanding, and cooperation between the different Wikimedia actors (communities, affiliates, and Wikimedia Foundation), which sometimes stalls or slows down processes.

In this sense, this project aims to take advantage of the independent stance of the researcher to work closely with each of the three actors, and especially with Wikimedia affiliates. Their strategic position, their capacity to plan, and the fact that they are composed of community members grant them a special influence on the implementation of changes.

These are the opportunities this project aspires to seize:

State of the Art. Each research area builds on a critical mass of research studies, which makes it possible to frame the problems with mature methods and to be specific about the hypotheses.

Agile Approach. The work in each research area will follow an agile development approach to both (1) publish the reports and (2) validate the applicability of the research insights with communities.

Prototyping solutions. The work in each research area will result in the practical implementation of the insights (a prototype) to ensure their usefulness and trigger the next steps.

Independence. Each research area will be perceived as an effort to improve newcomer retention. Being an independent Wikimedia researcher allows taking a more detached position from the different stakeholders to progress on implementing the various proposals.

Does this project aim to apply one of the examples shared in the call for grants and if so which one?
  • Raising awareness and opening the debate on valuable areas.
  • Producing reports using quantitative and qualitative data.
  • Transforming scientific insights into recommendations and tools.

Project Activities

What specific activities will be carried out during this project? Please describe the specific activities that will be carried out during this project.
  • Research on three ideas that can have a high impact on newcomer retention.
  • Transform the research insights into concrete solutions.
  • Disseminate state of the art around these topics and the research results.
  • Discuss the results and prototypes to collect feedback on them.
  • Invite affiliates and other stakeholders to deploy these initiatives.

We propose two research areas, each following the same seven-step process:

1.     State of the art

2.     Community engagement (exploration)

3.     Data collection and analysis

4.     Results / Prototype

5.     Community engagement (validation)

6.     Report / Presentation

7.     Next steps with affiliates & stakeholders.

Research Areas

Research Area 1: “Policies & Guidelines Clean-up”

Research Goal

“Analyzing the policy & guidelines pages according to different criteria to be able to generate valuable information to help editors keep them up-to-date or delete them.”

Problem definition

Calcification of policies and help pages has been detected as one of the factors that have led to a community decline in terms of the number of active editors since 2009. Given their long-term involvement and the lack of general indicators, many Wikipedians may not be aware of which policy and help pages have more readership, of their level of clarity, and the last time they were updated.

Initial hypotheses

- There is a long tail of pages that receive no attention from editors but obtain pageviews.

- The editors who have updated the Policies & Guidelines pages tend to be veteran editors.


Methods

- Data analysis and data visualization for 300 languages to analyze the pages.

- Focus group to understand the needs and potential criteria for outdated pages.

Applicability / Solution proposal

- Creation of a tool (Dashboard) to show the most read and most and least edited policy & guidelines pages.

- Creation of guidelines to support the improvement and clean-up of pages to increase their readability.

- Support Wikimedia Affiliates in including actions to clean policies and create other resources.

Why is it high impact

A good analysis/visualization that helps editors identify outdated policy/help pages can support the creation of “routines” to clean policy pages, update the important ones, and ultimately test their effectiveness on newcomers. As a rule of thumb, the most-read pages should be the tidiest and the most thoroughly revised.

Generally, improving the most-read policy/help pages has the potential to prevent newcomers from leaving the project after their first edits. The data is already available in the form of dumps, but it is not exposed in an accessible form such as a dashboard.
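To illustrate the kind of ranking such a dashboard could expose, here is a minimal sketch in Python. It is not the project's tool: the `PolicyPage` record, the `neglected_pages` function, the thresholds, and the sample figures are all hypothetical, and the sketch assumes that per-page pageview and edit counts have already been extracted from the dumps.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class PolicyPage:
    # Hypothetical record; the real data would come from dumps or APIs.
    title: str
    pageviews: int   # views during the observation window
    edits: int       # edits during the same window
    last_edit: date  # date of the most recent edit


def neglected_pages(pages, min_views=1000, max_edits=0):
    """Return pages that are read but not maintained, most-read first.

    The thresholds are illustrative assumptions, not values from the proposal.
    """
    flagged = [p for p in pages if p.pageviews >= min_views and p.edits <= max_edits]
    return sorted(flagged, key=lambda p: p.pageviews, reverse=True)


# Made-up sample data, for illustration only.
pages = [
    PolicyPage("Policy A", 12000, 0, date(2015, 3, 1)),
    PolicyPage("Policy B", 300, 0, date(2012, 7, 9)),
    PolicyPage("Guideline C", 8000, 14, date(2022, 4, 20)),
    PolicyPage("Help D", 2500, 0, date(2018, 11, 5)),
]

for page in neglected_pages(pages):
    print(page.title, page.pageviews, page.last_edit.isoformat())
```

Sorting by pageviews follows the rule of thumb stated above: among pages that nobody edits, the most-read ones would be the first candidates for review. Which thresholds are meaningful is exactly what the focus groups would need to establish.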

Strategy recommendation

This research will support Rec. 2. “Easy-to-find and easy-to-understand resources for newcomers, including onboarding media and guiding interfaces helping them independently learn and navigate.”


Initially, the analyses will be focused on Catalan, Spanish, Italian, Polish, and Romanian. Experiments will then be extended to other languages. Once the results/prototype are ready, we will work with Wikimedia affiliates to include in their Annual Plans events/actions/plans related to the cleaning/improvement of policy/help pages.

Follow-up tasks

After discussing the creation of these routines with some affiliates, the follow-up tasks will expand their implementation to all Wikipedia language communities.

Update 18 June 2022: We will dedicate an initial phase of the research to study the variability of cases in policy/help page development in order to detect special cases in which there may be some sort of policy/help page breakdown (e.g., this has been reported as the current case of Croatian and Armenian Wikipedias), edit wars, or any anomaly in general. These Wikipedia language editions will be taken into special consideration to develop this study.

Research Area 2: “Homophily in Interactions”

Research Goal

“Understanding the impact of homophily in peer interactions on editor retention with the aim of improving the effectiveness of current mentorship features, programs, or events.”

Problem definition

Adequate online and offline socialization of new contributors is vital for the long-term sustainability of Wikipedia. Editors invited to Teahouse are retained at a higher rate. The Growth team is working on features to incentivize mentor-mentee relations. Gender Gap affiliates organize events in which they mentor newcomer women editors. Nonetheless, mentorship is not widely spread yet, and communities are not as diverse as they could be. Editor retention could benefit from more and more effective mentorship.

Initial hypotheses

  • Receiving messages on one’s user talk pages from other editors may increase the editor retention of certain profiles of newcomers.
  • The mentor-mentee relations with editors having the same gender or same geographic provenance may be more effective in increasing editor retention.
  • Edit-a-thons and in-person events increase the potential for editor retention and overall contribution in an editor’s lifetime.


Methods

  • Focus group.
  • Data analysis.

Applicability / Solution proposal

  • Introducing the criterion of diversity in mentorship programs and features.
  • Providing some guidelines and prototypes to show how the conclusions could ignite further changes.
  • Suggesting some aspects to improve in-person events and edit-a-thons.

Why is it high impact

Retention rate and diversity are two related areas that can benefit significantly from any improvement. Knowing whether homophilous interactions (messages between editors with some characteristics in common) can lead to retention or not may be precious for designing more effective mentorship programs and events.
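As a rough illustration of how this question could be approached in the data analysis, the sketch below compares retention between newcomers whose first message came from an editor sharing a given characteristic and those whose first message did not. The record layout, the `retention_by_homophily` function, and the sample rows are hypothetical assumptions, and the numbers are made up rather than results.

```python
from collections import defaultdict


def retention_by_homophily(interactions, attribute):
    """Retention rate for homophilous vs. heterophilous first contacts.

    Each interaction is a dict with hypothetical keys: "newcomer" and
    "sender" (dicts of editor characteristics) and "retained" (bool).
    """
    counts = defaultdict(lambda: [0, 0])  # group -> [retained, total]
    for row in interactions:
        same = row["newcomer"][attribute] == row["sender"][attribute]
        group = "homophilous" if same else "heterophilous"
        counts[group][0] += int(row["retained"])
        counts[group][1] += 1
    return {group: retained / total for group, (retained, total) in counts.items()}


# Made-up sample data, for illustration only; not a research result.
interactions = [
    {"newcomer": {"region": "ES"}, "sender": {"region": "ES"}, "retained": True},
    {"newcomer": {"region": "ES"}, "sender": {"region": "ES"}, "retained": True},
    {"newcomer": {"region": "PL"}, "sender": {"region": "IT"}, "retained": False},
    {"newcomer": {"region": "IT"}, "sender": {"region": "PL"}, "retained": True},
]

rates = retention_by_homophily(interactions, "region")
print(rates)
```

A raw comparison like this does not establish causality. An actual study would need a precise definition of retention and controls for confounding factors such as newcomer activity level or the type of message received.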

Strategy recommendation

This research would support Rec. 2. “Spaces that allow finding peers with specific interests, roles, and objectives along with communication channels to interact, collaborate and mentor each other."


Initially, the analyses will focus on Catalan, Spanish, Italian, Polish, and Romanian, since these are the languages spoken by the project participants. Experiments will then be extended to other languages.

Given the interest of the Hausa User Group, we will pay attention to Hausa along with other African languages (Igbo, Yoruba, and Twi).

Wikimedia Affiliates and Growth Team are the primary stakeholders. We believe this may be interesting also for gender groups, given their efforts in bridging the community gender gap.

Follow-up tasks

After testing the hypotheses, there will be sessions for discussing the results with Affiliates and with the Growth team to benefit from the conclusions.

Other research hypotheses:

Our aim in this second research area, “Homophilous interactions”, is to study the impact of social interactions through different hypotheses. At the same time, we believe this research needs to be flexible and consider other parallel questions that can be useful to stakeholders. For example, we have already detected an interest in measuring retention taking into account edits across languages (multilingualism), or the impact of becoming a member of an Affiliate on long-term engagement. If you have an idea about a social aspect that may matter to retention, we could consider studying it.

Discussion space: How Can We Empower Wikimedians to Improve UX?


We want to collect ideas on how to empower all Wikimedians to improve the User Experience and ultimately generate a framework that helps align the Wikimedia stakeholders, streamline the discussions, and move forward with the implementation of changes required for a better UX.

Context for the discussion

User Interface changes only occur if community members, technical contributors, and WMF product team contributors are aligned. However, not everyone has the same point of view because not everyone shares the same background and has the same understanding of what matters to the user experience.

This depends on factors such as:

  • Awareness of the usability and impact of tools/extensions on specific user profiles (e.g., Visual Editor on newcomers).
  • Participation in the past discussions that led to the implementation of tools/settings over the 20 years of life of Wikipedia.
  • Involvement in the decision-making process to implement a specific extension/feature/setting that is important to the user experience.

While these are three key aspects that contribute to the misalignment between actors, we believe that other bottlenecks exist that prevent a more efficient and fluent deployment of user interface changes aimed at all kinds of Wikimedians. We plan to uncover them to make one practical proposal (a framework) to bring the stakeholders together and streamline the implementation of changes.

This framework will be presented in the form of a document (paper) / Meta-wiki page, and a request for support from stakeholders.


Methods

- Interviews with a representative group of stakeholders (editors, technical contributors, designers, etc.).

- Analysis of the available documentation (Village pump discussions, Phabricator tickets, etc.) related to user interface changes and configuration deployment.

- Idea generation through an iterative process.

- Focus group discussions on the different solutions that emerge from the opinions.

The discussions will try to answer the three following key questions:

  1. Does every stakeholder have all the information to promote a good user experience? If not, what information are they missing?
  2. Do the processes to decide on how to improve the user experience involve all relevant actors? If not, who is missing and where?
  3. Does everyone share the same interest in improving the user experience? If not, what interests are divergent?

Strategy recommendation

This research would support:

  • Rec. 2. “Involve representatives of designers, technical developers, communities, and user profiles in iterative user experience (UX) design, research and result dissemination processes to raise awareness and enable better decision-making.”
  • Rec. 2. “Documentation and standards to capture platform functionalities for evaluation, learning, and improvement.”
  • Rec. 7. "Create clear documentation regarding infrastructure development and scalability to engage more Movement stakeholders transparently.”


Wikimedia Affiliates and WMF Growth Team. We will devote special attention to Catalan, Polish and Italian Wikipedia.

Follow-up tasks

The framework resulting from this discussion space will be shared with stakeholders to define its implementation.


Dissemination and Participants

How do you intend to keep communities updated on the progress and outcomes of the project? Please add the names or usernames of these individuals responsible for updating the community
  1. We will grow the number of participants in the project (initially, there are 4) to ensure that we cover the community needs in the initial stage of the research and validate the final results.
  2. We will generate final reports and documentation disseminated through different community channels, including the village pump, IRC, mailing list, and Social Media.
  3. We will participate in Wikimedia conferences, creating sessions to raise awareness of what we can do to “welcome newcomers,” discuss research results, and advance the implementation of solutions. Some of these conferences are Wikimania, Wikimedia CEE, WikiArabia, and WikiIndaba.
  4. We will disseminate project results, not only in the form of text, but also in video and graphic documentation.

Who will be responsible for delivering on this project and what are their roles and responsibilities?

This project will be led by Dr. Marc Miquel-Ribé (marcmiquel), a Wikimedia researcher with long experience in all phases of project development, data analysis, and community engagement. He has been involved in previous research projects, both independently and in collaboration with the Wikimedia Foundation research team. He is a long-term Wikimedian involved in the Catalan Wikipedia and in the Wikimedia Strategy 2030 process, and a member of Amical Wikimedia, among other projects.

Other project participants will be part of the team, supporting the research and dissemination in a volunteer capacity:

  • Research and prototyping
    • Dr. David Laniado (Sdivad) as research advisor. He has published over 20 academic papers on different aspects of social interactions in Wikipedia. He is a co-creator of the Contropedia platform for the analysis and visualization of controversies in Wikipedia articles.
    • Rita Ho (RHo (WMF)) as design/UX advisor. She joined Wikimedia in March 2016 and works mainly as the designer on the Growth team, as well as leading design direction on other products supported by the Wikimedia Design team.
  • Discussions, community engagement, and applicability

The team will be crucial in ensuring alignment between the research and the community/research needs, as well as the dissemination and validation of the applicability of the insights/prototypes.

While we kept this list short because of the data-oriented nature of the research, anyone is invited to participate in the discussions, whether in the design of the research studies or in their dissemination. Simply add your name here.

Additional information

If your activities include community discussions, what is your plan for ensuring that the conversations are productive? Provide a link to a Friendly Space Policy or UCoC that will be implemented to support these discussions.
If your activities include the use of paid online tools, please describe what tools these are and how you intend to use them.
Do your activities include the translation of materials, and if so, in what languages will the translation be done? Please include details of those responsible for making the translations.
Are there any other details you would like to share? Consider providing rationale, research or community discussion outputs, and any other similar information, that will give more context on your proposed project.


We have proposed two activities/research areas, each following the same seven-step process. Below, we detail the outcomes of each step.

1.     State of the topic

○       Meta-wiki page including the state of the topic for discussion.

○       Visual presentation of the state of the topic to share among communities.

2.     Community engagement (exploration)

○       Presentations including the state of the art and the research results.

○       Focus group session report published.

3.     Data collection

○       Data generated is available as dumps/databases in a Wikimedia server for further research.

○       Code in GitHub.

4.     Results / Prototype

○       Specific design proposals and prototypes that enable exploring the functionalities.

i.         Mock-up in commons / meta-wiki page.

ii.         Functional prototype in

5.     Community engagement (validation)

○       Focus group session report published. Community discussion feedback report with perceptions and validation.

6.     Report / Presentation

○       Slides / Video support generated in commons.

○       Research report and scientific paper submitted to a research conference or an indexed academic journal.

7.     Next steps with affiliates & stakeholders.

○       Guidelines for Stakeholders to implement these and other initiatives.


After your activities are complete, we would like to understand the draft implementation plan for your community.

You will be required to prepare a document detailing this plan around a movement strategy initiative. This report can be prepared through Meta-wiki using the Share your results button on this page. The report can be prepared in your language, and is not required to be written in English.


In this report, you will be asked to:

  • Provide a link to the draft implementation plan document or Wikimedia page
  • Describe what activities supported the development of the plan
  • Describe how and where you have communicated your plan to relevant communities.
  • Report on how your funding was spent

Your draft implementation plan document should address the following questions clearly:

  • What movement strategy initiative or goal are you addressing?
  • What activities will you be doing to address that initiative?
  • What do you expect will happen as a result of your activities? How do those outcomes address the movement strategy initiative?
  • How will you measure or evaluate your activities? What tools or methods will you use to evaluate your activities?

To create a draft implementation plan, we recommend the use of a logic model, which will help you and your team think about goals, activities, outcomes, and other factors in an organized way. Please refer to the following resources to develop a logic model:

Please confirm below that you will be able to prepare a draft implementation plan document by the end of your grant: Of course.

Optionally, you are welcome to include other information you'd like to share around participation and representation in your activities. Please include any additional outcomes you would like to report on below: See above other outcomes.

Budget and timeline

How will you use the funds you are requesting? List bullet points for each expense. Don’t forget to include a total amount, and update this amount in the Probox at the top of your page too!

The budget of the project is dedicated to covering the different research/development/dissemination activities.

Estimated total gross of 1100 hours = $24,800.

Around 30% of this budget corresponds to taxes according to the Spanish legislation.

Research phase                                  Researcher hours   Timeline
Initial research                                100 hours          May to June
Research area 1. Policy & Guidelines Clean-up   300 hours          May to August
Research area 2. Homophily Interactions         400 hours          August to November
Discussion Space/Framework for UX               200 hours          September to November
Reporting and further research areas            100 hours          December
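
As a quick consistency check on the figures above, the following sketch adds up the phase hours and derives the implied gross hourly rate. The variable names are illustrative, and treating the "around 30%" tax share as exactly 30% is an assumption made only for this calculation.

```python
# Hours per phase, as listed in the budget above.
phase_hours = {
    "Initial research": 100,
    "Research area 1. Policy & Guidelines Clean-up": 300,
    "Research area 2. Homophily Interactions": 400,
    "Discussion Space/Framework for UX": 200,
    "Reporting and further research areas": 100,
}
total_budget_usd = 24_800
tax_share = 0.30  # assumption: "around 30%" treated as exactly 30%

total_hours = sum(phase_hours.values())
gross_hourly_rate = total_budget_usd / total_hours
estimated_taxes = total_budget_usd * tax_share
estimated_net = total_budget_usd - estimated_taxes

print(f"Total hours: {total_hours}")
print(f"Gross hourly rate: ${gross_hourly_rate:.2f}")
print(f"Estimated taxes: ${estimated_taxes:,.0f}")
print(f"Estimated net: ${estimated_net:,.0f}")
```

The phase hours add up to the stated 1100 hours, which corresponds to a gross rate of roughly $22.55 per hour, or about $17,360 net of taxes under the 30% assumption.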

As said before, each research area comprises activities such as initial research into the “state of the art”, engagement with team members and other stakeholders to collect relevant needs/requests, data retrieval and processing, data analysis and prototype generation, engagement with team members and other stakeholders for validation of the insights’ applicability and prototype, reporting and presentations in conferences, and further engagement with stakeholders for next steps.

Although the different phases are presented sequentially, in practice they may overlap, either because a meeting to start another research area can be organized in the meantime or because of the nature of the tasks (e.g., data processing operations sometimes take days or weeks). In any case, research outputs and outcomes will be published throughout the project period.

Update 18 June 2022: Given that the project is still in the review phase, we consider the possibility of extending the timeline for a period of 1 to 2 months, if required. This would be posted on the timeline page of the project.


An endorsement from community members (especially from outside your community) will be part of the considerations when reviewing your application. Community members are encouraged to endorse your project request here!

  • Welcoming newcomers is something that resonates well with my understanding of the immediate needs of the wikiworld and is also reflected in my opinion of the projects I regularly interact with. The description of the project is clear and the grantees' credentials as people who can definitely deliver results speak very strongly in their favor. I will be glad to see this project funded and will be even more glad to check out the results. Wojciech Pędzich Talk 16:14, 4 May 2022 (UTC)
  • Support - This seems like an interesting research project that addresses important areas in our strategy. It is of special importance for newcomers, to understand their needs better and serve this key stakeholder in our movement. I will be looking forward to seeing this research implemented and to reading its report, which will be interesting without any doubt. -- Anass Sedrati (talk) 12:12, 5 May 2022 (UTC)
  • Support I have been working with Marc on research projects related to community health in the last few years and I can vouch for the breadth of his knowledge of the topic; his capacity to plan comprehensive and interesting research, such as the one presented in this proposal; and his ability to bring it to fruition. Marc also has strong ties with the Wikipedia communities in several languages and he has been collaborating for several years with the relevant people in the WMF who are working on the topic of user growth and editor retention. I believe that this research project can provide very useful insights on the topic. --CristianCantoro (talk) 11:37, 9 May 2022 (UTC)
  • Support I know Marc Miquel from Wikipedia Diversity Observatory. I am very happy to see that he is willing to continue doing research on welcoming newcomers. I personally am very interested in the subject and I think Marc will do a great job. Margott (talk) 19:34, 11 May 2022 (UTC)
  • Support Newcomer topics and research are crucial to our future as a movement. As a frequent trainer of newcomers, any information about it is interesting to me. I look forward to the results of this research. ProtoplasmaKid (talk) 16:53, 13 May 2022 (UTC)
  • Support There is a need to facilitate the collaboration of new Wikipedians. Susanna Giaccai (talk) 14:49, 9 May 2022 (UTC)
  • Support I know Marc and he has done a great job in the past helping the movement. Welcoming newcomers has always been a challenge and I'm sure Marc will do a very good job. Looking forward to the outcome. CEllen (talk) 18:55, 13 May 2022 (UTC)
  • Support Marc has been implementing very relevant action-oriented research projects such as the Community Health Metrics. The Italian Wikipedia has also benefited from his work, commitment, and generosity. Come on Marc, show us some great new work! ;-) --iopensa (talk) 15:34, 19 May 2022 (UTC)
  • Support I have known Marc for quite a few years and he is not just a committed Wikimedian, he is dedicated to finding ways for the movement to welcome newcomers, which are seriously needed and which I personally think are a crucial aspect of our strategy. I am confident Marc will do a great job (as always) and the insights offered by his research will greatly benefit our movement. Maor X (talk) 15:53, 19 May 2022 (UTC)
  • Support This is a very important project, and I look forward to following up. The Wikimedia Language Diversity Hub is going to do interviews with contributors to different small language versions of Wikipedia, and in the interviews we will also talk about technical challenges for new contributors. It would be great to look for ways we could feed relevant findings from our research into this research project! Mali Brødreskift (WMNO) (talk) 06:25, 23 June 2022 (UTC)