Learning patterns/Measuring a WikiProject's community and activity
What problem does this solve?
On larger wikis, especially the English Wikipedia, it may be more effective to work with smaller working groups than with the entire website. On the English Wikipedia and other wikis there are WikiProjects: working groups dedicated to a certain subject area or task. However, looking at the WikiProject page is not necessarily a good indicator of the health of the community, and posting to the WikiProject's talk page may not get you the feedback you want. To more intelligently assess the size of a subject-area community and to determine who is in it, you will need to collect some metrics.
What is the solution?
WikiProject X, in its research on the English Wikipedia, came up with these metrics for WikiProjects:
- Number of pages in the WikiProject's scope
- Number of editors active in the WikiProject's scope (i.e. pages that are tagged by the WikiProject, plus talk pages)
- Number of editors active in the WikiProject's space (i.e. the WikiProject page, its talk page, and any subpage)
- Percentage of those active in the WikiProject's scope who are also active in the WikiProject's space—this is called the "engagement rate," since it measures how engaged Wikipedia editors are on the WikiProject
- Percentage of total editors active in the WikiProject's scope who are active in the scope. The higher the percentage, the more edits are made by repeat editors, suggesting the presence of a highly dedicated niche community. The lower the percentage, the more edits made by "drive-by" editors.
To make these measurements, you will want to set a cutoff to determine what makes someone active. For activity in the project scope, we define an active editor as making 5 edits to articles and/or talk pages in a rolling 30 day period. For activity in the project's space, we define an active editor as making 2 edits to the project space in a rolling 90 day period. You may want to adjust these numbers depending on your needs.
There is a WikiProject directory under development that will surface all of these metrics (except for engagement rate) for WikiProjects on the English Wikipedia. If you are interested in measuring activity for WikiProjects outside of the English Wikipedia, or if you want to measure subject-area activity for something other than a WikiProject (such as an arbitrary category of articles), the queries used are below. You could run these queries through a querying tool like Quarry.
Count of articles within the WikiProject scope:
SELECT count(page_title) FROM templatelinks LEFT JOIN page ON page_id = tl_from WHERE tl_namespace = 10 AND tl_title = "WIKIPROJECT_BANNER_TEMPLATE";
Count of edits to the WikiProject space:
SELECT COUNT(*) FROM revision_userindex LEFT JOIN page ON page_id = rev_page WHERE (page_namespace = 0 OR page_namespace = 1) AND page_title IN (SELECT page_title FROM templatelinks LEFT JOIN page ON page_id = tl_from WHERE tl_namespace = 10 AND tl_title = "WIKIPROJECT_BANNER_TEMPLATE") AND rev_timestamp > START_TIMESTAMP AND rev_timestamp < END_TIMESTAMP;
List of editors and edit count for edits to the project scope where editors made at least 5 edits:
SELECT rev_user_text,COUNT(*) FROM revision_userindex LEFT JOIN page ON page_id = rev_page WHERE (page_namespace = 0 OR page_namespace = 1) AND page_title IN (SELECT page_title FROM templatelinks LEFT JOIN page ON page_id = tl_from WHERE tl_namespace = 10 AND tl_title = "WIKIPROJECT_BANNER_TEMPLATE") AND rev_timestamp > START_TIMESTAMP AND rev_timestamp < END_TIMESTAMP GROUP BY rev_user_text HAVING count(*) >= 5 ORDER BY COUNT(*) DESC;
List of editors and edit count for edits to the project space where editors made at least 2 edits:
select rev_user_text,count(*) from page left join revision on page_id = rev_page where (page_namespace = 4 OR page_namespace = 5) and (page_title like "WIKIPROJECT_NAME/" OR page_title = "WIKIPROJECT_NAME") and rev_timestamp > START_TIMESTAMP and rev_timestamp < END_TIMESTAMP group by rev_user_text HAVING count(*) > 1 ORDER BY COUNT(*) DESC;
Things to consider
- The approach taken by this learning pattern assumes that you are looking for pages based on their being tagged by a WikiProject. You may want to, instead, do a query based on an arbitrary category of articles. For this approach, you will need to make sure you are including all the sub-categories of that category.
- When you find this information, don't keep it to yourself! Sharing information about who is editing what articles can be helpful for other editors. Of course, there should be an opt-out mechanism for people who do not want to be on your lists.
When to use
- WikiProject X project midpoint report
- Research projects that use these (or similar) metrics