Community Tech/Page Curation and New Pages Feed improvements


This page documents a project currently under development by the Wikimedia Foundation's Community Tech team.
We invite you to join the discussion on the talk page.


This project was proposed in the Community Wishlist 2019 and was voted #1 with 157 votes. The Community Tech team has committed to addressing as many of the project goals as possible.
In 2018, the Community Tech and Growth teams worked on a project aimed at general improvement of the AfC process. For more information, see the project page and background research for the project.

Problem statement

The wishlist proposal presents a broad goal of improving the New Page Review process and lists the key Phabricator tickets that are important for the project. These tickets were prioritized and deemed important by the NPP community. They are listed below (in no particular order).

Task title | Phabricator link | Notes
Redirects with RfD tags should still display in the New Pages Feed as 'Nominated for deletion' | task T157046 | Done
'Potential Issues' flagged in Page Curation Toolbar Page Info flyout | task T207847 | Done
Allow filtering by no citations in page curation | task T169120 | Done
Send Message to creator without needing to 'unreview'/'re-review' the article | task T207442 | Done
Page curation adds text to first deletion discussion page if it already exists | task T169441 | Done
Implement addition of un-redirected pages to Special:NewPages and Special:NewPagesFeed | task T92621 |
Redirects converted into articles should appear in the New Pages Feed indexed by the date of creation and creator of the article, not of the redirect | task T157048 | Not fixable
Adding a "Potential COI" alert to the feed | task T207757 | A subset of features is proposed in task T233115
Add "previously deleted" as a possible issue (flagged in red) in the New Pages Feed/Page Curation Tool | task T189929 | Done
Allow filtering by date range in Special:NewPagesFeed | task T167475 | Done
Special:NewPageFeed - add option to filter by pageviews | task T207238 | Alternative proposed
Keyword Search for New Pages Feed | task T207761 |
Enable page curation tools to be loaded on any page (optionally) | task T207485 | In development
Reviewer Notes system in Page Curation Tools: system for reviewers to flag talk page comments on new pages to other reviewers | task T207452 | Done
Tagging Feedback in Page Curation Tools should also be sent to talk page | task T207443 | Done
Page Curation Tools to add userspace CSD Log/PROD Log functionality | task T207237 |
Dragable Corners on Page Curation toolbar windows (for resizing) | task T207439 | Done
Page Curation toolbar: do not mark pages as 'reviewed' when adding CSD and PROD tags | task T208685 | Done
Make PageTriage wiki agnostic | task T50552 | Rejected

Status updates

August 20, 2019

It’s been a few months, and we’re excited to post some updates. We’ve been continuously working on Page Curation & New Pages Feed improvements, and the team has made solid progress. With that in mind, we’ll share some recent news, both highlights and challenges, at this stage in the process. We look forward to your feedback on the Talk page. Thank you!

Work We’ve Completed So Far

We’ve been updating the project page table with the requests marked as “Done.” Before this update, we had 5 requests completed (T189929, T169120, T207439, T208685, and T157046). We now have a few more completed items:

Work That is Almost Complete

Work that Presents Challenges

  • T50552: Make PageTriage wiki agnostic: We’ve discussed this request, and we unanimously feel that it’s beyond our scope. Here’s why (according to analysis from the engineering team): PageTriage, while a useful extension, is written in a way that’s completely based on English Wikipedia processes. To work on other wikis, the extension would need to be adjusted not only for other processes, but also to support a configurable process definition that each wiki could define for itself, based on each community’s needs. Consequently, this request would require a slew of analyses and decisions, such as: what it means to tag an article for deletion (e.g. which pages messages go to, which templates are used, whether there are follow-ups the system should be aware of, etc.), the way we tag articles, which articles show up in the queue, and more. Moreover, we couldn’t easily trim down the scope by disabling some features, because the internal workings of the extension are deeply intertwined with English Wikipedia. We would still need to do a significant amount of development work to ensure that the behavior remained stable and useful to other wikis. For these reasons, this request is unfortunately too big for us to take on.
  • T207238: Special:NewPageFeed - add option to filter by pageviews and the associated spike: T225169: [4 hours] Investigate whether it's efficient to order by tag value (DBA input requested): This work presents significant challenges, but there may be an alternative solution.
    • First, the challenges (according to analysis from the engineering team): In order to filter or sort by numeric values, the numbers must be stored in the database in a specific manner. This first step alone would take several weeks, if not months, according to the estimates provided by Wikimedia database experts. Then, we would need to populate the sortable cells with pageview data, which comes from an external service. To do this, we would need to create a process that pulls the data from the external service and stores it in MediaWiki’s PageTriage table. We would then need to repeat this work regularly over the entire PageTriage database (which consists of tens of thousands of rows, if not more), so that the numbers remain up to date. This process is both uncommon (on MediaWiki servers) and complex; we would need to define this process and identify the correct way to implement it, in collaboration with Operations and Database experts. Overall, we do not consider the request, in its current form, to be within our scope. For more details on the technical analysis and discussion with the database administrators, you can check out the associated investigation ticket.
    • Second, the alternative solution (as described in the T225169 investigation): We could display the number of pageviews in the article record, without allowing for sorting or filtering. Would this be a satisfactory alternative to the community? And, if so, how would you like the number of pageviews displayed (e.g. average per day, median per day, total views in the last 30 days, etc.)? Note that the results displayed will be from 24 hours earlier than the display time, and we’ll want to query from a maximum of 30 days ago (for the sake of general efficiency and manageability of this feature). We do not yet know if we can do this work. If we could, would it be worth our time and effort, in your opinion?
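To make the display options above more concrete, here is a minimal sketch of how the three candidate summaries (average per day, median per day, and total over the last 30 days) could be computed from a list of daily pageview counts. The function name `summarize_pageviews`, its parameters, and the input format are hypothetical and chosen only for illustration; they are not part of PageTriage or of the pageviews service, and fetching the data is deliberately left out.

```python
from statistics import mean, median


def summarize_pageviews(daily_views, window_days=30):
    """Summarize the most recent `window_days` daily pageview counts.

    `daily_views` is assumed to be a list of integers ordered from
    oldest to newest, one entry per day (hypothetical input format).
    """
    # Keep only the most recent window, mirroring the 30-day maximum above.
    recent = daily_views[-window_days:]
    if not recent:
        return {"total": 0, "average_per_day": 0.0, "median_per_day": 0.0}
    return {
        "total": sum(recent),
        "average_per_day": mean(recent),
        "median_per_day": median(recent),
    }
```

A median is less sensitive than an average to a single day with a traffic spike, which may matter when reviewers use the number to judge sustained interest in a new page.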

Requesting Feedback

We want to know your thoughts. Please share them on the Talk page. Thank you!

April 30, 2019

The Community Tech team has kicked off development work on this project. You may follow progress on the project tickets by looking at the Phabricator board. I will also be updating the ticket status in the table above as things progress.

5 February, 2019

This project is in the early stages of research to investigate project goals, dependencies, and potential roadblocks. Your feedback is welcome on the talk page.

12 March, 2019

We are beginning to assess the technical feasibility of the tickets prioritised in the wishlist proposal. The technical work on this project is slated to start in late April/early May.

Important links