Toolhub/Progress reports/2022-03-18


Report on activities in the Toolhub project for the week ending 2022-03-18.

Release did not go smoothly, but we survived

Tracked in Phabricator:
Task T303889

While attempting to deploy a newer version of Toolhub to Wikimedia's 'staging' cluster, we ran into a fatal error that required a new Python library to be installed. The django-prometheus library, which Toolhub uses to produce some metrics for production reporting, introduced a breaking change in its v2.2.0 release. SemVer (semantic versioning) states that breaking changes are indicated by increasing the major version number, which signals high risk. If they were actually following SemVer, django-prometheus would have labeled this release v3.0.0. As it was, they didn't even bother to list the breaking change in the release notes.
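The SemVer rule described above can be illustrated with a minimal sketch. The helper names `parse` and `bump_kind` are hypothetical and are not part of Toolhub or django-prometheus; the sketch assumes plain `MAJOR.MINOR.PATCH` version strings with an optional leading `v`, and it compares the 2.1.0 release mentioned in this report with the 2.2.0 release.

```python
def parse(version):
    """Split 'MAJOR.MINOR.PATCH' (optional leading 'v') into a tuple of ints."""
    major, minor, patch = version.lstrip("v").split(".")
    return int(major), int(minor), int(patch)


def bump_kind(old, new):
    """Name the most significant version component that differs."""
    old_parts, new_parts = parse(old), parse(new)
    for label, before, after in zip(("major", "minor", "patch"), old_parts, new_parts):
        if before != after:
            return label
    return "none"


# Under SemVer, only a "major" bump announces a breaking change.
print(bump_kind("2.1.0", "v2.2.0"))  # the release as it was actually labeled
print(bump_kind("2.1.0", "v3.0.0"))  # the label SemVer would have required
```

A minor bump is exactly what automated dependency updates tend to accept without review, which is why the mislabeled release was able to reach a deployment.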

After changing the libraries and successfully deploying to the staging cluster, we proceeded to deploy to the 'codfw' and 'eqiad' clusters. The eqiad cluster is the important one for Toolhub: it is where all of our production traffic is actually processed. Unfortunately, we hit T303889 in this environment and did not understand that it was the problem until after we had applied database changes that would be difficult to roll back. At this point https://toolhub.wikimedia.org/ was returning an ugly HTTP 500 error page to all visitors, and had been doing so for nearly an hour.

The most expedient path back to a likely working deployment was to undo the changes from gerrit:770987 and also pin the django-prometheus library to an older version (2.1.0) that worked with the python-memcached library, which was itself known to work with our mcrouter sidecar. Preparing the git commit to roll back gerrit:770987 and add the pinning took more time than Bryan had hoped. Poetry had some failures while trying to install python-memcached, but eventually worked. Approximately 1.5 hours after the broken version of Toolhub was initially deployed to the eqiad cluster, a working version was deployed.
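One way to catch a drifted dependency before it reaches production is to compare installed versions against the expected pins at startup or in CI. The following is a minimal sketch under stated assumptions, not something Toolhub is known to do: the `check_pins` helper and the `PINNED` mapping are hypothetical, and only the django-prometheus 2.1.0 pin comes from this report. It relies on the standard library's `importlib.metadata.version`.

```python
from importlib import metadata

# Hypothetical pin table; only this one entry is taken from the report.
PINNED = {"django-prometheus": "2.1.0"}


def check_pins(pins, get_version=metadata.version):
    """Return (name, expected, found) for each package whose installed
    version differs from its pin; found is None if it is not installed."""
    problems = []
    for name, expected in pins.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None
        if found != expected:
            problems.append((name, expected, found))
    return problems


if __name__ == "__main__":
    for name, expected, found in check_pins(PINNED):
        print(f"{name}: expected {expected}, found {found}")
```

The `get_version` parameter exists only so the check can be exercised without installing anything; in normal use the default lookup reads the metadata of the installed distributions.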

Production release made

A number of feature enhancements and bug fixes were deployed to the production https://toolhub.wikimedia.org site on 2022-03-15. This was the first deployment of the production service since 2022-01-05.

Features added:

Bugs fixed:

Technical debt:

Features Added/In The Works

Tool Lists

Tracked in Phabricator:
Task T301917 resolved
  • Slavina refactored all components and views used to display tool list information so that they extend a single base component.
  • Slavina submitted a patch that makes it easy to edit and delete a ToolList.

Annotations

  • Bryan improved the Annotations feature by adding 14 new fields to annotations and connecting them all to Elasticsearch.
  • Raymond started working on the UI side of annotations, submitting a patch that refactors the tool edit form so that more annotation fields can be added without making the UX worse.

Recent Changes

  • Raymond's Recent Changes feature patch, which unifies content moderation and patrolling of all lists and tools on Toolhub in a single view, has been merged.
  • Raymond submitted a couple of patches aimed at limiting the number of requests made while performing certain content moderation and patrolling actions, such as undoing an edit.

Multi-author support for Toolinfo records

Tracked in Phabricator:
Task T302143 resolved

Raymond and Bryan improved the UI of the multi-author support for toolinfo records by submitting and merging a patch that displays extended author information on the tool detail screen.


Bugs Fixed/Development Improvements

Guard against missing prior values in revision diffs

Tracked in Phabricator:
Task T303657 resolved

Bug fix submitted by Bryan.

Delete list document from search index when list is unpublished

Bug fix submitted by Raymond.