Toolhub/Progress reports/2021-12-10


Report on activities in the Toolhub project for the week ending 2021-12-10.

Better patterns for waiting on user initialization

Tracked in Phabricator:
Task T296263 resolved

While working on improving user notifications related to URL registrations, Raymond discovered a race condition in getting the current user's authentication state. Our frontend application finds the user's authentication state and related information by calling the backend's GET /api/user/ endpoint. We call this API from the created lifecycle hook of the Vue application, cache the data in the browser, and treat that cached data as authoritative until our entire single-page app (SPA) is reloaded.

This covers the most common cases, but some components in the SPA want to check the user's authentication state as soon as they are mounted. Because we support deep linking into the SPA, a component such as the form to register a new toolinfo record can be mounted before the call to fetch the user's state has completed. Using Vue's ability to watch a variable for changes, a component could react to the API call completing, but it could not simply wait for the call before proceeding. Being able to wait on the call in this way makes some patterns much easier to implement.

A blog post by Lukasz Tkacz provided inspiration for letting any component wait until the initial GET /api/user/ request completes before running other code. This has now been done by making getUserInfo(), our wrapper for making the API call, store the initial Promise returned by our makeApiCall() helper function in the vuex store. getUserInfo() returns that same Promise directly on subsequent calls. With these changes in place, any component can register additional code to run only after the user's state has been fetched from the backend server. That might look something like:

this.$store.dispatch( 'user/getUserInfo', { vm: this } )
    .then( ( user ) => { /* Do something dependent on the user's state */ } );

The magic here comes from the call in line 1 of the example returning a Promise that is shared across all callers. If the Promise has already been resolved, the callback in line 2 is scheduled to run right away. If the Promise is still pending (meaning the API call has not returned yet), the callback runs only once the Promise resolves. In neither case does the browser block: the callback is simply deferred until the user's state is available.
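The shared-Promise behavior described above can be sketched in isolation. This is a minimal illustration and not Toolhub's actual code: the makeApiCall() stand-in, the plain state object used in place of the vuex store, the userPromise property, and the is_authenticated field are all assumptions made for the example.

```javascript
// Hypothetical stand-in for the makeApiCall() helper. The real helper
// performs an HTTP request; this one counts calls and resolves after a
// short delay so that the Promise is still pending when callers arrive.
let apiCalls = 0;
function makeApiCall( url ) {
    apiCalls += 1;
    return new Promise( ( resolve ) => {
        // is_authenticated is an assumed field name, for illustration only.
        setTimeout( () => resolve( { url, is_authenticated: true } ), 10 );
    } );
}

// Plain object standing in for the relevant part of the vuex store state.
const state = { userPromise: null };

// Start the request on the first call and hand every later caller the
// very same Promise, whether it is still pending or already resolved.
function getUserInfo() {
    if ( state.userPromise === null ) {
        state.userPromise = makeApiCall( '/api/user/' );
    }
    return state.userPromise;
}

// Two independent callers, such as the app's created hook and a
// deep-linked form component, share one request.
const first = getUserInfo();
const second = getUserInfo();
first.then( ( user ) => console.log( 'first caller:', user.is_authenticated ) );
second.then( ( user ) => console.log( 'second caller:', user.is_authenticated ) );
```

Because the Promise itself is cached rather than only its result, a caller that arrives while the request is in flight needs no special handling: it attaches its callback to the same pending Promise and no second request is made.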

Improved user notifications related to URL registrations

Tracked in Phabricator:
Task T294125 resolved

Magnus originally reported confusing error messages when attempting to register a URL with the crawler as phab:T294125 back in October. Investigating the report showed that there were a number of related problems with notices and error handling in this area of the frontend application. URL registration was one of the earliest features in our Vue app. We have learned many things and invented new local conventions since that code was written, but we had not yet come back to apply that knowledge to URL registration. This has now been corrected: failures in URL registration or deletion produce more useful error messages, and a success notification is shown when things happen as expected.

Delete tool orphan records when toolinfo.json URL is deleted

Tracked in Phabricator:
Task T294370 resolved

Raymond added cleanup code to remove toolinfo records orphaned by deleting the URL they were imported from.

Implement text analysis to support stemming

Tracked in Phabricator:
Bug 276865

Stemming is the process of reducing a word to its root form. This is commonly done when indexing and searching freeform text content to increase the chance of matching a document containing a word form that varies in tense or cardinality from the user's search terms.

Cardinality is an approachable way to think about this complex problem. If a user searches Toolhub for the plural English noun templates, they are probably equally happy to find results where the toolinfo author used the singular English noun template. A savvy user can use wildcards to work around a lack of cardinality stemming in some languages (like English) by searching for template*. This type of workaround, however, is limited to suffix-based variations.

Lucky for us, Elasticsearch comes with tools to help with this problem. We are now configuring a custom analyzer which applies the Porter stemming algorithm when indexing and searching toolinfo records. This should improve searches using English words. As Toolhub improves its support for toolinfo records in other languages, we may need to revisit this choice, because the Porter algorithm is designed for English and other languages need different stemming algorithms.
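For readers unfamiliar with Elasticsearch analyzers, the settings for a custom analyzer of this kind might look roughly like the sketch below. This is not Toolhub's actual configuration: the analyzer name toolinfo_english and the choice of tokenizer are assumptions, and the settings are written as a JavaScript object only to match the rest of this report. The porter_stem and lowercase token filters are built into Elasticsearch, and porter_stem expects lowercased tokens.

```javascript
// Sketch of Elasticsearch index settings for a Porter-stemming analyzer.
// The analyzer name is hypothetical; only the filter names are real
// Elasticsearch built-ins.
const indexSettings = {
    analysis: {
        analyzer: {
            toolinfo_english: {
                type: 'custom',
                // Split text into word tokens.
                tokenizer: 'standard',
                // Lowercase first, then reduce each token to its stem.
                filter: [ 'lowercase', 'porter_stem' ],
            },
        },
    },
};

// The JSON body that would be supplied as the index settings.
console.log( JSON.stringify( { settings: indexSettings }, null, 2 ) );
```

Using the same analyzer at index time and at search time means both the stored text and the user's query are reduced to the same stems, which is how a search for templates can match a record that only contains template.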

The improved search is now available on the demo server. Deploying this change to our production servers is going to take a bit more investigation and planning to find a reasonable way to run the commands needed to update the Elasticsearch index.