Toolhub/Progress reports/2021-03-12

From Meta, a Wikimedia project coordination wiki

Report on activities in the Toolhub project for the week ending 2021-03-12.

Faceted search

Tracked in Phabricator:
Task T195680 resolved

Bryan and Srishti did some pair programming to resolve the remaining visual consistency questions for the initial search UI. With this patch merged, we now have a fully functional search pipeline that indexes each toolinfo record as it changes in the database, an API for exploring the indexed toolinfo documents, and a responsive UI for calling the API and viewing the results.

The current implementation computes facet information for these toolinfo fields with each search:

  • tool_type
  • for_wikis
  • author
  • license
  • ui_language
  • keywords

The API exposes a corresponding pair of term and null filters for each facet, which makes it easier to refine a search using the facet data. We expect to add more facets in the future as the data associated with a toolinfo record expands.
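As an illustration only, the facet and filter pairing described above could be expressed against Elasticsearch's query DSL roughly as follows. This is a minimal sketch, not the Toolhub implementation: the helper name `build_facet_query`, its parameters, and the `_missing` aggregation naming are hypothetical, and it assumes each facet field is mapped so that it can be aggregated (for example as a keyword field). Only the field names come from the list above.

```python
from typing import Any, Dict, List, Optional

# Facet fields named in this report.
FACET_FIELDS = ["tool_type", "for_wikis", "author", "license", "ui_language", "keywords"]


def build_facet_query(
    text: str,
    term_filters: Optional[Dict[str, str]] = None,
    null_filters: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Build an Elasticsearch search body with facets (hypothetical helper)."""
    filters: List[Dict[str, Any]] = []
    for field, value in (term_filters or {}).items():
        # Term filter: keep only documents where the field equals the value.
        filters.append({"term": {field: value}})
    for field in null_filters or []:
        # Null filter: keep only documents where the field has no value.
        filters.append({"bool": {"must_not": {"exists": {"field": field}}}})

    aggs: Dict[str, Any] = {}
    for field in FACET_FIELDS:
        # Bucket counts for each distinct value of the field.
        aggs[field] = {"terms": {"field": field}}
        # Count of documents that lack the field entirely.
        aggs[f"{field}_missing"] = {"missing": {"field": field}}

    return {
        "query": {
            "bool": {
                "must": [{"query_string": {"query": text}}],
                "filter": filters,
            }
        },
        "aggs": aggs,
    }


# Example: search for "citation", restricted to one tool type,
# and only among records that have no license recorded.
body = build_facet_query(
    "citation",
    term_filters={"tool_type": "web app"},
    null_filters=["license"],
)
```

Placing the refinements in the `filter` clause rather than in `must` keeps them from affecting relevance scoring, which is the usual choice for facet-driven narrowing.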

Faceted search was one of the three named goals for our work on Toolhub in the January-March quarter.

Direct tool registration

Srishti has a work-in-progress patch for editing a full toolinfo record. She and Bryan spent some time discussing it this week and are working together to find nicer patterns for building complex forms and connecting them to the backend API.

New and improved demo server

The addition of faceted search meant that our demo server needed some attention. The initial demo server was a single Docker container using an ephemeral SQLite database. This made things very simple on the server itself, but also meant that testing data was lost each time the demo container was restarted. Adding the Elasticsearch backend dependency meant that we needed to make some changes, at the very least to provision a working single-node Elasticsearch cluster.

We have built a new and improved demo stack on toolhub-demo01.toolhub.eqiad1.wikimedia.cloud and connected it to the https://toolhub-demo.wmcloud.org/ proxy. The new system uses docker-compose to manage containers for the Toolhub runtime, MariaDB 10.4, and Elasticsearch 6.8.14. A custom systemd unit starts and stops the docker-compose managed collection of Docker containers. Additionally, a Makefile provides convenient commands for manually triggering a restart, updating the Toolhub container image, and watching logs.

Wrap up

We have search! With this functionality in the core product we have reached and exceeded feature parity with Hay's Directory. Bryan has been doing some comparison searches using the two tools and found that the results are comparable. In some instances Hay's Directory will return more results than Toolhub, but thus far each of these extra results has proven to be information for a tool that is no longer active on Toolforge. This also marks completion of the second of three named goals for our January-March quarter.

Work on direct tool registration is likely to continue through the end of March, and possibly into the early part of April. We will spend some time in the coming weeks prioritizing the planned work remaining to reach our "1.0" feature targets.