Toolhub/Progress reports/2021-03-05

From Meta, a Wikimedia project coordination wiki

Report on activities in the Toolhub project for the week ending 2021-03-05.

Direct tool registration

Tracked in Phabricator:
Task T195682

Srishti's patch "User interface for creating a toolinfo record" is merged! She is now ready to start implementing editing of existing toolinfo records, which will expand the data that can be managed through our UI to include the complete set of toolinfo attributes.

Faceted search

Srishti has done an initial review of Bryan's search UI patch. Bryan will follow up on that feedback, and if things go well we have a good chance of having fully functional search merged by next week's report.

Two "volunteer" patches merged!

Patches from ערן (gerrit:657951) and Reedy (WMF) (gerrit:668518) were merged this week. Bryan thought these were the first code contributions from outside the core dev team, but when checking to confirm he was reminded that Amire80 contributed gerrit:651912 in December 2020. In any case, we have volunteer contributions and that is a great thing.

Crawler tweaks

gerrit:668248 updated core logic for the crawler to improve search results and help ingest more external records.

We are now converting all free form keywords found in the toolinfo.json data to lowercase. This has been done to normalize the various uses of "wikipedia"/"Wikipedia" and similar trivial case-based differences in this data. This free form keyword functionality is something that we hope to replace with structured vocabularies in the future, but for now we are trying to make the best use we can of the data from the legacy Hay's Directory toolinfo.json standard.
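As a rough illustration of the keyword normalization described above, here is a minimal Python sketch. It is not the code from gerrit:668248; the function name `normalize_keywords` is hypothetical, and trimming whitespace and dropping duplicates are extra assumptions beyond the lowercasing that the report mentions.

```python
def normalize_keywords(keywords):
    """Lowercase free form keywords so case variants collapse together.

    Assumption: `keywords` is already a list of strings. Whitespace
    trimming and de-duplication are illustrative extras, not something
    the report says the crawler does.
    """
    seen = set()
    normalized = []
    for keyword in keywords:
        cleaned = keyword.strip().lower()
        # Skip empty strings and anything we have already kept.
        if cleaned and cleaned not in seen:
            seen.add(cleaned)
            normalized.append(cleaned)
    return normalized
```

With this sketch, `normalize_keywords(["Wikipedia", "wikipedia", "Commons"])` returns `["wikipedia", "commons"]`, so a search for either spelling finds the same tools.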

We are also now stripping out any unexpected fields that might be present in a toolinfo.json file that we crawl. We have a JSON Schema to validate against, but in practice we are currently more concerned about getting data into the system than we are about enforcing strict validation. The need for this stripping is obvious in retrospect, but its addition was prompted by testing loading of all of the source inputs currently crawled by Hay's Directory.
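One way to picture the field stripping is the sketch below. It assumes the expected field names can be read from the top-level `properties` key of the JSON Schema; the function name `strip_unexpected_fields` and the sample schema and record are hypothetical and are not taken from the Toolhub crawler.

```python
def strip_unexpected_fields(record, schema):
    """Return a copy of `record` keeping only fields the schema declares.

    Assumption: `schema` is a JSON Schema object whose top-level
    "properties" mapping lists every field we are willing to ingest.
    """
    allowed = set(schema.get("properties", {}))
    return {key: value for key, value in record.items() if key in allowed}


# Hypothetical, heavily trimmed schema and crawled record for illustration.
example_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "title": {"type": "string"},
        "keywords": {"type": "string"},
    },
}
example_record = {
    "name": "example-tool",
    "title": "Example tool",
    "favourite_colour": "green",  # not declared in the schema, so dropped
}
cleaned_record = strip_unexpected_fields(example_record, example_schema)
```

Dropping unknown fields rather than rejecting the whole record matches the stated priority of getting data into the system over enforcing strict validation.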

We are likely to spend some more time before launch on additional data cleaning to make the best use we can of the current data.

End user documentation

Bryan joined SRodlund (WMF) and APaskulin (WMF)'s fortnightly "Documentation office hour" meeting to talk about ideas for organizing end user documentation. Alex shared mw:Selenium as a help landing page that she liked. No new pages have been created on meta yet for this, but soon!

Production deployment preparation

Our planned June 2021 production deployment is still several months away, but we are trying to think about what features the application needs, in addition to the end user facing features, to support that launch. This week Bryan spent a little time looking into the task "Add /healthz system status check endpoint", a feature needed for better Kubernetes integration. After chatting with several members of the Foundation's SRE team, this particular feature looks to be much simpler than Bryan was trying to make it. :)
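To give a sense of how small such an endpoint can be, here is a minimal sketch written as a plain WSGI callable using only the Python standard library. It is not Toolhub's implementation; the name `healthz_app` and the `{"status": "OK"}` response body are assumptions made for illustration.

```python
import json


def healthz_app(environ, start_response):
    """Tiny WSGI app that answers a status check at /healthz.

    Assumption: a 200 response with a small JSON body is enough for a
    Kubernetes probe to treat the process as alive.
    """
    if environ.get("PATH_INFO") == "/healthz":
        status = "200 OK"
        body = json.dumps({"status": "OK"}).encode("utf-8")
        content_type = "application/json"
    else:
        status = "404 Not Found"
        body = b"Not Found"
        content_type = "text/plain"
    start_response(status, [
        ("Content-Type", content_type),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

A check like this only confirms that the process can answer HTTP requests. Whether it should also probe the database or search backend is a design choice that the report does not settle.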

Wrap up

With toolinfo creation now possible through the UI and search nearing completion, things are looking pretty good for our January-March goals. There is still quite a bit of UI work to be done to implement screens for editing a full toolinfo record, viewing toolinfo edit history and diffs, and performing reverts and undos. We hope that this is manageable in our remaining time. It is hard to say today if we will have 100% of this work done by the last day of March, but if not, it seems likely that we would complete the stragglers within the first weeks of April.