Toolhub/Progress reports/2021-11-12

From Meta, a Wikimedia project coordination wiki

Report on activities in the Toolhub project for the week ending 2021-11-12.

Improvements to development environment

Tracked in Phabricator:
Task T295318 resolved

The docker-compose based development environment that we first designed back in August 2020 turned out to depend accidentally on behavior specific to Docker Desktop for Mac. Specifically, Raymond found that the way we mounted the local git clone of Toolhub into the container did not allow the container to write to the mounted volume on a Linux host. The root problem was file permissions: the "somebody" user that we ran inside the container was not allowed to change files owned by the developer outside of the container. Docker Desktop for Mac somehow abstracts this problem away in its volume driver, but the Linux volume driver has no facility for remapping users when crossing the volume boundary.

The general fix for this in the docker-compose world is to use the user runtime setting to tell the container to run with an effective user id and group id matching those of the developer on the external host. This fix is widely used and is even documented for MediaWiki Docker development. Making this change to the Toolhub Docker setup was simple enough, but it turned out not to solve all of our problems.
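As a rough illustration of the idea, and not Toolhub's actual configuration, the value handed to docker-compose's user setting is simply the host's numeric ids joined as "uid:gid". The Python sketch below builds that string; the helper name compose_user_value is hypothetical, and how the value reaches the compose file (for example through an environment variable) is left as an assumption.

```python
import os


def compose_user_value(uid=None, gid=None):
    """Return a "uid:gid" string suitable for docker-compose's `user` setting.

    When no ids are given, fall back to the ids of the current process.
    os.getuid() and os.getgid() are only available on POSIX hosts.
    """
    if uid is None:
        uid = os.getuid()
    if gid is None:
        gid = os.getgid()
    return f"{uid}:{gid}"


if __name__ == "__main__":
    # Print the value for the developer running this script on the host.
    print(compose_user_value())
```

Running the container with the same numeric ids as the developer means files written into the mounted git clone are owned by the developer, which is what the Linux volume driver needs in the absence of any user remapping.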

In development mode, Toolhub has multiple containers to host different parts of the application. One of these containers is a Node.js runtime which hosts a "live" version of the Vue frontend application. This development container shares a base with the container used as an internal build step for the production Toolhub deployment. That shared base runs npm install to install the npm packages needed for compiling the Vue application. This is run as the "somebody" user that we are now replacing at development runtime with the uid and gid of the developer. That replacement changes the effective user inside the container, but does not replace the HOME directory of the user. As a result we ended up with another file system ownership and permission mismatch, this time inside the container itself rather than caused by an external mount. After many false starts and a long rubber duck debugging session on IRC, Bryan was finally able to find a workaround: dynamically creating a new HOME as the first step in the runtime process so that file permissions work as expected. Thanks to Ahmon Dancy of the Foundation's Release Engineering team for acting as the rubber duck in this debugging!
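The report does not say how the new HOME is created, so the following Python sketch only illustrates the general shape of such a workaround under stated assumptions: if the current HOME is missing or not writable by the effective user, create a fresh directory and point HOME at it before starting the real process. The function name ensure_writable_home and the use of a temporary directory are hypothetical choices for illustration.

```python
import os
import tempfile


def ensure_writable_home(env):
    """Return a copy of `env` whose HOME points at a writable directory.

    If the existing HOME is a directory the current user can write to,
    the environment is returned unchanged. Otherwise a new directory is
    created (here via tempfile.mkdtemp, an assumption) and used as HOME.
    """
    env = dict(env)
    home = env.get("HOME", "")
    if home and os.path.isdir(home) and os.access(home, os.W_OK):
        return env
    env["HOME"] = tempfile.mkdtemp(prefix="home-")
    return env


if __name__ == "__main__":
    # First step of a hypothetical container entrypoint: fix HOME, then
    # hand the adjusted environment to whatever command runs next.
    fixed = ensure_writable_home(os.environ)
    print(fixed["HOME"])
```

The point of doing this at runtime rather than at image build time is that the uid and gid are only known once the developer starts the container.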

A few other changes have also been made to keep the development environment compatible with the version of docker-compose that is available via Debian's apt repositories. Being cross-platform can be tricky work, but investing in making Toolhub as simple as possible to get running for local development is important to the project. We very much want code contributions from the Wikimedia technical community, and it is worth removing any barriers to contributing that we reasonably can.

Auto-complete/lookahead search for adding tools

Raymond has started work on making list building easier by adding a custom search widget for finding the tools to add to a list.

Metrics about tools that Toolhub can surface

Team discussion this week included a first look at metrics about a particular tool that Toolhub may be able to collect and show to users. This early discussion focused on Bryan sharing things that he has already thought about collecting. The first thing to realize is that we are cataloging different types of tools and that there will be variations in what we can and cannot do based in part on type. The types currently supported by our data model via the tool_type toolinfo attribute are: web app, desktop app, bot, gadget, user script, command line tool, coding framework, other. Where a tool is hosted will also have some effect on what metrics we can collect. Things that are on-wiki or run from Toolforge or Cloud VPS projects will offer different opportunities for collecting empirical data than things which run elsewhere.

At this point Bryan thinks that we should be able to calculate rough numbers for some things:

  • Number of users with a given user script in their common.js configuration. See work currently done by SDZeroBot for the enwiki community for inspiration.
  • Number of users with a given gadget enabled on a wiki as reported by Special:GadgetUsage.
  • Daily "hits" (successful HTTP responses) as recorded by toolforge:toolviews/ for Toolhub web services.
  • Number of authorizing users for an OAuth application.
  • Number of edits attributed to an OAuth application via OAuth CID: ... edit tags.
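To make the first item in the list above more concrete, here is a minimal Python sketch of counting users who load a given user script. It assumes the text of each user's common.js has already been fetched into a dictionary, and it uses a naive substring match on the script's page title while skipping lines commented out with //. The function name count_script_users and the page titles are hypothetical; this is not how SDZeroBot or Toolhub does it, and a real counter would also need to handle block comments, URL encoding, and redirects.

```python
def _normalize(text):
    """Treat underscores and spaces as equivalent, as in wiki page titles."""
    return text.replace("_", " ")


def count_script_users(common_js_by_user, script_title):
    """Count users whose common.js mentions `script_title` on an active line.

    `common_js_by_user` maps a user name to the source text of that user's
    common.js page. Lines starting with "//" are treated as disabled.
    """
    needle = _normalize(script_title)
    count = 0
    for source in common_js_by_user.values():
        active_lines = [
            line
            for line in source.splitlines()
            if not line.lstrip().startswith("//")
        ]
        if any(needle in _normalize(line) for line in active_lines):
            count += 1
    return count


if __name__ == "__main__":
    # Invented sample pages, for illustration only.
    pages = {
        "Alice": "importScript('User:Example/tool.js');",
        "Bob": "// importScript('User:Example/tool.js');",
    }
    print(count_script_users(pages, "User:Example/tool.js"))
```

Counts produced this way are rough by design, which matches the "rough numbers" framing of the list.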

Wrap up

Unblocking developers using Linux hosts to contribute to Toolhub and getting a first patch into Gerrit from our newest team member were nice accomplishments for the week. Knowledge sharing across the team with discussions like the one on tool metrics is something that we plan to do more of in the coming weeks and months. This is especially important as the team grows so that we can all be more informed in the work that we do and the solutions that we propose. Speaking of growing, next week we will be adding another developer to the team! We will save the reveal of who for next week's report, but this will complete the currently planned growth of the team and let us start thinking about things like updated team practices to improve communication and review of our work.