User talk:BDavis (WMF)

From Meta, a Wikimedia project coordination wiki

tools forge pip

Dear Bryan,

I want to convert JPG files to PDF for uploading to Wikimedia Commons. The convert command in Linux always breaks down, so I want to use img2pdf (pip install img2pdf). However, pip is not installed. Is there a way to install pip and img2pdf? Do I need superuser rights?--Wmr-bot (talk) 10:37, 5 September 2019 (UTC)

@Wmr-bot: you need to create a Python virtualenv so that you can use pip local to that environment rather than global to the virtual machine you happen to be using. We do not seem to have really great documentation on doing this in the general case, but wikitech:Help:Toolforge/Web#Using_virtualenv_with_webservice_shell covers a specific way to set up a virtualenv that works with Python3 webservices. --BDavis (WMF) (talk) 15:01, 5 September 2019 (UTC)
Thanks. I will try.--維基小霸王 (talk) 00:22, 7 September 2019 (UTC)
It works! --維基小霸王 (talk) 14:26, 7 September 2019 (UTC)
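
For readers following the virtualenv advice above, here is a minimal sketch of the general idea: create an environment inside a directory you own, then run that environment's own pip, which needs no superuser rights. It is not the Toolforge-specific procedure from the wikitech page. The function names and the example path are hypothetical, and the bin/python layout assumes Linux.

```python
import subprocess
import sys
from pathlib import Path


def venv_commands(env_dir, package="img2pdf"):
    """Return the two commands: create the virtualenv, then install into it."""
    env_dir = Path(env_dir)
    env_python = env_dir / "bin" / "python"  # assumption: Linux virtualenv layout
    return [
        # Step 1: create the environment with the standard-library venv module.
        [sys.executable, "-m", "venv", str(env_dir)],
        # Step 2: use the environment's own interpreter to run pip, so the
        # package lands inside env_dir rather than system-wide.
        [str(env_python), "-m", "pip", "install", package],
    ]


def create_env(env_dir, package="img2pdf"):
    """Run both commands in order, stopping at the first failure."""
    for cmd in venv_commands(env_dir, package):
        subprocess.run(cmd, check=True)
```

Calling create_env with a directory under your home (for example, a hypothetical ~/venvs/img2pdf) would build the environment and install img2pdf into it; scripts then need to be started with that environment's python to see the package.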

Although there are 8 cores in the server, it runs much slower than my own laptop. Any idea how to speed it up? I am already using multiprocessing in Python.--維基小霸王 (talk) 12:15, 22 September 2019 (UTC)

@維基小霸王: The 8 cores are shared with a number of other processes (how many depends on where your job is running and what other workloads are running at the same time). Your tool's $HOME is also located on an NFS server that is handling I/O for all other Toolforge tools at the same time. This access is rate-limited on each exec node in an attempt to prevent the NFS server's disks from being monopolized by a single tool. By design, things that need to read/write to $HOME will be slower than on any single-user system like a local laptop.
One of the main things to think about is how to make it so that your tool can do the work it needs to do slowly but continuously, rather than quickly and immediately. That is a more "scalable" way to think about building tools. --BDavis (WMF) (talk) 21:32, 23 September 2019 (UTC)
That presents a challenge for me because I plan to generate PDFs from TBs of JPEGs. I will try.--維基小霸王 (talk) 07:02, 24 September 2019 (UTC)
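
One way to read the "slowly but continuously" advice above is a job that can be stopped and restarted at any point and that pauses between files. The sketch below is only an illustration under that reading, not a recommended Toolforge design: the function names, the pause_seconds and max_files parameters, and the one-PDF-per-JPEG layout are all assumptions. The only img2pdf call used is img2pdf.convert, which returns the PDF as bytes.

```python
import time
from pathlib import Path


def convert_with_img2pdf(jpeg_path):
    """Return PDF bytes for one JPEG (requires `pip install img2pdf`)."""
    import img2pdf  # imported lazily so the rest of the sketch runs without it
    return img2pdf.convert(str(jpeg_path))


def convert_pending(src_dir, dst_dir, convert=convert_with_img2pdf,
                    pause_seconds=1.0, max_files=None):
    """Convert JPEGs that have no PDF yet, pausing between files.

    Returns the list of PDF paths written during this run.
    """
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    dst_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for jpeg in sorted(src_dir.glob("*.jpg")):
        if max_files is not None and len(written) >= max_files:
            break  # leave the rest for the next run
        pdf = dst_dir / (jpeg.stem + ".pdf")
        if pdf.exists():
            continue  # finished in an earlier run, so skip it
        tmp = pdf.with_name(pdf.name + ".part")
        tmp.write_bytes(convert(jpeg))
        tmp.replace(pdf)  # rename last so a killed job leaves no partial PDF
        written.append(pdf)
        time.sleep(pause_seconds)  # throttle to stay gentle on shared I/O
    return written
```

Because finished files are skipped, the same call can be scheduled repeatedly and each run simply picks up where the previous one stopped. Raising or lowering pause_seconds trades throughput against load on the shared file server.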