Community Wishlist/Wishes/Make video2commons subtitle import work again & add subtitles to earlier video imports
Description

Wikimedia Commons as of June 2025 has over 350 000 videos, all integrated into one system where they can be searched and discovered and where they are well organized into categories.
A main way these videos get uploaded is the video2commons tool (over 110 000 videos have been uploaded with it), where one more or less only needs to enter a YouTube URL (instead of, for example, first having to convert the video to WebM with ffmpeg and then manually enter things in the UploadWizard).
However, since around October 2019, subtitle import has not worked. This means that checking "Import subtitles" (see image) doesn't actually import the subtitles into the Commons subtitle pages, called TimedText, even when they are available at the source on YouTube (or one of the other video sites that work with the tool).
- The tool could download the subtitles with yt-dlp and add them to the TimedText pages, which would require some changes to video2commons. For example,
yt-dlp --list-subs url
shows a video's subtitles and
yt-dlp --skip-download --write-subs --sub-langs en --convert-subs srt url
downloads just the English subtitles and converts them to SRT. If conversion is needed as a separate step, it can be done with
ffmpeg -i input.vtt output.srt
or something similar. The main GitHub issue about it is here: https://github.com/toolforge/video2commons/issues/148
- It needs some tool that does this retroactively for videos already imported with video2commons, because many files are now missing subtitles (example on the right). Commands like those above could be used in a script that creates those TimedText pages.
- Doing something like
ffmpeg -i video.webm -map 0:s:0 subtitles.srt
on the Wikimedia Commons server for video files and then pasting the output into TimedText pages (maybe with a needs-checking category) would import the subtitles for those videos that have subtitles embedded in the video file (maybe that would suffice – I don't know if that's the case for most of these videos).
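As a rough illustration of how a retroactive script could tie these commands together, here is a minimal Python sketch. It is not how video2commons works internally; the function names are hypothetical, and it only builds the command lines and the target page title rather than running anything or editing Commons. It assumes yt-dlp and ffmpeg are installed, that SRT is the wanted format, and that TimedText pages are named after the file plus a language code and ".srt".

```python
import shlex


def ytdlp_subtitle_cmd(url: str, lang: str = "en", out_dir: str = ".") -> list[str]:
    """Build a yt-dlp command that fetches only subtitles, converted to SRT."""
    return [
        "yt-dlp",
        "--skip-download",        # do not fetch the video itself
        "--write-subs",           # uploader-provided subtitles, not auto-generated ones
        "--sub-langs", lang,      # which subtitle language(s) to fetch
        "--convert-subs", "srt",  # convert e.g. VTT to SRT (needs ffmpeg)
        "--paths", out_dir,       # directory the subtitle file is written to
        url,
    ]


def embedded_subtitle_cmd(video: str, srt_out: str, stream: int = 0) -> list[str]:
    """Build an ffmpeg command that extracts one embedded subtitle stream."""
    return ["ffmpeg", "-i", video, "-map", f"0:s:{stream}", srt_out]


def timedtext_title(commons_filename: str, lang: str = "en") -> str:
    """Return the TimedText page title for a Commons file (hypothetical helper)."""
    name = commons_filename.removeprefix("File:")
    return f"TimedText:{name}.{lang}.srt"


if __name__ == "__main__":
    # Placeholder URL and file names, for illustration only.
    print(shlex.join(ytdlp_subtitle_cmd("https://example.org/watch?v=ID")))
    print(shlex.join(embedded_subtitle_cmd("video.webm", "subtitles.srt")))
    print(timedtext_title("File:Example.webm"))
```

A real script would pass these lists to subprocess.run, check that a subtitle file was actually produced, and create the TimedText page through the MediaWiki API, ideally adding a needs-checking category as suggested above.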
This could be the biggest current issue in video2commons (prior talk discussions: 1 2 3).
There are some further big issues with the tool that would also be important and useful to address, such as enabling import of video chapters, preventing duplicate uploads, parsing Bandcamp and SoundCloud license tags, automatically adding categories, asking the user to specify the video language, and providing a script to fix the many videos (at least the ones in use) that were imported at low resolution due to an earlier bug.
This problem is a big issue for those who don't understand the language of a video but would benefit from it. For example, English videos are often also used in other-language Wikipedias, and people browsing Commons for videos often get search results or categories that are not subdivided by video language, where such subtitles are needed.
Assigned focus area
Unassigned.
Type of wish
Bug report
Related projects
Wikimedia Commons
Affected users
Commons users and people watching free-licensed videos on Wikipedia who are deaf or for whom the video is not in their language
Other details
- Created: 12:37, 7 June 2025 (UTC)
- Last updated: 12:54, 7 June 2025 (UTC)
- Author: Prototyperspective (talk)