OpenSpeaks/Tools/Subtitler

| OpenSpeaks Subtitler | |
|---|---|
| Status | Beta (active development) |
| URL | subtitler.toolforge.org |
| Source code | gitlab.wikimedia.org |
| Licence | MIT |
| Platform | Wikimedia Toolforge |
| Designed by | Ranjith S |
| Part of | OpenSpeaks Tools |
| Discussion | Talk page |
OpenSpeaks Subtitler is a free and open source web tool for creating and editing subtitles for audio and video files, developed primarily for oral history and language documentation. It is hosted on Wikimedia Toolforge and allows downloading audio/video files from Wikimedia Commons as well as downloading, editing and uploading Commons' TimedText subtitles.
The tool is designed by Ranjith S for OpenSpeaks.
Background
Subtitling oral history and language documentation recordings needs a lot of time, labour and technical knowledge. Most subtitling tools are for broadcasting and work well with stable, fast internet connectivity. Most community documenters do not have fast internet connectivity during fieldwork. The need for a Wikimedia-native, offline-capable subtitling tool was identified during an earlier Rapid Grant project and documented in a research paper.
An alpha prototype was first published in late 2025. Ranjith S joined to lead development in early 2026, and a working beta version was published in March 2026 at the permanent URL subtitler.toolforge.org.
Purpose
OpenSpeaks Subtitler lets archivists, language documenters, and Wikimedians:
- Create subtitles for audio/video files stored on your own device or on Wikimedia Commons
- Use automated silence detection to generate draft subtitle segments quickly
- Edit timings and text in a visual, waveform-based interface
- Export finished subtitles in SRT or VTT (standard web subtitling formats) with correct UTF-8 encoding
- Upload subtitle files directly as TimedText to Wikimedia Commons without leaving the tool
It is particularly suited to low-resourced language content, where no automatic speech recognition is available and manual transcription is the primary workflow.
Features
Loading audio/video
- Upload local video or audio files (MP4, MOV, MKV, MP3, WAV, OGG, and other common formats).
- Search for and load a file directly from Wikimedia Commons by filename. Existing subtitles are fetched as well: select the correct language from the dropdown (or type it) and press "Load".
- Choose between different available resolutions/transcodes for Commons videos to suit available bandwidth.
Marking subtitle parts (optional)
You can manually mark every segment or use the "Generate" option to automatically split the spoken portion into draft subtitle chunks. Three parameters are adjustable:
- Silence threshold (in dB): how quiet a passage needs to be to count as a gap.
- Minimum silence length (in milliseconds): shortest pause that will trigger a split.
- End buffer (in milliseconds): extra time added to the end of each generated segment.
This provides a first draft for quick correction. You can also manually mark the beginning and end of each subtitle part if you prefer.
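The interaction of the three parameters can be illustrated with a minimal Python sketch. This is not the tool's own code (Subtitler runs its detection in the browser through FFmpeg.wasm); the function name `split_on_silence`, the per-frame loudness input, and the default values are all assumptions chosen for illustration.

```python
from typing import List, Optional, Tuple


def split_on_silence(
    levels_db: List[float],
    frame_ms: int,
    silence_threshold_db: float = -35.0,  # assumed default, not the tool's
    min_silence_ms: int = 500,            # assumed default, not the tool's
    end_buffer_ms: int = 100,             # assumed default, not the tool's
) -> List[Tuple[int, int]]:
    """Return (start_ms, end_ms) speech segments from per-frame loudness values."""
    total_ms = len(levels_db) * frame_ms
    segments: List[Tuple[int, int]] = []
    seg_start: Optional[int] = None      # start of the open speech segment
    silence_start: Optional[int] = None  # start of the current quiet run

    for i, level in enumerate(levels_db):
        t = i * frame_ms
        if level < silence_threshold_db:
            if silence_start is None:
                silence_start = t
            quiet_for = t + frame_ms - silence_start
            # Close the open segment once the pause is long enough.
            if seg_start is not None and quiet_for >= min_silence_ms:
                end = min(silence_start + end_buffer_ms, total_ms)
                segments.append((seg_start, end))
                seg_start = None
        else:
            # Speech resumed: a shorter pause never triggers a split.
            silence_start = None
            if seg_start is None:
                seg_start = t

    if seg_start is not None:
        # The recording ended before a long enough pause was seen.
        segments.append((seg_start, total_ms))
    return segments
```

In this sketch a lower (more negative) threshold treats fewer passages as gaps, a longer minimum silence produces fewer and longer segments, and the end buffer keeps the tail of a word from being clipped. If the end buffer is set longer than the minimum silence length, neighbouring segments can overlap.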
Editing interface
The editor shows a waveform timeline right below the video:
- Waveform: you can see where the audio is loud and where it is quiet. Quietest parts indicate pauses—sentences are often broken here. Existing subtitle regions are shown in green.
- Controls: you can click on "-" to zoom out, "FIT" to see the entire duration and "+" to zoom in.
- Subtitle cards: each subtitle segment appears as a card showing start time, end time, duration, and editable text. Clicking a timestamp jumps the playhead to that moment.
- On-video text overlay: subtitle text is shown live on the video/audio while it plays.
- Loop mode: enable looping on any subtitle card to replay that segment repeatedly while you write the subtitle; a visible icon shows when a loop is running, and it can be cancelled by pressing L.
- Real-time sync: text changes are saved instantly to the local project state; no manual save step is needed.
Keyboard shortcuts
TBD
Language selection
You can type or select from a dropdown of the full ISO 639 language list to tag the subtitle file's language before export. This metadata is preserved in the SRT/VTT file and used when uploading to Commons.
Export
- Export as SRT or VTT to your computer, or upload to Wikimedia Commons as TimedText.
- All output is UTF-8 encoded, supporting scripts across all languages.
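For readers unfamiliar with the two formats, the following Python sketch shows how the same cues serialise to SRT and WebVTT. It is an illustration only, not the tool's export code; the names `Cue`, `format_timestamp`, `to_srt`, `to_vtt` and `write_utf8` are hypothetical.

```python
from typing import List, Tuple

Cue = Tuple[int, int, str]  # (start_ms, end_ms, text)


def format_timestamp(ms: int, decimal_mark: str) -> str:
    """Format milliseconds as HH:MM:SS<mark>mmm."""
    hours, rest = divmod(ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}{decimal_mark}{millis:03d}"


def to_srt(cues: List[Cue]) -> str:
    """SRT: numbered blocks, comma before the milliseconds."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        timing = f"{format_timestamp(start, ',')} --> {format_timestamp(end, ',')}"
        blocks.append(f"{index}\n{timing}\n{text}\n")
    return "\n".join(blocks)


def to_vtt(cues: List[Cue]) -> str:
    """WebVTT: 'WEBVTT' header line, full stop before the milliseconds."""
    blocks = ["WEBVTT\n"]
    for start, end, text in cues:
        timing = f"{format_timestamp(start, '.')} --> {format_timestamp(end, '.')}"
        blocks.append(f"{timing}\n{text}\n")
    return "\n".join(blocks)


def write_utf8(path: str, content: str) -> None:
    """Write the file explicitly as UTF-8 so non-Latin scripts survive."""
    with open(path, "w", encoding="utf-8", newline="\n") as handle:
        handle.write(content)
```

The visible differences between the formats are small: SRT numbers each cue and uses a comma as the decimal mark, while WebVTT starts with a `WEBVTT` header and uses a full stop. Passing the encoding explicitly avoids depending on the platform default.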
Wikimedia Commons sync
- Log in with your Wikimedia account via OAuth 2.0; the tool does not store a separate password.
- Once logged in, pull audio/video as well as subtitles into the tool, create new/edit existing subtitles, and upload finished subtitles directly to Commons.
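On Commons, subtitles for a media file live on a page in the TimedText namespace named after the file plus a language code, for example `TimedText:Example interview.webm.or.srt`. The Python sketch below builds such a title and a URL for reading the page's raw text. The helper names and the example filename are hypothetical, and how Subtitler itself fetches and uploads subtitles is not described here (it may well use the MediaWiki Action API instead).

```python
from urllib.parse import quote

COMMONS_INDEX = "https://commons.wikimedia.org/w/index.php"


def timedtext_title(filename: str, lang: str) -> str:
    """Build the TimedText page title for a Commons file and language code."""
    name = filename.strip()
    if name.lower().startswith("file:"):
        name = name[len("file:"):]
    # MediaWiki treats underscores and spaces in titles as equivalent.
    name = name.strip().replace("_", " ")
    return f"TimedText:{name}.{lang}.srt"


def raw_url(title: str) -> str:
    """URL that returns the page's wikitext (here: the SRT content)."""
    encoded = quote(title.replace(" ", "_"), safe=":")
    return f"{COMMONS_INDEX}?title={encoded}&action=raw"
```

For instance, `timedtext_title("File:Example_interview.webm", "or")` gives the title shown above, and `raw_url` turns it into a link that can be opened in a browser to check whether subtitles already exist for that language.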
Light and dark mode
The interface supports a full light/dark mode toggle, designed to reduce eye strain during long editing sessions.
Technical architecture
| Layer | Technology | Role |
|---|---|---|
| Frontend | Vue 3 + Vite + Tailwind CSS | Reactive UI |
| State management | Pinia | Subtitle, auth, and theme state |
| Audio/video processing | FFmpeg.wasm (Web Worker) | Client-side silence detection; runs off the main thread to keep the UI responsive |
| Backend | Flask (Python) | REST API and static file serving |
| Authentication | Authlib (OAuth 2.0) | Wikimedia login |
| Database | MariaDB + SQLAlchemy | User sessions and project metadata |
| Deployment | Wikimedia Toolforge | Hosting |
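As background on the silence-detection layer: FFmpeg ships a `silencedetect` audio filter that logs lines containing `silence_start: <seconds>` and `silence_end: <seconds> | silence_duration: <seconds>`. Whether Subtitler uses this exact filter inside FFmpeg.wasm is an assumption; the Python sketch below only shows how such log output could be turned into silence intervals, with `parse_silences` as a hypothetical helper.

```python
import re
from typing import List, Optional, Tuple

# A command such as the following produces the log lines parsed here:
#   ffmpeg -i input.wav -af silencedetect=noise=-35dB:d=0.5 -f null -
_START = re.compile(r"silence_start:\s*(-?\d+(?:\.\d+)?)")
_END = re.compile(r"silence_end:\s*(-?\d+(?:\.\d+)?)")


def parse_silences(log: str) -> List[Tuple[float, float]]:
    """Extract (start_s, end_s) silence intervals from silencedetect log text."""
    silences: List[Tuple[float, float]] = []
    start: Optional[float] = None
    for line in log.splitlines():
        match = _START.search(line)
        if match:
            start = float(match.group(1))
            continue
        match = _END.search(line)
        if match and start is not None:
            silences.append((start, float(match.group(1))))
            start = None
    # A trailing silence_start without a matching silence_end is ignored.
    return silences
```

The gaps between consecutive silence intervals are the candidate subtitle segments. In the filter options, `noise` corresponds to a silence threshold and `d` to a minimum silence duration in seconds.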
Using the tool
- Open subtitler.toolforge.org in any modern browser (Chromium-based browsers are recommended for full feature support).
- Load your media: either upload a local file or enter a Wikimedia Commons filename to load it directly.
- Generate a draft: press "Generate" to use the default settings (these work for most speech), or press the gear icon to adjust the settings to suit your recording, and generate initial subtitle segments.
- Edit: work through each subtitle segment, typing the transcription. Adjust start/end times using the waveform if needed.
- Select language: choose the correct ISO 639 language code from the dropdown.
- Export or upload: download the SRT/VTT file locally, or log in with Wikimedia and upload directly to Commons.
Roadmap
The following features are planned for future releases:
- Multi-track subtitle support (e.g., original language + translation on the same file)
- AI-powered speech-to-text integration for draft generation
- Direct export to Commons with full file metadata fields
- Collaborative real-time editing
Get involved
- Testing and feedback
- Try the tool at subtitler.toolforge.org and share your experience on the Discussion page: what worked, what did not, and what you needed that was missing.
- Bug reports and feature requests
- File issues directly on the GitLab repository: gitlab.wikimedia.org/toolforge-repos/subtitler or here.
- Code contributions
- The repository is public and accepts merge requests. See the README for local development setup instructions.