
OpenSpeaks/Tools

From Meta, a Wikimedia project coordination wiki

The OpenSpeaks open-source tools are offline-first and built for language archivists, including Wikimedians, to support them in documenting languages. All code will be released under an open-source license (e.g., MIT or GPL), fully public, and free to use. The focus is simplicity, offline capability, and cross-platform access (desktop-first, browser-based where possible). An internet connection is required only for the optional upload functions. The need for these tools within the community was identified during a previous Rapid Grant-supported project and was documented in this research paper.

Timeline: July 2025–June 2026

Tools
Prototype of OpenSpeaks Subtitler in action

Purpose

Create and edit subtitles quickly, especially for oral history and language documentation materials.

Core Features

  1. Load Media
    • Accepts common video/audio formats (MP4, MOV, MP3, WAV, etc.).
    • Offline loading with HTML5-based player.
  2. Dummy Subtitle Generation
    • Parameters:
      • Silence threshold (in dB).
      • Minimum silence length (in milliseconds/seconds).
    • Automatically segments media into draft subtitle chunks.
  3. Manual Editing Interface
    • Full timeline view + zoomed-in view for current subtitle (with 2–3 preceding/following subtitles visible).
    • Editable text overlay directly on video player.
    • Real-time update of subtitles in playback.
  4. Language Selection
    • Dropdown with ISO 639 language list.
  5. Export
    • Save as SRT or VTT.
    • UTF-8 encoding.
  6. Optional Wikimedia Upload
    • Login via OAuth using MediaWiki credentials.
    • Upload generated subtitles to Wikimedia Commons.
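
The draft segmentation in step 2 could be sketched roughly as follows. This is a minimal illustration, not the project's implementation: the `Segment` type, the `segmentBySilence` function, and the fixed 10 ms analysis window are all assumptions. It expects mono PCM samples as floats in [-1, 1], which in a browser could come from the Web Audio API via `AudioBuffer.getChannelData(0)`.

```typescript
// Illustrative sketch; names and the windowing scheme are assumptions,
// not taken from the OpenSpeaks code.
interface Segment {
  startSec: number;
  endSec: number;
}

function segmentBySilence(
  samples: Float32Array,
  sampleRate: number,
  thresholdDb: number,   // e.g. -40 (dBFS); windows below this count as silent
  minSilenceMs: number,  // silent runs shorter than this do not split a segment
  windowMs: number = 10  // analysis window length (assumed default)
): Segment[] {
  const windowSize = Math.max(1, Math.round((sampleRate * windowMs) / 1000));
  const minSilentWindows = Math.max(1, Math.ceil(minSilenceMs / windowMs));
  const windowCount = Math.ceil(samples.length / windowSize);
  const segments: Segment[] = [];

  // Convert a run of window indices into a segment measured in seconds.
  const toSegment = (firstWindow: number, lastWindow: number): Segment => ({
    startSec: (firstWindow * windowSize) / sampleRate,
    endSec: Math.min((lastWindow + 1) * windowSize, samples.length) / sampleRate,
  });

  let segStart = -1; // first window of the open segment, -1 when none is open
  let lastLoud = -1; // most recent non-silent window in the open segment
  let silentRun = 0; // consecutive silent windows since lastLoud

  for (let w = 0; w < windowCount; w++) {
    const from = w * windowSize;
    const to = Math.min(from + windowSize, samples.length);
    let sumSquares = 0;
    for (let i = from; i < to; i++) {
      const s = samples[i] ?? 0;
      sumSquares += s * s;
    }
    const rms = Math.sqrt(sumSquares / (to - from));
    const levelDb = rms > 0 ? (20 * Math.log(rms)) / Math.LN10 : -Infinity;

    if (levelDb >= thresholdDb) {
      if (segStart < 0) segStart = w;
      lastLoud = w;
      silentRun = 0;
    } else if (segStart >= 0) {
      silentRun += 1;
      if (silentRun >= minSilentWindows) {
        // The pause is long enough: close the segment at the last loud window.
        segments.push(toSegment(segStart, lastLoud));
        segStart = -1;
        silentRun = 0;
      }
    }
  }
  if (segStart >= 0) segments.push(toSegment(segStart, lastLoud));
  return segments;
}
```

Each returned segment would become one empty draft subtitle for the editor to fill in. Raising the minimum silence length merges segments separated by short pauses, which is probably the parameter users would tune first.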

Tech Suggestions

  • Frontend: HTML5, JavaScript (possibly React or vanilla JS), Web Audio API for silence detection.
  • Backend: None (pure client-side, except Wikimedia API calls).
  • Media handling: ffmpeg.wasm for browser-based media analysis.
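
Because the tool is planned as pure client-side code, the SRT and VTT export listed under Core Features could be plain string building. The sketch below is one possible shape; the `Cue` type and the function names are hypothetical. SRT uses a comma before the milliseconds and numbered cues, while WebVTT uses a period and starts with a `WEBVTT` header line.

```typescript
// Illustrative sketch; the Cue type and function names are assumptions.
interface Cue {
  startSec: number;
  endSec: number;
  text: string;
}

// Left-pad a non-negative integer with zeros to the given width.
function zeroPad(value: number, width: number): string {
  let s = String(value);
  while (s.length < width) s = "0" + s;
  return s;
}

// Format seconds as HH:MM:SS<sep>mmm ("," for SRT, "." for WebVTT).
function formatTimestamp(seconds: number, msSeparator: "," | "."): string {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const totalSec = Math.floor(totalMs / 1000);
  const s = totalSec % 60;
  const m = Math.floor(totalSec / 60) % 60;
  const h = Math.floor(totalSec / 3600);
  return zeroPad(h, 2) + ":" + zeroPad(m, 2) + ":" + zeroPad(s, 2) + msSeparator + zeroPad(ms, 3);
}

function toSrt(cues: Cue[]): string {
  return cues
    .map((cue, index) =>
      (index + 1) + "\n" +
      formatTimestamp(cue.startSec, ",") + " --> " + formatTimestamp(cue.endSec, ",") + "\n" +
      cue.text + "\n")
    .join("\n");
}

function toVtt(cues: Cue[]): string {
  const body = cues
    .map((cue) =>
      formatTimestamp(cue.startSec, ".") + " --> " + formatTimestamp(cue.endSec, ".") + "\n" +
      cue.text + "\n")
    .join("\n");
  return "WEBVTT\n\n" + body;
}
```

In a browser, the resulting string could be wrapped in a `Blob` for download; the `Blob` constructor encodes string parts as UTF-8, which matches the encoding requirement above.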
Update

Media Metadata Viewer & Compress Helper

Screenshot of a prototype of OpenSpeaks Media Size Optimizer (see working prototype)

Purpose

Quickly inspect media properties and compress files for sharing/editing.

Core Features

  1. Load Media & Display Metadata
    • Key data points:
      • Duration
      • Resolution (video)
      • Frame rate (fps)
      • Audio sample rate (Hz)
      • Bitrate
      • Codec info
  2. Compression Targeting
    • User enters desired output file size.
    • Tool calculates required bitrate for re-encoding.
  3. Compression Execution
    • Option to export compressed file.
    • Use ffmpeg.wasm or local ffmpeg wrapper.
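
The bitrate calculation in step 2 is simple arithmetic: the target size in bits divided by the duration gives the total bitrate, and subtracting the audio bitrate leaves the video budget. The sketch below assumes 1 MB = 1,000,000 bytes, a default audio bitrate of 128 kbps, and a 2% allowance for container overhead; all three values, and the names `BitratePlan` and `planBitrate`, are assumptions for illustration only.

```typescript
// Illustrative sketch; names and default values are assumptions.
interface BitratePlan {
  totalKbps: number;
  videoKbps: number;
  audioKbps: number;
}

function planBitrate(
  targetSizeMB: number,            // desired output size, 1 MB = 1,000,000 bytes
  durationSec: number,             // media duration taken from the metadata view
  audioKbps: number = 128,         // assumed audio bitrate
  overheadFraction: number = 0.02  // assumed share reserved for container overhead
): BitratePlan {
  if (durationSec <= 0) {
    throw new Error("durationSec must be positive");
  }
  const usableBits = targetSizeMB * 1e6 * 8 * (1 - overheadFraction);
  // Round down so the encoded file stays at or under the target size.
  const totalKbps = Math.floor(usableBits / durationSec / 1000);
  const videoKbps = totalKbps - audioKbps;
  if (videoKbps <= 0) {
    throw new Error("Target size is too small for this duration and audio bitrate");
  }
  return { totalKbps, videoKbps, audioKbps };
}
```

For a 10-minute file and a 100 MB target, this yields a total of 1306 kbps, of which 1178 kbps is left for video. Those figures could then be passed to ffmpeg through its `-b:v` and `-b:a` options.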

Tech Suggestions

  • Frontend: HTML5/JavaScript UI.
  • Compression: ffmpeg.wasm or native ffmpeg calls in Electron.
See working prototype

Media Duration Calculator

Prototype of OpenSpeaks Folder Media Analyzer

Purpose

Batch calculate total media duration of audio and video files inside folders for project planning/budgeting.

Core Features

  1. Folder Input
    • User points to a folder containing media files.
  2. Duration Summary
    • Outputs:
      • Total duration of all media.
      • Total duration of audio-only files.
      • Total duration of video-only files.
  3. Output Format
    • Display results on screen.
    • Option to export CSV or plain text report.
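
Once per-file durations are known, the summary and report reduce to a small aggregation. The sketch below assumes the durations have already been extracted (for example with ffprobe, as suggested under Tech Suggestions) and that each file has been classified as audio or video; the `MediaEntry` shape, the function names, and the CSV column layout are hypothetical.

```typescript
// Illustrative sketch; types, names, and the CSV layout are assumptions.
interface MediaEntry {
  name: string;
  durationSec: number;
  kind: "audio" | "video";
}

interface DurationSummary {
  totalSec: number;
  audioSec: number;
  videoSec: number;
}

function summarizeDurations(entries: MediaEntry[]): DurationSummary {
  let audioSec = 0;
  let videoSec = 0;
  for (const entry of entries) {
    if (entry.kind === "audio") audioSec += entry.durationSec;
    else videoSec += entry.durationSec;
  }
  return { totalSec: audioSec + videoSec, audioSec, videoSec };
}

// Format a duration as HH:MM:SS, rounded to the nearest second.
function formatHms(seconds: number): string {
  const total = Math.round(seconds);
  const pad2 = (n: number): string => (n < 10 ? "0" + n : String(n));
  const h = Math.floor(total / 3600);
  const m = Math.floor((total % 3600) / 60);
  const s = total % 60;
  return pad2(h) + ":" + pad2(m) + ":" + pad2(s);
}

function toCsvReport(summary: DurationSummary): string {
  const rows: Array<[string, number]> = [
    ["all", summary.totalSec],
    ["audio", summary.audioSec],
    ["video", summary.videoSec],
  ];
  const lines = rows.map((row) => row[0] + "," + row[1] + "," + formatHms(row[1]));
  return ["category,seconds,duration"].concat(lines).join("\n") + "\n";
}
```

One common way to obtain a single file's duration is `ffprobe -v error -show_entries format=duration -of default=noprint_wrappers=1:nokey=1 <file>`, which prints the duration in seconds.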

Tech Suggestions

  • Node.js or Python backend for folder scanning + ffprobe (from FFmpeg) for duration extraction.
  • Simple HTML/JS frontend.
Test working prototype

Multimedia Organization Tool

Prototype of Multimedia Organization Tool

Purpose

Organise, categorise, tag, and batch-rename multimedia files (video, audio, image) inside a folder using structured naming conventions for production workflows.

Core Features

  1. Folder Input
    • User selects a local folder containing media files via the File System Access API.
    • Scans and lists all video, audio, and image files.
  2. File Categorisation & Tagging
    • Assign categories (e.g., A-roll, B-roll) to files — categories are editable.
    • Apply supplementary tags (e.g., Establishing, Close-up) independently of categories.
    • Filter file list by category or tag.
    • Inline notes per file within the file list view.
  3. Media Preview
    • Built-in player for video, audio, and image preview.
    • Displays current filename during playback.
    • Audio files are visually distinguished with a different colour.
  4. Naming Convention Builder
    • Configurable pattern: {language}-{type}-{names}-{subject}_{sequence}.{ext}
    • Language prefix (e.g., en, hi).
    • Type mapped to category (e.g., a for A-roll, b for B-roll).
    • Speaker/participant names joined with hyphens; spaces converted to underscores.
    • Subject/topic in Sentence Case with spaces as underscores.
    • Auto-appended sequential numbering (_01, _02, etc.) to prevent overwrites.
    • Example: en-a-Speaker_1-Speaker_2-Subject_01.mov
  5. Export & Rename
    • Export organisation log as CSV or TSV with old name, new name, notes, category, type, and size.
    • Batch rename files directly in the filesystem (requires browser permission).
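
The naming convention above can be expressed as a small pure function. The sketch below is an assumption-laden illustration rather than the prototype's code: the `NameParts` type and all function names are hypothetical, and "Sentence Case" is interpreted here as capitalising only the first letter while leaving the rest as typed, so that proper nouns are not lowercased.

```typescript
// Illustrative sketch; type and function names are assumptions.
interface NameParts {
  language: string; // language prefix, e.g. "en" or "hi"
  type: string;     // category code, e.g. "a" for A-roll, "b" for B-roll
  names: string[];  // speaker/participant names
  subject: string;  // subject or topic
  ext: string;      // file extension, with or without a leading dot
}

// Trim and replace each run of whitespace with a single underscore.
function underscoreSpaces(text: string): string {
  return text.trim().replace(/\s+/g, "_");
}

// Assumed reading of "Sentence Case": upper-case the first letter only.
function capitaliseFirst(text: string): string {
  const trimmed = text.trim();
  return trimmed.charAt(0).toUpperCase() + trimmed.slice(1);
}

function buildFileName(parts: NameParts, sequence: number): string {
  const names = parts.names.map((name) => underscoreSpaces(name)).join("-");
  const subject = underscoreSpaces(capitaliseFirst(parts.subject));
  const seq = sequence < 10 ? "0" + sequence : String(sequence);
  const ext = parts.ext.replace(/^\./, "");
  return parts.language + "-" + parts.type + "-" + names + "-" + subject + "_" + seq + "." + ext;
}

// Increase the sequence number until the name does not clash with an existing file.
function nextFreeName(parts: NameParts, existingNames: string[]): string {
  let sequence = 1;
  let candidate = buildFileName(parts, sequence);
  while (existingNames.indexOf(candidate) !== -1) {
    sequence += 1;
    candidate = buildFileName(parts, sequence);
  }
  return candidate;
}
```

With the language `en`, type `a`, names `Speaker 1` and `Speaker 2`, subject `subject`, and extension `mov`, `buildFileName` with sequence 1 reproduces the example name `en-a-Speaker_1-Speaker_2-Subject_01.mov`. Keeping the builder free of file-system calls would let the same function drive both the CSV/TSV log and the actual rename step.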

Tech Stack

  • Self-contained single HTML file — works fully offline.
  • File System Access API for local folder access (Chromium-based browsers).
  • Vanilla HTML, CSS, and JavaScript — no external dependencies.
  • Also available as React/TypeScript app (Vite + Tailwind CSS + shadcn/ui).
Test working prototype

Needs

  • Tool to ingest metadata in formats accepted by other archives and export it as Wikimedia templates

Inspirations

  • LinguaLibre Dictionary, for creating a dictionary using recordings on Commons: an excellent way to organise community-based recording sessions for those interested in building a dictionary.