Getting Started

Get VideoCaptioner up and running on your machine in minutes. This guide covers installation, basic setup, and your first subtitle generation.

System Requirements

Component   Minimum                      Recommended
Windows     Windows 10 (64-bit)          Windows 11
macOS       macOS 10.15+                 macOS 12+
Linux       Ubuntu 20.04+ / Debian 11+   Ubuntu 22.04+
Python      3.10+                        3.11+
RAM         4 GB                         8 GB+ (for local Whisper)
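
To check the Python minimum listed above before installing, a small shell sketch like the following may help. The helper name version_at_least is an illustration only and is not part of VideoCaptioner. It compares version fields numerically, because a plain string comparison would rank 3.9 above 3.10.

```shell
# Sketch: succeed when a MAJOR.MINOR[.PATCH] version meets a minimum.
# version_at_least is a hypothetical helper, not a VideoCaptioner command.
version_at_least() {
    # $1 = version found, $2 = minimum required
    have_major=${1%%.*}; have_rest=${1#*.}; have_minor=${have_rest%%.*}
    want_major=${2%%.*}; want_rest=${2#*.}; want_minor=${want_rest%%.*}
    if [ "$have_major" -ne "$want_major" ]; then
        [ "$have_major" -gt "$want_major" ]
        return
    fi
    [ "$have_minor" -ge "$want_minor" ]
}

if command -v python3 >/dev/null 2>&1; then
    # `python3 --version` prints a line such as "Python 3.11.4".
    py_version="$(python3 --version 2>&1)"
    py_version="${py_version#Python }"
    if version_at_least "$py_version" "3.10"; then
        echo "Python $py_version meets the 3.10+ minimum"
    else
        echo "Python $py_version is older than 3.10"
    fi
else
    echo "python3 was not found on PATH"
fi
```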

Installation

Option 1: pip install (Recommended)

The simplest way to get started is to install the CLI tool directly with pip:

pip install videocaptioner

For the GUI desktop application:

pip install "videocaptioner[gui]"

Option 2: Windows Installer

Download the standalone installer (~60 MB) from the GitHub Releases page. All dependencies are bundled — just install and run.

Option 3: macOS / Linux Script

git clone https://github.com/WEIFENG2333/VideoCaptioner.git
cd VideoCaptioner
chmod +x run.sh
./run.sh

The script auto-detects your Python environment, creates a virtualenv, installs dependencies, and checks for FFmpeg and aria2.
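
If you want to confirm the external tools yourself, the dependency check described above can be approximated with a few lines of shell. This is a sketch and not the contents of run.sh; missing_tools is a hypothetical helper, and it assumes the aria2 package provides its usual aria2c executable.

```shell
# Sketch of a dependency check similar in spirit to the one run.sh performs.
# This is NOT the actual run.sh; missing_tools is a hypothetical helper.
missing_tools() {
    # Print each named tool that cannot be found on PATH, one per line.
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# aria2 installs its command-line client as `aria2c`.
missing="$(missing_tools ffmpeg aria2c)"
if [ -n "$missing" ]; then
    # Left unquoted on purpose so the names print on a single line.
    echo "Missing tools:" $missing
else
    echo "FFmpeg and aria2 were found"
fi
```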

Option 4: Manual Setup

macOS (Homebrew):

brew install ffmpeg aria2 [email protected]
git clone https://github.com/WEIFENG2333/VideoCaptioner.git
cd VideoCaptioner
python3 -m venv venv && source venv/bin/activate
pip install -r requirements.txt
python main.py

Ubuntu / Debian:

sudo apt update && sudo apt install ffmpeg aria2 python3-venv
git clone https://github.com/WEIFENG2333/VideoCaptioner.git
cd VideoCaptioner
python3 -m venv venv && source venv/bin/activate
pip install -r requirements.txt
python main.py

CLI Commands

Once installed, the videocaptioner command provides the following subcommands:

Command      Description
transcribe   Speech-to-subtitle. Supports faster-whisper, whisper-api, bijian (free), jianying (free)
subtitle     Optimize & translate subtitles via LLM, Bing (free), or Google (free)
synthesize   Burn subtitles into video file
process      End-to-end: transcribe → optimize → translate → synthesize
download     Download videos from YouTube, Bilibili, etc.
config       Manage settings (show, set, get, path, init)

Your First Subtitle

The fastest way to generate subtitles, with no API key needed:

# Transcribe using free Bijian ASR
videocaptioner transcribe video.mp4 --asr bijian

# Translate to English using free Bing
videocaptioner subtitle output.srt --translator bing --target-language en

# Or do everything in one command
videocaptioner process video.mp4 --target-language en
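
For more than one video, the transcribe command above can be wrapped in a loop. The following is a minimal sketch: transcribe_all is a hypothetical helper name, and the only VideoCaptioner invocation it uses is the `transcribe ... --asr bijian` form shown above.

```shell
# Sketch: run the free Bijian transcription over every .mp4 in a directory.
# transcribe_all is a hypothetical helper, not a VideoCaptioner subcommand.
transcribe_all() {
    dir="${1:-.}"
    for video in "$dir"/*.mp4; do
        # With no matches the pattern stays literal, so skip it.
        [ -e "$video" ] || continue
        echo "Transcribing $video"
        videocaptioner transcribe "$video" --asr bijian || echo "Failed: $video" >&2
    done
}
```

Calling `transcribe_all ~/Videos` would process each .mp4 in that folder in turn and report any file that fails without stopping the loop.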

Tip: Free vs API-powered

You can use VideoCaptioner completely free with Bijian ASR + Bing translation. For higher-quality results, configure an LLM API, which costs under $0.002 for a 14-minute video. See the LLM Configuration guide.

GUI Desktop Application

Launch the graphical interface by running videocaptioner without any arguments:

videocaptioner

The GUI provides a visual workflow with drag-and-drop, subtitle preview, and one-click processing.

Basic Configuration

View your current configuration:

videocaptioner config show

Set a value:

videocaptioner config set llm.api_key sk-your-key-here
videocaptioner config set llm.api_base https://api.openai.com/v1
videocaptioner config set llm.model gpt-4o-mini

Configuration priority

From highest to lowest priority: CLI arguments > Environment variables (VIDEOCAPTIONER_*) > Config file > Defaults
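
The precedence order can be pictured as "the first source that supplies a value wins". The sketch below illustrates that rule in shell. It is not VideoCaptioner's own resolution code: resolve_setting is a hypothetical helper, and VIDEOCAPTIONER_LLM_MODEL is an assumed variable name that merely follows the VIDEOCAPTIONER_* pattern.

```shell
# Sketch of the precedence rule: the first non-empty source wins.
# resolve_setting is a hypothetical helper, not part of VideoCaptioner.
resolve_setting() {
    # $1 = CLI value, $2 = environment value, $3 = config-file value, $4 = default
    for candidate in "$1" "$2" "$3" "$4"; do
        if [ -n "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# No CLI value is given here, so the environment variable (if set) takes
# priority over the config-file value. The variable name is an assumption.
resolve_setting "" "${VIDEOCAPTIONER_LLM_MODEL:-}" "gpt-4o-mini" "built-in-default"
```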

What's Next