SubErase-Translate-Embed is an open-source tool for making video content accessible across languages. It combines OCR-based subtitle recognition, subtitle erasure, translation, and subtitle embedding to process the subtitles in short films automatically, so viewers can watch the same content in other languages.
The project is a one-stop solution for translating a video's subtitles into other languages and re-embedding them. Typical uses include multilingual education, international film production, and entertainment for a global audience.
- Subtitle Recognition: Uses OCR technology (based on PaddleOCR) to extract subtitles from videos.
- Subtitle Erasure: Automatically erases the original subtitles from the video using STTN (Spatial-Temporal Transformer Network), a video inpainting model.
- Subtitle Translation: Utilizes OpenAI's ChatGPT API or other translation services to translate the extracted subtitles into the target language.
- Subtitle Embedding: Re-embeds the translated subtitles into the video, generating a new multilingual version.
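The handoff between recognition, translation, and embedding can be pictured with a small sketch. Everything here is an assumption made for illustration: the `SubtitleLine` class, its field names, and `translate_lines` are hypothetical and are not taken from the project's code. The point is only that translation replaces the text while leaving the timings alone, so the embedding step can place each translated line where the original one was.

```python
from dataclasses import dataclass, replace
from typing import Callable, List


@dataclass(frozen=True)
class SubtitleLine:
    """One recognized subtitle; field names are illustrative, not the project's."""
    start: float  # seconds from the start of the video
    end: float    # seconds from the start of the video
    text: str


def translate_lines(
    lines: List[SubtitleLine], translate: Callable[[str], str]
) -> List[SubtitleLine]:
    """Translate only the text of each line, keeping its start and end times."""
    return [replace(line, text=translate(line.text)) for line in lines]


if __name__ == "__main__":
    recognized = [SubtitleLine(0.0, 1.5, "你好"), SubtitleLine(1.5, 3.0, "谢谢")]
    # A dictionary lookup stands in for the real translation service.
    fake_translate = {"你好": "Hello", "谢谢": "Thank you"}.get
    for line in translate_lines(recognized, fake_translate):
        print(f"{line.start:.1f}-{line.end:.1f}: {line.text}")
```

Passing the translator in as a plain function keeps the sketch independent of any particular translation service.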
To use SubErase-Translate-Embed, follow these steps:
- Clone the project code:
    git clone https://github.com/chenwr727/SubErase-Translate-Embed.git
    cd SubErase-Translate-Embed
    git clone https://github.com/researchmm/STTN.git
- Install dependencies:
    conda create -n ste python=3.10
    conda activate ste
    pip install paddlepaddle-gpu==2.6.1.post120 -f https://www.paddlepaddle.org.cn/whl/linux/mkl/avx/stable.html
    pip install -r requirements.txt
- Download models:
    Save the model files to the ./models directory with the following structure:
    models
    ├── ch_PP-OCRv4_det_server_infer
    ├── ch_PP-OCRv4_rec_server_infer
    └── sttn.pth
- Configuration:
    cp config-template.yaml config.yaml
- Application Installation:
    sudo apt install imagemagick
    conda install -c conda-forge ffmpeg
    conda install -c conda-forge gcc=12.2.0
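Once the steps above are done, a short preflight script can confirm that nothing was missed. The sketch below is not part of the project. It checks the three entries listed under ./models, the presence of config.yaml, and whether the external tools are on PATH. The executable names are assumptions: ImageMagick 6 typically installs `convert` and ImageMagick 7 installs `magick`, so either is accepted.

```python
import shutil
from pathlib import Path

# Entries the installation steps list under ./models.
EXPECTED_MODEL_ENTRIES = (
    "ch_PP-OCRv4_det_server_infer",
    "ch_PP-OCRv4_rec_server_infer",
    "sttn.pth",
)

# Tool name -> candidate executables (assumed names; any one match is enough).
REQUIRED_TOOLS = {
    "ffmpeg": ("ffmpeg",),
    "imagemagick": ("magick", "convert"),
    "gcc": ("gcc",),
}


def missing_model_entries(models_dir="models"):
    """Return the expected model entries that are absent from models_dir."""
    root = Path(models_dir)
    return [name for name in EXPECTED_MODEL_ENTRIES if not (root / name).exists()]


def missing_tools(which=shutil.which):
    """Return the tools for which no candidate executable is found on PATH."""
    return [
        tool
        for tool, candidates in REQUIRED_TOOLS.items()
        if not any(which(exe) for exe in candidates)
    ]


if __name__ == "__main__":
    problems = missing_model_entries() + missing_tools()
    if not Path("config.yaml").is_file():
        problems.append("config.yaml")
    print("Missing:", ", ".join(problems) if problems else "nothing")
```

Run it from the repository root, inside the `ste` conda environment, so that the relative paths and PATH match what main.py will see.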
Run the following command to recognize, erase, translate, and embed the subtitles in one pass:
    python main.py --video input_video.mp4 --language English
Here input_video.mp4 is the name of your video file, and English is the target translation language.
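To process several videos, the same command can be driven from a small wrapper. This is a sketch rather than a feature of the project: `build_command` and `process_folder` are hypothetical helpers, and the only flags used are the `--video` and `--language` options shown above.

```python
import subprocess
import sys
from pathlib import Path


def build_command(video, language, python=sys.executable):
    """Build the argument list for one main.py run, using the flags shown above."""
    return [python, "main.py", "--video", str(video), "--language", language]


def process_folder(folder, language):
    """Run main.py once per .mp4 file in folder, stopping at the first failure."""
    for video in sorted(Path(folder).glob("*.mp4")):
        # check=True raises CalledProcessError if main.py exits with an error.
        subprocess.run(build_command(video, language), check=True)
```

For example, calling `process_folder("videos", "English")` from the repository root would run the pipeline on every .mp4 file in a folder named videos, where the folder name is just a placeholder.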
- main.py: The main program entry point, responsible for managing the entire processing workflow.
- modules/: Contains various functional modules (OCR, subtitle erasure, translation, embedding).
- utils/: Contains general tools, such as logging and video processing utilities.
- config.yaml: Configuration file for setting language, video format, and other parameters.