English | 简体中文

SubErase-Translate-Embed

Project Overview

SubErase-Translate-Embed is an open-source tool for making video content accessible in other languages. It combines OCR, subtitle erasure, translation, and subtitle embedding to process the subtitles in short films automatically, so that viewers can watch the same content in a different language.

The project covers the full workflow for users who want to translate a video's subtitles and re-embed them. It suits scenarios such as multilingual education, international film production, and entertainment for a global audience.

Key Features

  • Subtitle Recognition: Uses OCR technology (based on PaddleOCR) to extract subtitles from videos.
  • Subtitle Erasure: Automatically erases the original subtitles in the video using STTN (Spatial-Temporal Transformer Network), a video inpainting model.
  • Subtitle Translation: Utilizes OpenAI's ChatGPT API or other translation services to translate the extracted subtitles into the target language.
  • Subtitle Embedding: Re-embeds the translated subtitles into the video, generating a new multilingual version.
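
As an illustration of the step between recognition and translation, per-frame OCR output generally needs to be collapsed into timed subtitle cues before it can be translated and re-embedded. The sketch below is not taken from this project: `Cue` and `merge_frame_texts` are hypothetical names, and it assumes that consecutive frames showing the same subtitle produce exactly the same string, which real OCR output may not guarantee.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    start: float  # seconds
    end: float    # seconds
    text: str


def merge_frame_texts(frame_texts, fps):
    """Collapse per-frame OCR strings into timed cues.

    frame_texts[i] is the text recognized in frame i ("" when no subtitle
    is visible). Consecutive frames with identical text become one cue.
    """
    cues = []
    current = None  # (start_frame, text) of the run being built
    for index, text in enumerate(frame_texts):
        text = text.strip()
        if current is not None and text == current[1]:
            continue  # the same subtitle is still on screen
        if current is not None and current[1]:
            # The previous run ended at this frame; emit it as a cue.
            cues.append(Cue(current[0] / fps, index / fps, current[1]))
        current = (index, text)
    if current is not None and current[1]:
        # Close a run that lasts until the final frame.
        cues.append(Cue(current[0] / fps, len(frame_texts) / fps, current[1]))
    return cues
```

A more robust variant would compare strings with a similarity threshold rather than exact equality, so that small OCR fluctuations do not split one subtitle into several cues.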

Installation Guide

To use SubErase-Translate-Embed, follow these steps:

  1. Clone the project, then clone STTN inside it:

    git clone https://github.com/chenwr727/SubErase-Translate-Embed.git
    cd SubErase-Translate-Embed
    git clone https://github.com/researchmm/STTN.git
  2. Install dependencies:

    conda create -n ste python=3.10
    conda activate ste
    pip install paddlepaddle-gpu==2.6.1.post120 -f https://www.paddlepaddle.org.cn/whl/linux/mkl/avx/stable.html
    pip install -r requirements.txt
  3. Download models:

    Save the model files to the ./models directory with the following structure:

    models
    ├── ch_PP-OCRv4_det_server_infer
    ├── ch_PP-OCRv4_rec_server_infer
    └── sttn.pth
    
  4. Configuration:

    cp config-template.yaml config.yaml
  5. Install system dependencies:

    sudo apt install imagemagick
    conda install -c conda-forge ffmpeg
    conda install -c conda-forge gcc=12.2.0
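
Before the first run, it can help to confirm that the steps above left everything in place. The following is a minimal sketch, not part of the project: `missing_models` and `missing_tools` are hypothetical helpers, the model names are the ones listed in step 3, and the ImageMagick binary name `convert` is an assumption (ImageMagick 7 installs `magick` instead).

```python
import shutil
from pathlib import Path

# Entries named in the models layout of step 3.
EXPECTED_MODELS = (
    "ch_PP-OCRv4_det_server_infer",
    "ch_PP-OCRv4_rec_server_infer",
    "sttn.pth",
)


def missing_models(models_dir="models"):
    """Return the expected model entries that are absent from models_dir."""
    root = Path(models_dir)
    return [name for name in EXPECTED_MODELS if not (root / name).exists()]


def missing_tools(tools=("ffmpeg", "convert")):
    """Return the command-line tools that are not found on PATH.

    "convert" is assumed to be the ImageMagick binary name here.
    """
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    for name in missing_models():
        print(f"missing model: models/{name}")
    for tool in missing_tools():
        print(f"missing tool: {tool}")
    if not Path("config.yaml").exists():
        print("missing config.yaml (copy it from config-template.yaml)")
```

Run it from the repository root with the `ste` environment active; no output means nothing was found missing.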

Usage

Run the following command to recognize, erase, translate, and re-embed the subtitles of a video in one pass:

python main.py --video input_video.mp4 --language English

Here, input_video.mp4 is your video file and English is the target translation language.
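
To process several videos, the same command can be wrapped in a small script. This is a sketch under assumptions rather than a feature of the project: `build_command` and `process_folder` are hypothetical helpers, only the `--video` and `--language` flags shown above are used, and the script is assumed to run from the repository root so that `main.py` resolves.

```python
import subprocess
import sys
from pathlib import Path


def build_command(video, language):
    """Build the documented main.py invocation for one video."""
    return [sys.executable, "main.py", "--video", str(video), "--language", language]


def process_folder(folder, language, pattern="*.mp4"):
    """Run main.py once per matching video and return the files that failed."""
    failed = []
    for video in sorted(Path(folder).glob(pattern)):
        result = subprocess.run(build_command(video, language))
        if result.returncode != 0:
            failed.append(video)
    return failed


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python batch.py <folder> <language>
    for video in process_folder(sys.argv[1], sys.argv[2]):
        print(f"failed: {video}")
```

Videos are processed one at a time, which is a deliberately conservative choice: running several jobs in parallel could exhaust GPU memory.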

Project Structure

  • main.py: The main program entry point, responsible for managing the entire processing workflow.
  • modules/: Contains various functional modules (OCR, subtitle erasure, translation, embedding).
  • utils/: Contains general tools, such as logging and video processing utilities.
  • config.yaml: Configuration file for setting language, video format, and other parameters.
