Affective TTS using this phenomenon. Synthesize speech from .txt or .srt and overlay it on videos or pictures.
- Has 134 affective English voices tuned for StyleTTS2. Supports single-voice TTS for foreign languages via MMS.
- A beta version of this tool for TTS and audio soundscapes is built here
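Since the tool accepts .srt subtitles as input, it may help to see what such a file contains. The following is a minimal sketch of reading SRT cues into (start, end, text) tuples. It is not the parser used by this repository; the names `Cue` and `parse_srt` are hypothetical and exist only for illustration.

```python
import re
from typing import List, NamedTuple


class Cue(NamedTuple):
    """One subtitle cue (hypothetical helper type, not part of this repo)."""
    start: float  # seconds from the beginning of the video
    end: float    # seconds from the beginning of the video
    text: str


# SRT timestamps look like 00:01:02,345 (some files use '.' instead of ',').
_TIME = re.compile(r"(\d+):(\d{2}):(\d{2})[,.](\d{3})")


def _to_seconds(stamp: str) -> float:
    match = _TIME.fullmatch(stamp.strip())
    if match is None:
        raise ValueError(f"bad SRT timestamp: {stamp!r}")
    hours, minutes, seconds, millis = (int(g) for g in match.groups())
    return hours * 3600 + minutes * 60 + seconds + millis / 1000


def parse_srt(content: str) -> List[Cue]:
    """Split SRT text into cues; cues are separated by blank lines."""
    cues = []
    normalized = content.replace("\r\n", "\n").strip()
    for block in re.split(r"\n\s*\n", normalized):
        lines = [ln for ln in block.split("\n") if ln.strip()]
        for i, line in enumerate(lines):
            if "-->" in line:
                start, end = line.split("-->")
                # Ignore optional position hints after the end timestamp.
                end = end.split()[0]
                text = " ".join(ln.strip() for ln in lines[i + 1:])
                cues.append(Cue(_to_seconds(start), _to_seconds(end), text))
                break
    return cues
```

Each cue's start and end time indicate where a synthesized sentence would need to sit when the audio is overlaid on the video.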
virtualenv --python=python3 ~/.envs/.my_env
source ~/.envs/.my_env/bin/activate
cd shift/
pip install -r requirements.txt
Demo. Result saved as demo_affect.wav
CUDA_DEVICE_ORDER=PCI_BUS_ID HF_HOME=./hf_home CUDA_VISIBLE_DEVICES=0 python demo.py
Run the Flask api.py in a tmux session
CUDA_DEVICE_ORDER=PCI_BUS_ID HF_HOME=./hf_home CUDA_VISIBLE_DEVICES=0 python api.py
The examples below need api.py to be already running. If api.py runs on a different machine, use the IP shown in the terminal of api.py.
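Before running the examples, it can be useful to confirm that the machine running api.py accepts connections. The sketch below only tests whether a TCP port is open; it does not call any endpoint of api.py. The host and port in the usage lines are assumptions (5000 is Flask's default port, but api.py may use another), so replace them with the address printed in the api.py terminal.

```python
import socket


def is_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolvable hosts.
        return False


if __name__ == "__main__":
    # Assumed values: use the IP and port shown in the api.py terminal.
    print(is_reachable("127.0.0.1", 5000))
```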
Text To Speech
# Basic TTS - See Available Voices Above - saves .wav in ./out
python tts.py --text assets/LLM_description.txt --voice "en_US/m-ailabs_low#mary_ann"
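To compare how the same text sounds in several voices, the command above can be repeated in a loop. This is a minimal sketch that uses only the `--text` and `--voice` flags and the voice names that appear in this README; the helper `build_command` is hypothetical. How tts.py names its files in ./out is not stated here, so check whether successive runs overwrite each other.

```python
import shlex
import subprocess
import sys
from typing import List

# Voice identifiers taken from the examples in this README.
VOICES = [
    "en_US/m-ailabs_low#mary_ann",
    "en_US/vctk_low#p306",
    "en_US/cmu-arctic_low#jmk",
]


def build_command(text_path: str, voice: str) -> List[str]:
    """Build the argument list for one tts.py call (hypothetical helper)."""
    return [sys.executable, "tts.py", "--text", text_path, "--voice", voice]


def main(text_path: str = "assets/LLM_description.txt", dry_run: bool = True) -> None:
    for voice in VOICES:
        cmd = build_command(text_path, voice)
        # Print a shell-safe version of the command for inspection.
        print(shlex.join(cmd))
        if not dry_run:
            # Needs api.py to be running, as for the other examples.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    main()
```

With `dry_run=True` the script only prints the commands; set it to `False` to actually synthesize.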
Native Voice to English (Affective) TTS
python tts.py --voice "en_US/m-ailabs_low#mary_ann" --video assets/anbpr.webm --text assets/anbpr.en.srt
Native Voice To Romanian TTS
python tts.py --voice romanian --video assets/anbpr.webm --text assets/anbpr.ro.srt
Native Voice To English (Affective) TTS
python tts.py --voice "en_US/vctk_low#p306" --text assets/head_of_fortuna_en.srt --video assets/head_of_fortuna.mp4
Image To Speech
# Video narrating an image
python tts.py --text assets/LLM_description.txt --image assets/image_from_T31.jpg --voice "en_US/cmu-arctic_low#jmk"