SHIFT TTS

Affective TTS using this phenomenon. Synthesizes speech from a .txt or .srt file and overlays it on a video or an image.

Available Voices

Listen to available voices!

Install

virtualenv --python=python3 ~/.envs/.my_env
source ~/.envs/.my_env/bin/activate
cd shift/
pip install -r requirements.txt

Run the demo. The result is saved as demo_affect.wav

CUDA_DEVICE_ORDER=PCI_BUS_ID HF_HOME=./hf_home CUDA_VISIBLE_DEVICES=0 python demo.py
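The three variables in front of the command configure the process environment: `CUDA_DEVICE_ORDER=PCI_BUS_ID` makes CUDA number GPUs in the same order as `nvidia-smi`, `CUDA_VISIBLE_DEVICES=0` exposes only GPU 0 to the process, and `HF_HOME=./hf_home` points the Hugging Face cache at a local folder. As a minimal sketch, the same environment could be built from Python before launching a script. The helper name `gpu_env` is hypothetical and not part of this repository.

```python
import os


def gpu_env(gpu_index: int = 0, hf_home: str = "./hf_home") -> dict:
    """Return a copy of the current environment with the GPU and cache settings applied.

    Hypothetical helper: it only mirrors the variables used in the shell command.
    """
    env = os.environ.copy()
    env["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"        # index GPUs like nvidia-smi does
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_index)   # expose a single GPU
    env["HF_HOME"] = hf_home                       # Hugging Face cache directory
    return env


# Possible usage (not executed here):
#   import subprocess, sys
#   subprocess.run([sys.executable, "demo.py"], env=gpu_env(0), check=True)
```

Passing a different `gpu_index` would select another GPU without editing the command line.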

API

Start the Flask server api.py, for example inside a tmux session so that it keeps running:

CUDA_DEVICE_ORDER=PCI_BUS_ID HF_HOME=./hf_home CUDA_VISIBLE_DEVICES=0 python api.py

Inference

The examples below require api.py to be running already. If api.py runs on a different machine, use the IP shown in the terminal of api.py.
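Before running the examples, it can help to confirm that the machine running api.py is reachable. The sketch below only tests whether a TCP connection can be opened. It does not call any endpoint of api.py. The host `127.0.0.1` and port `5000` (Flask's default) are assumptions: replace them with the values printed in the api.py terminal.

```python
import socket


def api_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolvable hosts.
        return False


# Possible usage, with an assumed host and port:
#   print(api_reachable("127.0.0.1", 5000))
```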

Text To Speech

# Basic TTS - See Available Voices Above - saves .wav in ./out
python tts.py --text assets/LLM_description.txt --voice "en_US/m-ailabs_low#mary_ann"

Listen to Various Generations

Native Voice to English (Affective) TTS

python tts.py --voice "en_US/m-ailabs_low#mary_ann"  --video assets/anbpr.webm --text assets/anbpr.en.srt

Native voice > TTS (en)
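The `--text` argument here is an .srt subtitle file: a sequence of blocks, each with an index, a `start --> end` timing line, and one or more caption lines. The sketch below shows how such a file can be read into timed segments. It is an illustration of the format only, not necessarily how tts.py parses it, and `parse_srt` is a hypothetical name.

```python
import re
from typing import List, Tuple

# Matches timestamps such as 00:01:02,345 (a dot is tolerated as the separator).
_TIME = re.compile(r"(\d+):(\d{2}):(\d{2})[,.](\d{3})")


def _seconds(stamp: str) -> float:
    """Convert an HH:MM:SS,mmm timestamp to seconds."""
    h, m, s, ms = map(int, _TIME.fullmatch(stamp.strip()).groups())
    return h * 3600 + m * 60 + s + ms / 1000.0


def parse_srt(text: str) -> List[Tuple[float, float, str]]:
    """Return (start_seconds, end_seconds, caption) for every subtitle block."""
    segments = []
    # Blocks are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.strip().splitlines()
        for i, line in enumerate(lines):
            if "-->" in line:
                start, end = line.split("-->")
                end = end.strip().split()[0]  # ignore optional position hints
                caption = " ".join(l.strip() for l in lines[i + 1:] if l.strip())
                segments.append((_seconds(start), _seconds(end), caption))
                break
    return segments


sample = (
    "1\n00:00:01,000 --> 00:00:03,500\nHello there.\n\n"
    "2\n00:00:04,000 --> 00:00:06,250\nSecond caption,\nsecond line.\n"
)
print(parse_srt(sample))
# [(1.0, 3.5, 'Hello there.'), (4.0, 6.25, 'Second caption, second line.')]
```

Each tuple gives the time window in which the caption is shown, which is the information needed to align synthesized speech with the video.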

Native Voice To Romanian TTS

python tts.py --voice romanian --video assets/anbpr.webm --text assets/anbpr.ro.srt

Native voice > TTS (ro)

Native Voice To English (Affective) TTS

python tts.py --voice "en_US/vctk_low#p306" --text assets/head_of_fortuna_en.srt --video assets/head_of_fortuna.mp4

Review demo SHIFT

Img To Speech

# Video narrating an image
python tts.py --text assets/LLM_description.txt --image assets/image_from_T31.jpg --voice "en_US/cmu-arctic_low#jmk"

Captions To Video