GitHub - audeering/shift: Contribution to https://shift-europe.eu

Affective TTS using this phenomenon. Synthesizes speech from .txt or .srt files and overlays it on videos or images.

Includes 134 affective English voices tuned for StyleTTS2. Supports single-voice TTS for other languages via MMS.
A beta version of this tool for TTS & audio soundscapes is built here

Install

virtualenv --python=python3 ~/.envs/.my_env
source ~/.envs/.my_env/bin/activate
cd shift/
pip install -r requirements.txt

Demo. Result saved as demo_affect.wav

CUDA_DEVICE_ORDER=PCI_BUS_ID HF_HOME=./hf_home CUDA_VISIBLE_DEVICES=0 python demo.py

Run the Flask api.py in a tmux session

CUDA_DEVICE_ORDER=PCI_BUS_ID HF_HOME=./hf_home CUDA_VISIBLE_DEVICES=0 python api.py

Examples below need api.py to be already running. If api.py runs on a different machine, use the IP shown in the terminal of api.py.
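
Since the examples depend on api.py being up, it can help to confirm the server is reachable before launching a long synthesis job. The following is a minimal sketch using only the Python standard library; `api_is_reachable` is a hypothetical helper, not part of this repo. The port is an assumption you must supply yourself: Flask's development server defaults to 5000, but api.py may be configured differently, so use the address printed in its terminal.

```python
import socket


def api_is_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper: it only checks that something is listening,
    not that the listener is actually api.py.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host, and timeouts.
        return False
```

For example, `api_is_reachable("127.0.0.1", 5000)` checks a server on the local machine, assuming the default Flask port.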

Text To Speech

# Basic TTS - See Available Voices Above - saves .wav in ./out
python tts.py --text assets/LLM_description.txt --voice "en_US/m-ailabs_low#mary_ann"

Native Voice to English (Affective) TTS

python tts.py --voice "en_US/m-ailabs_low#mary_ann"  --video assets/anbpr.webm --text assets/anbpr.en.srt

Native Voice to Romanian TTS

python tts.py --voice romanian --video assets/anbpr.webm --text assets/anbpr.ro.srt

Native Voice to English (Affective) TTS

python tts.py --voice "en_US/vctk_low#p306" --text assets/head_of_fortuna_en.srt --video assets/head_of_fortuna.mp4
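
The video examples above pass an .srt subtitle file as --text. To illustrate what such a file contains (numbered cues, each with a start/end timestamp and one or more lines of text), here is a minimal standalone parser. It is a sketch for inspecting subtitle files, not the parser this repo uses; `Cue` and `parse_srt` are hypothetical names.

```python
import re
from typing import List, NamedTuple


class Cue(NamedTuple):
    start: float  # cue start, in seconds
    end: float    # cue end, in seconds
    text: str     # cue text, with line breaks joined by spaces


# Matches "HH:MM:SS,mmm --> HH:MM:SS,mmm" (a '.' before the millis is tolerated).
_TIME = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


def _seconds(h: str, m: str, s: str, ms: str) -> float:
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000.0


def parse_srt(content: str) -> List[Cue]:
    """Split SRT text into cues; blocks without a timestamp line are skipped."""
    cues = []
    normalized = content.replace("\r\n", "\n").strip()
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", normalized):
        lines = block.split("\n")
        for i, line in enumerate(lines):
            match = _TIME.search(line)
            if match:
                g = match.groups()
                text = " ".join(l.strip() for l in lines[i + 1:] if l.strip())
                cues.append(Cue(_seconds(*g[:4]), _seconds(*g[4:]), text))
                break
    return cues
```

Reading a file with `parse_srt(open("assets/anbpr.en.srt", encoding="utf-8").read())` would give a list of cues whose timings show where each synthesized sentence is expected to land in the video.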

Image to Speech

# Video narrating an image
python tts.py --text assets/LLM_description.txt --image assets/image_from_T31.jpg --voice "en_US/cmu-arctic_low#jmk"
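
When running many of the commands above in a batch, assembling the argument list in Python avoids shell-quoting problems with voice names that contain `#`. The sketch below uses only the flags that appear in the examples (--text, --voice, --video, --image); `build_tts_command` is a hypothetical wrapper, and whether tts.py accepts --video and --image together is not stated here, so the function simply passes through whatever it is given.

```python
import sys
from typing import List, Optional


def build_tts_command(text: str, voice: str,
                      video: Optional[str] = None,
                      image: Optional[str] = None) -> List[str]:
    """Build the argv list for tts.py from the flags shown in the examples."""
    # sys.executable keeps the call inside the active virtualenv.
    cmd = [sys.executable, "tts.py", "--text", text, "--voice", voice]
    if video:
        cmd += ["--video", video]
    if image:
        cmd += ["--image", image]
    return cmd
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` from inside the shift/ directory, with api.py already running.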
