text-to-audio-latent-diffusion
Updated Aug 25, 2023 - Python
Soundstorm is a cutting-edge AI-powered audio manipulation application designed to provide a rich yet simplified experience for sound designers, algorithmic composers, and experimental audio enthusiasts. From sample pack creation and algorithmic composition to AI text-to-audio and onscreen ChatGPT, Soundstorm is a sonic powerhouse.
An audiobook sound effect generator that transforms SRT files into immersive audio experiences. It parses SRT files, uses ChatGPT to create sound effect prompts, generates sounds via the ElevenLabs API, and syncs the audio on an MP3 timeline.
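The SRT parsing step described above could look roughly like the following. This is a minimal sketch using only the Python standard library, not the project's actual code: the names `Cue`, `parse_srt`, and `SAMPLE` are hypothetical, and the ChatGPT and ElevenLabs steps are left out because their usage is not shown in the description.

```python
import re
from dataclasses import dataclass
from typing import List

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:04,500".
TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


@dataclass
class Cue:
    """One subtitle entry (hypothetical structure for this sketch)."""
    start_ms: int
    end_ms: int
    text: str


def _to_ms(h: str, m: str, s: str, ms: str) -> int:
    """Convert hours, minutes, seconds, milliseconds to total milliseconds."""
    return ((int(h) * 60 + int(m)) * 60 + int(s)) * 1000 + int(ms)


def parse_srt(content: str) -> List[Cue]:
    """Parse SRT text into cues; blocks without a timing line are skipped."""
    cues = []
    normalized = content.replace("\r\n", "\n").strip()
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", normalized):
        lines = block.split("\n")
        for i, line in enumerate(lines):
            match = TIMING.search(line)
            if match:
                g = match.groups()
                # Everything after the timing line is the subtitle text.
                text = " ".join(ln.strip() for ln in lines[i + 1:] if ln.strip())
                cues.append(Cue(_to_ms(*g[:4]), _to_ms(*g[4:]), text))
                break
    return cues


SAMPLE = """1
00:00:01,000 --> 00:00:04,500
Rain hammered the tin roof.

2
00:00:05,250 --> 00:00:07,000
A door creaked
somewhere below.
"""

for cue in parse_srt(SAMPLE):
    print(cue.start_ms, cue.end_ms, cue.text)
```

Storing times as integer milliseconds keeps the later timeline step simple: each generated sound effect could be placed at its cue's `start_ms` offset on the output track.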
AI Audio Framework 🎵
This project demonstrates real-time audio processing using Python. It captures audio from a microphone, converts the speech to text, and then synthesizes the text back to speech using a different voice. This can be useful for applications such as voice changers, real-time translation, and more.
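The capture, transcribe, and resynthesize loop described above can be sketched as a small generator. This is an illustration under stated assumptions, not the project's implementation: the description does not name the microphone, speech-to-text, or text-to-speech libraries, so `voice_change_stream` and the two stand-in backends below are hypothetical, and the real backends would be injected as callables.

```python
from typing import Callable, Iterable, Iterator


def voice_change_stream(
    chunks: Iterable[bytes],
    transcribe: Callable[[bytes], str],
    synthesize: Callable[[str], bytes],
) -> Iterator[bytes]:
    """Yield re-synthesized audio for each input chunk that contains speech.

    chunks:     audio buffers, e.g. read from a microphone (assumed source)
    transcribe: speech-to-text backend (hypothetical, supplied by the caller)
    synthesize: text-to-speech backend using the target voice (hypothetical)
    """
    for chunk in chunks:
        text = transcribe(chunk).strip()
        if not text:
            # Silence or unrecognized audio: nothing to speak back.
            continue
        yield synthesize(text)


# Stand-in backends so the sketch runs without audio hardware or API keys.
def fake_transcribe(chunk: bytes) -> str:
    return chunk.decode("utf-8")


def fake_synthesize(text: str) -> bytes:
    return text.upper().encode("utf-8")


if __name__ == "__main__":
    mic = [b"hello", b"  ", b"world"]
    for audio in voice_change_stream(mic, fake_transcribe, fake_synthesize):
        print(audio)
```

Passing the backends in as functions keeps the loop itself independent of any particular recognition or synthesis service, which also makes it possible to insert a translation step between the two for the real-time translation use case.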