16000Hz? #176

Open
tim-gromeyer opened this issue Aug 13, 2023 · 7 comments
Comments

@tim-gromeyer

Hello,
is it possible to use 16000Hz? In the README it says only 48000Hz, but I can't use 48000Hz. I am trying to add noise cancellation to my voice assistant, but Vosk (the library used for speech recognition) does not work with 48000Hz, unfortunately.

@zamadatix

The RNN is trained on being fed 48 kHz signals. One option for your voice assistant would be to capture audio at 48000 Hz, feed it through the noise cancellation algorithm, then downsample the resulting audio to 16 kHz to be processed by Vosk.
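
As a rough illustration of the last step in that pipeline, the sketch below downsamples already-denoised 48 kHz audio to 16 kHz in plain Python. The function names, the 63-tap filter length, and the 7 kHz cutoff are assumptions chosen for this example, not anything taken from the project, and the denoising step itself is not shown. A real application would more likely rely on an established resampling library.

```python
import math


def lowpass_taps(num_taps, cutoff):
    """Windowed-sinc low-pass FIR; cutoff is a fraction of the sample rate."""
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        # Ideal low-pass impulse response (sinc), with the x == 0 limit handled.
        h = 2 * cutoff if x == 0 else math.sin(2 * math.pi * cutoff * x) / (math.pi * x)
        # Hamming window to reduce the ripple caused by truncating the sinc.
        w = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(h * w)
    total = sum(taps)
    return [t / total for t in taps]  # normalise to unity gain at DC


def downsample_48k_to_16k(samples, num_taps=63):
    """Low-pass the 48 kHz signal, then keep every third sample."""
    # Assumed cutoff: 7 kHz, just under the 8 kHz Nyquist limit of 16 kHz audio.
    taps = lowpass_taps(num_taps, cutoff=7000 / 48000)
    half = num_taps // 2
    out = []
    for i in range(0, len(samples), 3):
        acc = 0.0
        for k, t in enumerate(taps):
            j = i + k - half  # filter is centred on sample i
            if 0 <= j < len(samples):
                acc += t * samples[j]
        out.append(acc)
    return out
```

The low-pass filter matters here: dropping two of every three samples without it would fold anything above 8 kHz back into the 16 kHz signal as aliasing.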

@xJanise

xJanise commented Aug 30, 2023

The microphone I am personally using doesn't support anything above 16 kHz. Is there any workaround, or will I have to use it at 16 kHz and hope for the best?

@zamadatix

zamadatix commented Aug 30, 2023

It doesn't matter so much that the original data is 16 kHz; what matters is that the audio fed into the noise suppression algorithm is encoded at a 48 kHz sampling rate. From that perspective you can still capture from the mic and feed Vosk at 16 kHz. You just need to do sample rate conversion (i.e. convert from 16 kHz to 48 kHz or vice versa) for audio coming into or going out of the noise suppression step.
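
For the 16 kHz to 48 kHz direction, one possible approach is sketched below in plain Python: insert two zeros between every pair of input samples, then low-pass filter to remove the spectral images that the zeros introduce. The function names, the 63-tap length, and the 7 kHz cutoff are assumptions made for this example only. Upsampling does not restore any content above 8 kHz; it only puts the audio in the format the noise suppression step expects.

```python
import math


def lowpass_taps(num_taps, cutoff):
    """Windowed-sinc low-pass FIR; cutoff is a fraction of the sample rate."""
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        # Ideal low-pass impulse response (sinc), with the x == 0 limit handled.
        h = 2 * cutoff if x == 0 else math.sin(2 * math.pi * cutoff * x) / (math.pi * x)
        # Hamming window to reduce the ripple caused by truncating the sinc.
        w = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(h * w)
    total = sum(taps)
    return [t / total for t in taps]  # normalise to unity gain at DC


def upsample_16k_to_48k(samples, num_taps=63):
    """Zero-stuff by a factor of 3, then low-pass to remove the images."""
    # Gain of 3 compensates for the energy lost to the inserted zeros.
    taps = [3 * t for t in lowpass_taps(num_taps, cutoff=7000 / 48000)]
    half = num_taps // 2
    stuffed_len = 3 * len(samples)
    out = []
    for m in range(stuffed_len):
        acc = 0.0
        for k, t in enumerate(taps):
            j = m + k - half  # index into the virtual zero-stuffed signal
            # Only every third position holds a real sample; the rest are zeros.
            if 0 <= j < stuffed_len and j % 3 == 0:
                acc += t * samples[j // 3]
        out.append(acc)
    return out
```

Going back from 48 kHz to 16 kHz after the noise suppression step is the mirror image: low-pass first, then keep every third sample.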

@LigangSun

I thought I just needed to fill the 480-sample buffer with audio at any sample rate, but it seems that doesn't work.

@zamadatix

Of course not, if it didn't matter then the program wouldn't specify to put 48000 Hz audio into the buffer. You need to resample to 48000 Hz and then put that resampled data in the buffer.
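
To put numbers on that: 480 samples at 48000 Hz cover 10 ms, whereas 480 samples at 16000 Hz cover 30 ms, so raw 16 kHz audio placed in the buffer is interpreted as playing three times too fast, with every frequency tripled. A minimal sketch of the framing step follows, assuming the audio has already been resampled to 48 kHz. Here `denoise_frame` is a hypothetical callable standing in for whatever the noise suppressor actually exposes, and `process_in_frames` is a name made up for this example.

```python
FRAME_SIZE = 480  # 10 ms of audio at 48 kHz (480 / 48000 s)


def process_in_frames(samples_48k, denoise_frame):
    """Feed 48 kHz audio to denoise_frame() in 480-sample frames.

    denoise_frame is a hypothetical callable that takes a list of exactly
    480 samples and returns a list of 480 processed samples.
    """
    out = []
    for start in range(0, len(samples_48k), FRAME_SIZE):
        frame = list(samples_48k[start:start + FRAME_SIZE])
        valid = len(frame)
        # Zero-pad the final partial frame so every call sees 480 samples.
        frame += [0.0] * (FRAME_SIZE - valid)
        processed = denoise_frame(frame)
        out.extend(processed[:valid])  # drop the padding again
    return out
```

With an identity function passed as `denoise_frame`, the output equals the input, which makes for a quick check that the framing itself is not altering the audio.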

@LigangSun

Hello

Thanks for your reply.

I just tried resampling the 16000 Hz audio to 48000 Hz, denoising it, and then resampling it back to 16000 Hz, but I got very bad audio quality.

Any suggestions on this issue?

Best wishes

@zamadatix

Do you have a sample file?
