Concurrent/Parallel Requests #427

omerzulfiqar · 2024-03-05T03:38:27Z

Hi!

I'm trying to create a project that uses Silero for an AI phone assistant. I run the VAD server on a t4g.medium EC2 instance. Does Silero support concurrent/multiple requests out of the box? I've observed degraded, slow performance with more than 2 concurrent calls. I saw your parallelization example. My implementation is slightly different: I set up VAD behind a Flask server and use gunicorn to spin up multiple service workers, but that didn't really help. Is there a way you'd suggest handling this use case? Thanks!

Answered by snakers4

snakers4 · 2024-03-05T04:05:37Z

Maintainer

Does Silero support concurrent/multiple requests out of the box?

Since the VAD is not stateless, each audio stream (thread / process) should have its own separate instance of VAD that has its states reset each time the stream ends.

With a web server, this can be done by re-creating a VAD instance each time if time allows, or by using a publisher-consumer architecture with one or more workers, each consuming a separate audio stream.

Batching is possible, but I would not recommend doing that in production: it is somewhat complicated and error-prone.
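
The worker-per-stream idea above could look roughly like the following. This is a minimal sketch, not Silero's own code: `StandInVAD` is a hypothetical toy detector that only mimics the fact that a VAD carries state between chunks, and `stream_worker` / `run_streams` are illustrative names. In a real service the factory would build an actual model instance instead.

```python
import queue
import threading


class StandInVAD:
    """Hypothetical stateful detector standing in for a real VAD model."""

    def __init__(self):
        self.smoothed = 0.0  # state carried from one chunk to the next

    def __call__(self, chunk):
        # Toy rule: smooth the mean absolute amplitude across chunks.
        energy = sum(abs(s) for s in chunk) / max(len(chunk), 1)
        self.smoothed = 0.5 * self.smoothed + 0.5 * energy
        return self.smoothed > 0.1

    def reset_states(self):
        self.smoothed = 0.0


def stream_worker(chunks, results, make_vad=StandInVAD):
    """Consume one audio stream with a VAD instance owned by this worker."""
    vad = make_vad()  # one instance per stream, never shared
    while True:
        chunk = chunks.get()
        if chunk is None:  # sentinel: the stream has ended
            vad.reset_states()
            break
        results.append(vad(chunk))


def run_streams(streams):
    """Start one worker thread per stream and collect per-stream decisions."""
    outputs, threads = {}, []
    for stream_id, chunk_list in streams.items():
        q = queue.Queue()
        outputs[stream_id] = []
        t = threading.Thread(target=stream_worker, args=(q, outputs[stream_id]))
        t.start()
        threads.append(t)
        for chunk in chunk_list:  # publisher side: feed chunks in order
            q.put(chunk)
        q.put(None)
    for t in threads:
        t.join()
    return outputs
```

Because each worker owns its detector, the smoothing state of one stream never leaks into another. Sharing a single instance across streams would mix their states, which is the situation the answer warns about.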

2 replies

omerzulfiqar Mar 5, 2024
Author

Gotcha. When you say "has its states reset each time the stream ends", what do you mean by that? Also, we're not sending an audio stream; we're sending audio chunks instead. Would an audio stream sent over a websocket connection be a better fit? @snakers4

snakers4 Mar 5, 2024
Maintainer

The PyTorch model has a "reset_states()" method. The ONNX wrapper has more explicit code that does the same.
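
For a server that receives chunks rather than a continuous stream, one way to apply this is to key a VAD instance by call and reset it when the call ends. The sketch below is only an illustration under assumptions: `StandInVAD` is a hypothetical placeholder (the thread confirms only that the real PyTorch model has `reset_states()`), and `CallSessions`, `process`, and `end` are made-up names. It also assumes the chunks of one call arrive in order and reach the same process.

```python
import threading


class StandInVAD:
    """Hypothetical placeholder for a stateful VAD model."""

    def __init__(self):
        self.chunks_seen = 0  # stands in for the model's internal state

    def __call__(self, chunk):
        self.chunks_seen += 1
        return self.chunks_seen

    def reset_states(self):
        self.chunks_seen = 0


class CallSessions:
    """Keep one VAD instance per call so state persists across its chunks."""

    def __init__(self, make_vad=StandInVAD):
        self._make_vad = make_vad
        self._sessions = {}
        self._lock = threading.Lock()  # guards the session table only

    def process(self, call_id, chunk):
        with self._lock:
            vad = self._sessions.get(call_id)
            if vad is None:
                vad = self._sessions[call_id] = self._make_vad()
        # Assumes chunks of a single call are processed one at a time.
        return vad(chunk)

    def end(self, call_id):
        with self._lock:
            vad = self._sessions.pop(call_id, None)
        if vad is not None:
            vad.reset_states()  # clear state so nothing carries over
```

A handler would call `process(call_id, chunk)` for every incoming chunk and `end(call_id)` when the call hangs up. Note that with several gunicorn worker processes, this in-memory table is not shared between them, so chunks of one call would need to be routed to the same worker for the state to be meaningful.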
