Concurrent/Parallel Requests #427
-
Hi! I'm trying to create a project that uses Silero for an AI phone assistant. I'm running the VAD server on a t4g.medium EC2 instance. Does Silero support concurrent/multiple requests out of the box? I've observed degraded, slow performance with more than 2 concurrent calls. I saw your parallelization example. My implementation is slightly different: I set up VAD behind a Flask server and used gunicorn to spin up multiple service workers, but that didn't really help. Is there an approach you would suggest for this use case? Thanks!
-
Since the VAD is not stateless, each audio stream (thread / process) should have its own separate VAD instance, and that instance's state should be reset each time the stream ends.
With a web server, this can be done by re-creating a VAD instance each time, if time allows, or by using a publisher-consumer architecture with one or several workers, each consuming a separate audio stream.
Batching is possible, but I would not recommend it in production: it is somewhat complicated and error-prone.
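For illustration, here is a minimal sketch of the "one VAD instance per stream" idea combined with a simple publisher-consumer layout. It is not the Silero API: `DummyVAD`, `stream_worker`, and `run_streams` are hypothetical names, and `DummyVAD` is only a stand-in for a stateful model so the example runs without any model download. With the real model, the factory would be whatever call loads the VAD in your setup.

```python
import queue
import threading


class DummyVAD:
    """Hypothetical stand-in for a stateful VAD model (NOT the Silero API).

    It carries state across chunks, as a recurrent model would, so feeding
    chunks from two different streams into one instance would mix them up.
    """

    def __init__(self):
        self.chunks_seen = 0  # stands in for the model's hidden state

    def __call__(self, chunk):
        self.chunks_seen += 1
        # Toy "speech probability": mean absolute amplitude, capped at 1.0
        return min(1.0, sum(abs(x) for x in chunk) / max(len(chunk), 1))

    def reset_states(self):
        self.chunks_seen = 0


def stream_worker(vad_factory, chunks, results):
    """Consume one audio stream with a VAD instance owned by this worker."""
    vad = vad_factory()  # one instance per stream, never shared
    try:
        while True:
            chunk = chunks.get()
            if chunk is None:  # sentinel: the stream has ended
                break
            results.append(vad(chunk))
    finally:
        vad.reset_states()  # clear state when the stream ends


def run_streams(vad_factory, streams):
    """Start one worker thread per stream and collect per-stream outputs."""
    results = {name: [] for name in streams}
    workers = []
    for name, chunks in streams.items():
        q = queue.Queue()
        t = threading.Thread(
            target=stream_worker, args=(vad_factory, q, results[name])
        )
        t.start()
        workers.append(t)
        # Publisher side: push the chunks, then the end-of-stream sentinel
        for chunk in chunks:
            q.put(chunk)
        q.put(None)
    for t in workers:
        t.join()
    return results


if __name__ == "__main__":
    calls = {"call-1": [[0.0, 0.0], [0.5, -0.5]], "call-2": [[1.0, 1.0]]}
    print(run_streams(DummyVAD, calls))
```

Each worker builds its own instance through the factory, so no state is shared between calls. In a real service the queues would be fed by the web server as audio arrives, and the number of simultaneous workers would likely need to be capped to what the machine's CPU can sustain.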