Describe the solution you'd like
Currently, while the model is being loaded into memory, user queries may still reach a pod that is not yet ready to answer (since the model is not loaded). The larger the model, the longer it takes to load, and the longer the resulting "downtime".
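One common way to get this behavior is a Kubernetes readiness probe: a pod whose readiness probe fails is left out of the Service endpoints, so it receives no traffic until the probe succeeds. The following is a minimal sketch of the server side only, not how KServe or any particular runtime implements it. The `/ready` path, port 8080, and the names `model_ready`, `load_model`, `readiness_response`, and `serve` are all hypothetical and chosen for illustration.

```python
import threading
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical flag: set only once the model has finished loading.
model_ready = threading.Event()


def load_model(load_seconds: float = 5.0) -> None:
    """Stand-in for a slow model load; a real loader would read weights here."""
    time.sleep(load_seconds)
    model_ready.set()


def readiness_response() -> tuple[int, bytes]:
    """Return (HTTP status, body) for the readiness endpoint."""
    if model_ready.is_set():
        return 200, b"ready"
    # 503 makes an HTTP readiness probe fail, so the pod gets no traffic yet.
    return 503, b"model loading"


class ReadinessHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/ready":  # assumed probe path
            status, body = readiness_response()
        else:
            status, body = 404, b"not found"
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8080) -> None:
    """Load the model in the background and serve the readiness endpoint."""
    threading.Thread(target=load_model, daemon=True).start()
    HTTPServer(("0.0.0.0", port), ReadinessHandler).serve_forever()
```

Calling `serve()` starts the endpoint. With a pod `readinessProbe` pointed at the assumed `/ready` path, the pod would only be added to the Service endpoints after `load_model` completes.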
Anything else you would like to add:
This could be useful for different tasks such as model/runtime canary rollouts, replica auto-scaling, etc.
bdattoma changed the title from "Check the model status before traffic being redirected to the pod" to "Check the model status before traffic gets redirected to the pods" on Aug 22, 2023
I think this ticket should be moved to the upstream kserve repo (https://github.com/kserve/kserve/issues).
We can keep this one as a "clone" to simplify tracking, but the solution should be implemented upstream.
/kind feature