Check the model status before traffic gets redirected to the pods #66

bdattoma · 2023-08-22T16:01:15Z

/kind feature

Describe the solution you'd like
Currently, while the model is being loaded into memory, user queries may still reach a pod that is not yet ready to answer (since the model is not loaded). The larger the model, the longer it takes to load, and the longer the "downtime".

Anything else you would like to add:
This could be useful for different tasks such as model/runtime canary rollout, replica auto-scaling, etc.

danielezonca · 2023-08-30T09:09:49Z

I think this ticket should be moved to the upstream kserve repo ( https://github.com/kserve/kserve/issues ).
We can keep this one as a "clone" to simplify tracking, but the solution should be implemented upstream.

bdattoma · 2023-09-07T13:17:06Z

upstream issue: kserve#3113

bdattoma · 2023-09-18T07:50:31Z

Such a feature is already implemented upstream: https://kserve.github.io/website/0.10/modelserving/data_plane/v2_protocol/#healthreadinessliveness-probes
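
For anyone who wants to check this by hand, here is a minimal sketch of polling the per-model readiness endpoint described on the linked V2 protocol page (`GET /v2/models/<model_name>/ready`). The base URL, the model name, and the helper names (`model_ready_url`, `is_model_ready`, `wait_until_ready`) are assumptions for illustration and not part of kserve. The sketch also assumes the endpoint answers HTTP 200 once the model is loaded.

```python
import time
import urllib.request


def model_ready_url(base_url: str, model_name: str) -> str:
    """Build the V2 per-model readiness URL (hypothetical helper)."""
    return f"{base_url.rstrip('/')}/v2/models/{model_name}/ready"


def is_model_ready(base_url: str, model_name: str, timeout: float = 2.0) -> bool:
    """Return True only if the readiness endpoint answers HTTP 200."""
    try:
        url = model_ready_url(base_url, model_name)
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Covers URLError/HTTPError (non-2xx, refused connection) and timeouts.
        return False


def wait_until_ready(check, attempts: int = 30, delay: float = 1.0) -> bool:
    """Call check() up to `attempts` times, sleeping `delay` seconds between tries."""
    for _ in range(attempts):
        if check():
            return True
        time.sleep(delay)
    return False
```

Example usage, assuming a pod reachable on `localhost:8080` and a model named `my-model`: `wait_until_ready(lambda: is_model_ready("http://localhost:8080", "my-model"))`. The same endpoint is the kind of thing a readiness probe would hit so that traffic is only routed once the model is loaded.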

openshift-ci bot added the kind/feature New feature label Aug 22, 2023

bdattoma changed the title ~~Check the model status before traffic being redirected to the pod~~ Check the model status before traffic gets redirected to the pods Aug 22, 2023
