diff --git a/docs/modelserving/autoscaling/autoscaling.md b/docs/modelserving/autoscaling/autoscaling.md
index e8acea410..e44a0ba3f 100644
--- a/docs/modelserving/autoscaling/autoscaling.md
+++ b/docs/modelserving/autoscaling/autoscaling.md
@@ -505,7 +505,7 @@ KServe supports `RawDeployment` mode to enable `InferenceService` deployment wit
 
 When using Kserve with the `RawDeployment` mode, Knative is not installed. In this mode, if you deploy an `InferenceService`, Kserve uses **Kubernetes’ Horizontal Pod Autoscaler (HPA)** for autoscaling instead of **Knative Pod Autoscaler (KPA)**. For more information about Kserve's autoscaler, you can refer [`this`](https://kserve.github.io/website/master/modelserving/v1beta1/torchserve/#knative-autoscaler)
 
-=== "Old Schema"
+=== "New Schema"
 
     ```yaml
     apiVersion: "serving.kserve.io/v1beta1"
@@ -519,11 +519,13 @@ When using Kserve with the `RawDeployment` mode, Knative is not installed. In th
         serving.kserve.io/targetUtilizationPercentage: "80"
     spec:
       predictor:
-        sklearn:
+        model:
+          modelFormat:
+            name: sklearn
           storageUri: "gs://kfserving-examples/models/sklearn/1.0/model"
     ```
 
-=== "New Schema"
+=== "Old Schema"
 
     ```yaml
     apiVersion: "serving.kserve.io/v1beta1"
@@ -537,9 +539,7 @@ When using Kserve with the `RawDeployment` mode, Knative is not installed. In th
         serving.kserve.io/targetUtilizationPercentage: "80"
     spec:
       predictor:
-        model:
-          modelFormat:
-            name: sklearn
+        sklearn:
           storageUri: "gs://kfserving-examples/models/sklearn/1.0/model"
     ```
 
@@ -547,7 +547,7 @@ When using Kserve with the `RawDeployment` mode, Knative is not installed. In th
 
 If you want to control the scaling of the deployment created by KServe inference service with an external tool like [`KEDA`](https://keda.sh/). You can disable KServe's creation of the **HPA** by replacing **external** value with autoscaler class annotaion that should be disable the creation of HPA
 
-=== "Old Schema"
+=== "New Schema"
 
     ```yaml
     apiVersion: "serving.kserve.io/v1beta1"
@@ -559,11 +559,13 @@ If you want to control the scaling of the deployment created by KServe inference
       name: "sklearn-iris"
    spec:
      predictor:
-        sklearn:
+        model:
+          modelFormat:
+            name: sklearn
           storageUri: "gs://kfserving-examples/models/sklearn/1.0/model"
     ```
 
-=== "New Schema"
+=== "Old Schema"
 
     ```yaml
     apiVersion: "serving.kserve.io/v1beta1"
@@ -575,8 +577,6 @@ If you want to control the scaling of the deployment created by KServe inference
       name: "sklearn-iris"
    spec:
      predictor:
-        model:
-          modelFormat:
-            name: sklearn
+        sklearn:
          storageUri: "gs://kfserving-examples/models/sklearn/1.0/model"
     ```
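
For context on the `external` autoscaler class used in the last two hunks: once KServe skips creating an HPA, an external controller such as KEDA has to scale the predictor deployment itself. Below is a minimal sketch of what that could look like with a KEDA `ScaledObject`; the target deployment name `sklearn-iris-predictor` is an assumption (check the deployment KServe actually creates for the `InferenceService`), and the CPU trigger values simply mirror the 80% utilization shown in the HPA examples in the diff.

```yaml
# Sketch only: a KEDA ScaledObject that takes over scaling once KServe's HPA is disabled.
# The deployment name below is an assumption; look up the deployment that KServe
# actually creates for the "sklearn-iris" InferenceService before applying this.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: sklearn-iris-predictor-scaler
spec:
  scaleTargetRef:
    name: sklearn-iris-predictor   # hypothetical deployment created for the InferenceService
  minReplicaCount: 1
  maxReplicaCount: 5
  triggers:
    - type: cpu
      metricType: Utilization      # scale on average CPU utilization, like the HPA examples
      metadata:
        value: "80"
```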