diff --git a/docs/blog/articles/2023-02-05-KServe-0.10-release.md b/docs/blog/articles/2023-02-05-KServe-0.10-release.md
index bdc81b1d8..977315096 100644
--- a/docs/blog/articles/2023-02-05-KServe-0.10-release.md
+++ b/docs/blog/articles/2023-02-05-KServe-0.10-release.md
@@ -1,17 +1,17 @@
 # Announcing: KServe v0.10.0
 
-We are excited to announce KServe 0.10 release, in this release we have been focusing on enabling more KServe networking setup options,
-improving metrics for supported serving runtimes and increasing support coverage for [Open(aka v2) inference protocol](https://kserve.github.io/website/0.10/modelserving/data_plane/v2_protocol/) for both KServe and ModelMesh.
+We are excited to announce the KServe 0.10 release. In this release we have enabled more KServe networking options,
+improved metrics instrumentation for supported serving runtimes and increased support coverage for the [Open (aka v2) inference protocol](https://kserve.github.io/website/0.10/modelserving/data_plane/v2_protocol/) for both standard and ModelMesh InferenceServices.
 
 ## KServe Networking Options
 
-Istio is now optional for both `Serverless` and `RawDeployment` mode, please see the [alternative networking guide](https://kserve.github.io/website/0.10/admin/serverless/kourier_networking/) for how you can enable other ingress options supported by Knative with Serverless mode.
-For Istio users, if you want to turn on full service mesh mode to secure inference services with mutual TLS and enable the traffic policies, please read the [service mesh setup guideline](https://kserve.github.io/website/0.10/admin/serverless/servicemesh/).
+Istio is now optional for both `Serverless` and `RawDeployment` modes. Please see the [alternative networking guide](https://kserve.github.io/website/0.10/admin/serverless/kourier_networking/) for how to enable the other ingress options that Knative supports in Serverless mode.
+For Istio users who want to turn on full service mesh mode to secure InferenceServices with mutual TLS and enable traffic policies, please read the [service mesh setup guide](https://kserve.github.io/website/0.10/admin/serverless/servicemesh/).
 
 ## KServe Telemetry for Serving Runtimes
 
-We have instrumented additional latency metrics in KServe python serving runtimes for `preprocess`, `predict` and `postprocess` handlers.
-In Serverless mode we have extended Knative queue-proxy to enable metrics aggregation for both metrics exposed in `queue-proxy` and `kserve-container`.
+We have instrumented additional latency metrics in the KServe Python ServingRuntimes for the `preprocess`, `predict` and `postprocess` handlers.
+In Serverless mode we have extended the Knative `queue-proxy` to aggregate the metrics exposed by both `queue-proxy` and the `kserve-container` of each `ServingRuntime`.
 Please read the [prometheus metrics setup guideline](https://kserve.github.io/website/0.10/modelserving/observability/prometheus_metrics/) for how to enable the metrics scraping and aggregations.
 
 ## Open(v2) Inference Protocol Support Coverage
@@ -57,8 +57,26 @@ class CustomTransformer(Model):
         return infer_request
 ```
 
-You can use the same Python API type `InferRequest` and `InferResponse` for both REST and gRPC protocol, KServe handles the underlying decoding and encoding according to the protocol.
-New `headers` argument is added to the custom handlers to pass http/gRPC headers or metadata, you can also use this as context dict to pass data between handlers.
+You can use the same Python API types `InferRequest` and `InferResponse` for both the REST and gRPC protocols; KServe handles the underlying decoding and encoding according to the protocol.
+
+!!! Warning
+    A new `headers` argument has been added to the custom handlers to pass HTTP/gRPC headers or other metadata. You can also use it as a context dict to pass data between handlers.
+    If you have an existing custom transformer or predictor, you are now required to add the `headers` argument to the `preprocess`, `predict` and `postprocess` handlers.
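+
+A minimal transformer sketch with the new signature might look like the following (the import paths and the `headers` default here are assumptions based on the 0.10 Python SDK; the pass-through body is illustrative):
+
+```python
+from typing import Dict
+
+from kserve import Model, InferRequest
+
+
+class CustomTransformer(Model):
+    def preprocess(self, payload: InferRequest, headers: Dict[str, str] = None) -> InferRequest:
+        # headers carries the HTTP/gRPC headers or other metadata; it can also
+        # act as a context dict to pass data on to predict/postprocess.
+        return payload
+```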
 
 Please check the following matrix for supported ServingRuntimes and ModelFormats.
 
@@ -92,7 +110,23 @@ for multiple architectures: `ppc64le`, `arm64`, `amd64`, `s390x`.
 ## ModelMesh updates
 
 ModelMesh has continued to integrate itself as KServe's multi-model serving backend, introducing improvements and features that better align the two projects. For example, it now supports ClusterServingRuntimes, allowing use of cluster-scoped ServingRuntimes, originally introduced in KServe 0.8.
-Additionally, ModelMesh has introduced support for TorchServe enabling users to serve arbitrary PyTorch models (e.g. eager-mode) in the context of distributed-multi-model serving.
+Additionally, ModelMesh introduced support for TorchServe, enabling users to serve arbitrary PyTorch models (e.g. eager mode) in the context of distributed multi-model serving.
 
 Other limitations have been addressed as well, such as adding support for BYTES/string type tensors when using the REST inference API for inference requests that require them.
 
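+
+For example, a v2 REST request carries string tensors by marking the input with the `BYTES` datatype. A minimal sketch (the endpoint, model name and tensor name here are placeholders):
+
+```python
+import requests
+
+# Open(v2) inference protocol payload with a BYTES (string) input tensor.
+payload = {
+    "inputs": [
+        {"name": "input-0", "shape": [2], "datatype": "BYTES", "data": ["hello", "world"]}
+    ]
+}
+
+resp = requests.post("http://localhost:8080/v2/models/example-model/infer", json=payload)
+print(resp.json())
+```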