From ffedabb41f37fba6b896353d300854bdcd234ae0 Mon Sep 17 00:00:00 2001
From: Dan Sun
Date: Fri, 17 Feb 2023 03:44:29 -0500
Subject: [PATCH] Address comments

Signed-off-by: Dan Sun
---
 .../articles/2023-02-05-KServe-0.10-release.md | 17 ++++++++---------
 1 file changed, 8 insertions(+), 9 deletions(-)

diff --git a/docs/blog/articles/2023-02-05-KServe-0.10-release.md b/docs/blog/articles/2023-02-05-KServe-0.10-release.md
index 4bbdc996e..594fe7eb5 100644
--- a/docs/blog/articles/2023-02-05-KServe-0.10-release.md
+++ b/docs/blog/articles/2023-02-05-KServe-0.10-release.md
@@ -10,14 +10,14 @@ For Istio users, if you want to turn on full service mesh mode to secure inferen
 
 ## KServe Telemetry for Serving Runtimes
 
-Prometheus metrics are now available for supported serving runtimes including custom python runtimes,
-in Serverless mode we have extended Knative queue-proxy to enable metrics aggregation for both metrics exposed in `queue-proxy` and `kserve-container`.
+Prometheus metrics are now available for supported serving runtimes, including custom Python runtimes.
+In Serverless mode, we have extended the Knative queue-proxy to aggregate the metrics exposed by both `queue-proxy` and `kserve-container`.
 Please read the [prometheus metrics setup guideline](https://kserve.github.io/website/0.10/modelserving/observability/prometheus_metrics/) for how to enable the metrics scraping and aggregations.
 
 ## Open(v2) Inference Protocol Support Coverage
 
-In KServe 0.10, we have added support for Open(v2) inference protocol for KServe custom runtimes,
-now you can enable v2 REST/gRPC for both custom transformer and predictor with images built by implementing KServe Python SDK API.
+In KServe 0.10, we have added support for the Open(v2) inference protocol for KServe custom runtimes.
+Now you can enable v2 REST/gRPC for both custom transformers and predictors with images built using the KServe Python SDK API.
 gRPC enables a high-performance inference data plane as it is built on top of HTTP/2 and binary data transportation, which is more efficient to send over the wire compared to REST.
 
 Please see the detailed example for [transformer](https://kserve.github.io/website/0.10/modelserving/v1beta1/transformer/torchserve_image_transformer/) and [predictor](https://kserve.github.io/website/0.10/modelserving/v1beta1/custom/custom_model/).
@@ -90,12 +90,11 @@ for multiple architectures: `ppc64le`, `arm64`, `amd64`, `s390x`.
 - Support Azure Blobs with [managed identity](https://docs.microsoft.com/en-us/azure/active-directory/managed-identities-azure-resources/how-manage-user-assigned-managed-identities?pivots=identity-mi-methods-azcli).
 
 ## ModelMesh updates
+ModelMesh has continued to integrate itself as KServe's multi-model serving backend, introducing improvements and features that better align the two projects. For example, it now supports ClusterServingRuntimes, allowing the use of cluster-scoped ServingRuntimes, originally introduced in KServe 0.8.
-- Support for TorchServe ServingRuntime.
-- Support v2 inference protocol for OpenVINO ServingRuntime.
-- `ClusterServingRuntime` support for ModelMesh which now works in the same way as KServe.
-- Support passing labels and annotations from `ServingRuntimePodSpec` to ModelMesh Pods.
-- Support for `ImagePullSecrets` on ServingRuntime spec.
+Additionally, ModelMesh has introduced support for TorchServe, enabling users to serve arbitrary PyTorch models (e.g. eager-mode models) in the context of distributed multi-model serving.
+
+Other limitations have been addressed as well, such as support for BYTES/string type tensors in REST inference requests that require them.
 
 ## Other Changes:
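To make the patch's wording concrete, the v2 REST support for custom runtimes and ModelMesh's new BYTES/string tensor handling both revolve around the Open (v2) inference protocol's request shape. Below is a minimal sketch of such a request body in plain Python; the model and tensor names are hypothetical placeholders, and only the standard protocol fields (`inputs`, `name`, `shape`, `datatype`, `data`) are used.

```python
import json

# Sketch of an Open (v2) inference protocol REST request body.
# Tensor names ("input-0", "input-text") are hypothetical examples.
request = {
    "inputs": [
        {
            # A numeric tensor: two rows of four FP32 features.
            "name": "input-0",
            "shape": [2, 4],
            "datatype": "FP32",
            "data": [[0.1, 0.2, 0.3, 0.4], [0.5, 0.6, 0.7, 0.8]],
        },
        {
            # A BYTES/string tensor, the kind of input the ModelMesh REST
            # API now accepts; strings travel directly in the JSON payload.
            "name": "input-text",
            "shape": [2],
            "datatype": "BYTES",
            "data": ["hello", "world"],
        },
    ]
}

body = json.dumps(request)
# A v2 REST endpoint would receive this body via:
# POST /v2/models/<model-name>/infer
```

A custom predictor or transformer built with the KServe Python SDK would receive this payload in its `predict` handler, and the same structure (with `outputs` in place of `inputs`) comes back in the response.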