Address comments
Signed-off-by: Dan Sun <dsun20@bloomberg.net>
yuzisun committed Feb 17, 2023
1 parent c233487 commit ffedabb
Showing 1 changed file with 8 additions and 9 deletions: docs/blog/articles/2023-02-05-KServe-0.10-release.md

## KServe Telemetry for Serving Runtimes

Prometheus metrics are now available for supported serving runtimes, including custom Python runtimes.
In Serverless mode, we have extended the Knative queue-proxy to enable metrics aggregation for the metrics exposed in both `queue-proxy` and `kserve-container`.
Please read the [prometheus metrics setup guideline](https://kserve.github.io/website/0.10/modelserving/observability/prometheus_metrics/) to learn how to enable metrics scraping and aggregation.
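As a quick illustration of the setup the guideline describes, metric aggregation is toggled with annotations on the `InferenceService`. The resource name and `storageUri` below are placeholders; the annotation keys are the ones documented in the KServe metrics guide:

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sklearn-example            # illustrative name
  annotations:
    # Aggregate the metrics exposed by queue-proxy and kserve-container
    serving.kserve.io/enable-metric-aggregation: "true"
    # Expose the aggregated metrics for Prometheus scraping
    serving.kserve.io/enable-prometheus-scraping: "true"
spec:
  predictor:
    model:
      modelFormat:
        name: sklearn
      storageUri: gs://example-bucket/model  # placeholder
```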

## Open(v2) Inference Protocol Support Coverage

In KServe 0.10, we have added support for the Open(v2) inference protocol to KServe custom runtimes.
You can now enable v2 REST/gRPC for both custom transformers and predictors with images built using the KServe Python SDK API.
gRPC enables a high-performance inference data plane: it is built on top of HTTP/2 and uses binary data transport, which is more efficient over the wire than REST.
Please see the detailed examples for the [transformer](https://kserve.github.io/website/0.10/modelserving/v1beta1/transformer/torchserve_image_transformer/) and
[predictor](https://kserve.github.io/website/0.10/modelserving/v1beta1/custom/custom_model/).
- Support Azure Blobs with [managed identity](https://docs.microsoft.com/en-us/azure/active-directory/managed-identities-azure-resources/how-manage-user-assigned-managed-identities?pivots=identity-mi-methods-azcli).

## ModelMesh updates
ModelMesh has continued to integrate itself as KServe's multi-model serving backend, introducing improvements and features that better align the two projects. For example, it now supports ClusterServingRuntimes, allowing the use of cluster-scoped ServingRuntimes, which were originally introduced in KServe 0.8.

Additionally, ModelMesh has introduced support for TorchServe, enabling users to serve arbitrary PyTorch models (e.g. eager-mode models) in the context of distributed multi-model serving. ModelMesh also now supports the v2 inference protocol with the OpenVINO ServingRuntime, passes labels and annotations from `ServingRuntimePodSpec` through to ModelMesh Pods, and supports `ImagePullSecrets` on the ServingRuntime spec.

Other limitations have been addressed as well, such as adding support for BYTES/string type tensors when using the REST inference API for inference requests that require them.
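For illustration, a v2 REST request carrying a BYTES (string) tensor can be shaped as below; the model and input names are placeholders, and the body is POSTed to the model's `infer` endpoint:

```python
import json

# Illustrative Open(v2) inference protocol request with a BYTES tensor;
# "text-model" and "text" are placeholder names.
request = {
    "inputs": [
        {
            "name": "text",
            "shape": [2],
            "datatype": "BYTES",
            "data": ["hello", "world"],
        }
    ]
}

# JSON body to POST to /v2/models/text-model/infer
body = json.dumps(request)
```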


## Other Changes:
