
## KServe Networking Options

Istio is now optional for both `Serverless` and `RawDeployment` modes. Please see the [alternative networking guide](https://kserve.github.io/website/0.10/admin/serverless/kourier_networking/) for how to enable the other ingress options supported by Knative with Serverless mode.
For Istio users who want to turn on full service mesh mode to secure inference services with mutual TLS and enable traffic policies, please read the [service mesh setup guideline](https://kserve.github.io/website/0.10/admin/serverless/servicemesh/).
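
For example, switching Knative from Istio to the Kourier networking layer comes down to pointing Knative's `config-network` ConfigMap at Kourier. A minimal sketch, assuming the Kourier networking layer is already installed as described in the guide above:

```yaml
# Sketch: select Kourier as the Knative ingress instead of Istio.
# Assumes Kourier is already installed in your cluster.
apiVersion: v1
kind: ConfigMap
metadata:
  name: config-network
  namespace: knative-serving
data:
  ingress-class: "kourier.ingress.networking.knative.dev"
```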

## KServe Telemetry for Serving Runtimes

Prometheus metrics are now available for supported serving runtimes, including custom Python runtimes.
We have instrumented additional latency metrics in the KServe Python serving runtimes for the `preprocess`, `predict` and `postprocess` handlers.
In Serverless mode we have extended the Knative queue-proxy to aggregate the metrics exposed by both `queue-proxy` and `kserve-container`.
Please read the [prometheus metrics setup guideline](https://kserve.github.io/website/0.10/modelserving/observability/prometheus_metrics/) for how to enable metrics scraping and aggregation.
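
As a sketch of what this looks like on an `InferenceService` (annotation names follow the setup guideline above; the scikit-learn model is just an illustrative example):

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sklearn-iris  # illustrative name
  annotations:
    # Aggregate queue-proxy and kserve-container metrics and expose
    # them for Prometheus scraping (see the guideline above).
    serving.kserve.io/enable-metric-aggregation: "true"
    serving.kserve.io/enable-prometheus-scraping: "true"
spec:
  predictor:
    model:
      modelFormat:
        name: sklearn
      storageUri: gs://kfserving-examples/models/sklearn/1.0/model
```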

## Open(v2) Inference Protocol Support Coverage

In KServe 0.10, we have added support for the Open(v2) inference protocol in KServe custom runtimes.
Now you can enable v2 REST/gRPC for both custom transformers and predictors with images built by implementing the KServe Python SDK API.
gRPC enables a high-performance inference data plane: it is built on top of HTTP/2 with binary data transport, which is more efficient to send over the wire than REST.
Please see the detailed examples for the [transformer](https://kserve.github.io/website/0.10/modelserving/v1beta1/transformer/torchserve_image_transformer/) and
[predictor](https://kserve.github.io/website/0.10/modelserving/v1beta1/custom/custom_model/).
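
For a flavor of the SDK API, here is a minimal sketch of a custom predictor (the model name and payload handling are illustrative):

```python
from kserve import Model, ModelServer


class MyModel(Model):
    """Illustrative custom predictor built with the KServe Python SDK."""

    def __init__(self, name: str):
        super().__init__(name)
        self.load()

    def load(self):
        # Load model artifacts here, then mark the model as ready.
        self.ready = True

    def predict(self, payload, headers=None):
        # Placeholder inference: echo the v1-style inputs back.
        return {"predictions": payload["instances"]}


if __name__ == "__main__":
    ModelServer().start([MyModel("my-model")])
```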
You can use the same Python API types `InferRequest` and `InferResponse` for both REST and gRPC protocols.
A new `headers` argument has been added to the custom handlers to pass HTTP/gRPC headers or metadata; you can also use it as a context dict to pass data between handlers.
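
A minimal sketch of a v2-typed `predict` handler (the tensor name, shape handling and `FP32` datatype are illustrative assumptions):

```python
from typing import Dict, Optional

from kserve import InferOutput, InferRequest, InferResponse, Model


class V2Model(Model):
    """Sketch of a custom model whose handlers use the v2 types."""

    def predict(self, payload: InferRequest,
                headers: Optional[Dict[str, str]] = None) -> InferResponse:
        # The same InferRequest/InferResponse types serve REST and gRPC;
        # `headers` carries the HTTP headers or gRPC metadata.
        data = payload.inputs[0].as_numpy()  # first input tensor
        result = data * 2                    # placeholder "inference"
        return InferResponse(
            response_id=payload.id,
            model_name=self.name,
            infer_outputs=[
                InferOutput(name="output-0", shape=list(result.shape),
                            datatype="FP32", data=result.flatten().tolist()),
            ],
        )
```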


Please check the following matrix for the supported ServingRuntimes and ModelFormats.

| Model Format | v1 | v2 REST/gRPC |
| ------------------- |--------------| ----------------|
| Scikit-learn | ✅ KServe | ✅ MLServer |
| XGBoost | ✅ KServe | ✅ MLServer |
| LightGBM | ✅ KServe | ✅ MLServer |
| MLFlow | | ✅ MLServer |
| Custom | ✅ KServe | ✅ KServe |
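
For the formats above, switching a predictor to the v2 protocol is a one-line change in the `InferenceService` spec. A sketch, with an illustrative XGBoost model:

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: xgboost-iris-v2  # illustrative name
spec:
  predictor:
    model:
      modelFormat:
        name: xgboost
      protocolVersion: v2  # per the matrix, served by MLServer
      storageUri: gs://kfserving-examples/models/xgboost/iris  # illustrative path
```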


- KServe images are now published for multiple architectures: `ppc64le`, `arm64`, `amd64`, `s390x`.
- Support Azure Blobs with [managed identity](https://docs.microsoft.com/en-us/azure/active-directory/managed-identities-azure-resources/how-manage-user-assigned-managed-identities?pivots=identity-mi-methods-azcli).

## ModelMesh Updates
ModelMesh has continued to integrate itself as KServe's multi-model serving backend, introducing improvements and features that better align the two projects. For example, it now supports ClusterServingRuntimes, allowing the use of cluster-scoped ServingRuntimes, originally introduced in KServe 0.8.

Additionally, ModelMesh has introduced support for TorchServe, enabling users to serve arbitrary PyTorch models (e.g. eager mode) in the context of distributed multi-model serving.

Other limitations have also been addressed, such as adding support for BYTES/string type tensors in the REST inference API for inference requests that require them.
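
As a sketch of what a BYTES/string tensor looks like in a v2 REST request (the endpoint, port and tensor name are illustrative):

```python
import requests

# v2 REST inference request carrying a BYTES/string tensor.
payload = {
    "inputs": [
        {"name": "input-0", "shape": [2], "datatype": "BYTES",
         "data": ["hello", "world"]}
    ]
}
resp = requests.post("http://localhost:8008/v2/models/my-model/infer",
                     json=payload)
print(resp.json())
```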


## Other Changes
