[Sync] kserve/modelmesh-serving-v0.11.0-rc0 to main branch #107

Merged: 19 commits into opendatahub-io:main on Jun 9, 2023


@Jooho Jooho commented Jun 6, 2023

Motivation

This is a release sync. Detailed information is here:

Test

go get github.com/onsi/ginkgo/v2
go install github.com/onsi/ginkgo/v2/ginkgo
export PATH=${GOBIN}:${PATH}

TAG=fast MM_USER=Jooho BRANCH=20230605_sync_main REPO_URI=remote CONTROLLERNAMESPACE=opendatahub NAMESPACE=modelmesh-serving FORCE=true NAMESPACESCOPEMODE=true make deploy-mm-for-odh deploy-fvt-for-odh repeat-fvt

Result

....
Ran 5 of 5 Specs in 183.129 seconds
SUCCESS! -- 5 Passed | 0 Failed | 0 Pending | 0 Skipped

Ginkgo ran 1 suite in 3m5.144013328s
Test Suite Passed
Passed fvt/hpa. Move on the next test
[SUCCESS] FVT Test Passed!

PR checklist

Checklist items below are applicable for development targeted to both fast and stable branches/tags

  • Unit tests pass locally
  • FVT tests pass locally
  • If the PR adds a new container image or updates the tag of an existing image (not built within cpaas), is the corresponding change made in live-builder and cpaas-midstream to add/update the image tag in the operator CSV? Link the PRs if applicable

Checklist items below are applicable for development targeted to both fast and stable branches/tags

  • Tested modelmesh serving deployment with odh-manifests and ran odh-manifests-e2e tests locally

ddelange and others added 15 commits May 5, 2023 11:11
Multi-platform build for `modelmesh-controller` and `modelmesh-controller-develop`;
enable minikube deployments on Mac M1 laptops with ARM chip (opendatahub-io#162, opendatahub-io#231)

- Remove `amd64` arch-specific `nodeAffinity` for modelmesh controller (opendatahub-io#162, opendatahub-io#231)
- Build for platforms `linux/amd64`, `linux/arm64`, `linux/ppc64le`, `linux/s390x`
- Use `ubi8/go-toolset:1.18` for developer image instead of `ubi8/ubi-minimal:8.7`
- Build on `pull_request` as well as on `push` to catch build breaks during PR review
- Push developer images to DockerHub on PR merge, use local registry for PR builds
- Update deprecated `checkout` action
- Fix infinite docker inside docker error when running `make run fmt` inside dev container
- Update `build_devimage.sh` script to only pull developer image if not already present
- Update `develop.sh` script to use the `.develop_image_name`

Resolves opendatahub-io#162
Resolves opendatahub-io#231

---------

Signed-off-by: ddelange <14880945+ddelange@users.noreply.github.com>
Signed-off-by: Christian Kadner <ckadner@us.ibm.com>

Co-authored-by: Christian Kadner <ckadner@us.ibm.com>
Signed-off-by: Christian Kadner <ckadner@us.ibm.com>
Remove the `--enable-self-signed-ca` option from the quickstart guide
and webhook information. The quickstart instructions explicitly install the
latest release `v0.10` which does not include the recently added
`--enable-self-signed-ca`. Following the document as-is results in an
`Unknown option` error.

Resolves kserve#370

Signed-off-by: Rafael Vasquez <raf.vasquez@ibm.com>
Fix "no space left on device" error for FVTs on GitHub action runners
by deleting libraries that are not needed for dotnet, Android, PowerShell,
Swift which increases available disk space by 25 GB plus an additional
reclaimed space of 2.6 GB after pruning docker images.

Resolves kserve#367

Signed-off-by: Christian Kadner <ckadner@us.ibm.com>
Add `arm64` to list of supported architectures for Node affinity
to enable Quickstart deployments on MacBooks with M1 chips.
Currently only `amd64` and `arm64` are supported by the
ModelMesh runtime adapter.

Signed-off-by: Christian Kadner <ckadner@us.ibm.com>
Update TorchServe image version in the default serving runtime from 0.6.0 to 0.7.1

Closes kserve#360

Signed-off-by: Rafael Vasquez <raf.vasquez@ibm.com>
Update MLServer image for the default serving runtime from 0.5.2 to 1.3.2

Depends on kserve/modelmesh-runtime-adapter#45

Closes kserve#357

Signed-off-by: Rafael Vasquez <raf.vasquez@ibm.com>
#### Motivation
Port a previously built Grafana dashboard that displays various Prometheus metrics for ModelMesh, and extend it to include metrics at the deployment/runtime view.

#### Modifications
- Added `config/dashboard/` directory to host grafana dashboard JSON
- Added `servicemonitor.yaml` to `config/prometheus/`
- Added and modified `ModelMeshMetricsDashboard.json` to work out-of-the-box 
  - Added queries for deployment views to filter by serving runtime
  - Created sections for global-view and deployment-view visualizations
  - Created drop-down variables for filtering between one or many pre-existing runtimes with option to filter by custom serving runtime
- Updated `monitoring.md` doc to include guide for setting up Prometheus/Grafana for monitoring; mainly using instructions from [modelmesh-performance](https://github.com/kserve/modelmesh-performance/tree/main/docs/monitoring)

#### Result
- A functional, ready-to-use dashboard JSON
- General documentation outlining how to set up monitoring and use the JSON
- Closes kserve#335 

Signed-off-by: Rafael Vasquez <raf.vasquez@ibm.com>
Update TritonServer image version from 21.06.1 to 23.04

Closes kserve#358

Signed-off-by: Rafael Vasquez <raf.vasquez@ibm.com>
Update KServe and dependencies in preparation for v0.11.0 release:
- Update go.mod
- Update CRDs under config/crd/bases
- Remove outdated module replacements
- Update Dockerfile.develop:
  - Use Go 1.19
  - Fix broken controller-gen install
- Add safe.directory work-around for the git 'dubious ownership'
  error during the GHA `lint` workflow
- Temporarily disable lint deprecation check "SA1019"
- Update Copyright header formatting in *.go sources to not be
  mistaken as package documentation
- Update mock client GET function signature in grpc_resolver_test.go

Signed-off-by: Rafael Vasquez <raf.vasquez@ibm.com>
Signed-off-by: Christian Kadner <ckadner@us.ibm.com>
Co-authored-by: Christian Kadner <ckadner@us.ibm.com>
#### Motivation

When using the `storageURI` form of an HTTP (non-Azure-blob) address to download a model, the `modelPath` needs to be a non-empty string. Before this change, the `storageURI: http://models.r.us/path/to/my-model.json` form would be equivalent to the following `storage` spec:
```
storage:
  type: http
  path: ''    # this being empty is problematic for later processing
  parameters:
    url: http://models.r.us/path/to/my-model.json      
```

The http storage type is currently the only way to have a valid storage configuration with an empty `path` (mainly because it has a "url" parameter that could include the full path). That said, I'm not sure if we should make a `path` required for the HTTP storage type. In particular, if the `url` is just `http://models.r.us/`, there is no path portion.

Related: kserve/modelmesh-runtime-adapter#41 (comment)

#### Modifications

Set `modelPath` to the URL's path, and strip the path from the `url` parameter so that it holds only the scheme and host.

#### Result

With these changes, the `storageURI` example above changes to have a `path` field:

```
storage:
  type: http
  path: path/to/my-model.json
  parameters:
    url: http://models.r.us/
```
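
The split described above can be sketched roughly as follows. This is only an illustration under assumptions, not the controller's actual code: `splitHTTPStorageURI` is a hypothetical helper name, and the choice to drop any query string or fragment is an assumption of the sketch. It uses only Go's standard `net/url` package.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// splitHTTPStorageURI is a hypothetical helper (not the real controller
// function). It splits an HTTP(S) storage URI into the value for the `url`
// parameter (scheme and host only) and the value for the `path` field.
// Assumption: query strings and fragments are dropped in this sketch.
func splitHTTPStorageURI(storageURI string) (string, string, error) {
	u, err := url.Parse(storageURI)
	if err != nil {
		return "", "", fmt.Errorf("parsing storageURI: %w", err)
	}
	if u.Scheme != "http" && u.Scheme != "https" {
		return "", "", fmt.Errorf("unsupported scheme %q", u.Scheme)
	}
	// The model path is the URL path without its leading slash.
	modelPath := strings.TrimPrefix(u.Path, "/")
	// Rebuild the URL with only scheme, host, and a root path.
	base := url.URL{Scheme: u.Scheme, Host: u.Host, Path: "/"}
	return base.String(), modelPath, nil
}
```

Calling `splitHTTPStorageURI("http://models.r.us/path/to/my-model.json")` yields `http://models.r.us/` and `path/to/my-model.json`, matching the spec above. For a URI such as `http://models.r.us/` the returned path is still empty, so the open question about requiring a `path` for the HTTP storage type remains for the caller to decide.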

Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>
Signed-off-by: Christian Kadner <ckadner@us.ibm.com>

Co-authored-by: Christian Kadner <ckadner@us.ibm.com>
Signed-off-by: Rafael Vasquez <raf.vasquez@ibm.com>
Signed-off-by: jooho <jlee@redhat.com>
Signed-off-by: jooho <jlee@redhat.com>
Signed-off-by: jooho <jlee@redhat.com>
@openshift-ci openshift-ci bot requested review from anishasthana and heyselbi June 6, 2023 18:32
@openshift-ci openshift-ci bot added the approved label Jun 6, 2023
Signed-off-by: jooho <jlee@redhat.com>
@Jooho Jooho force-pushed the 20230605_sync_main branch 5 times, most recently from fe5db8e to a16bbe3 Compare June 7, 2023 00:43
&& Update runtime/fvt manifests

Signed-off-by: jooho <jlee@redhat.com>
@Jooho Jooho force-pushed the 20230605_sync_main branch from a16bbe3 to 72402f2 Compare June 7, 2023 01:08

Jooho commented Jun 8, 2023

/retest

Signed-off-by: jooho <jlee@redhat.com>
Signed-off-by: Jooho Lee <jlee@redhat.com>

openshift-ci bot commented Jun 8, 2023

@Jooho: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/fvt | 97c2046 | link | true | `/test fvt` |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.


openshift-ci bot commented Jun 9, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: danielezonca, Jooho

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@danielezonca

/lgtm

@openshift-merge-robot openshift-merge-robot merged commit 8a2d905 into opendatahub-io:main Jun 9, 2023
7 participants