Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: disables features not needed for Istio deployments #172

Merged

Conversation

bartoszmajsak
Copy link

@bartoszmajsak bartoszmajsak commented Sep 11, 2023

Description

Brings back #96 (see description of it for more details).

Additional changes brought from (now closed) #184

  • changed the overlays set-up for kf-notebook-controller here to have an overlays/standalone-service-mesh used solely for e2e testing for notebook controllers, and use overlays/service-mesh for when deploying through ODH operator.
  • in addition we removed the overlays/service-mesh for the odh-notebook-controller as we will no longer need to patch the image once this is merged.

Fixes #168

How Has This Been Tested?

export K8S_NAMESPACE=opendatahub
kubectl create ns $K8S_NAMESPACE
make deploy-service-mesh
make deploy-with-mesh -e IMG=quay.io/maistra-dev/odh-notebook-controller -e TAG=v1.7-service-mesh
make e2e-test-service-mesh
make undeploy-with-mesh

Merge criteria:

  • The commits are squashed in a cohesive manner and have meaningful messages.
  • Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious).
  • The developer has manually tested the changes and verified that the changes work

@bartoszmajsak
Copy link
Author

@VaishnaviHire @harshad16 I would very much appreciate a review and merge so I can test Service Mesh changes against the latest versions of the controllers.

@bartoszmajsak
Copy link
Author

bartoszmajsak commented Sep 28, 2023

The kustomize overlay provided in this PR is used for e2e tests only. We are working on porting our changes done for odh-manifests repo for operator v2 work and will provide them as a separate PR. See comment below #172 (comment)

@harshad16
Copy link
Member

Working on reviewing this PR.

@bartoszmajsak
Copy link
Author

bartoszmajsak commented Oct 5, 2023

The kustomize overlay provided in this PR is used for e2e tests only. We are working on porting our changes done for odh-manifests repo for operator v2 work and will provide them as a separate PR.

@harshad16 to simplify the review/merge @cam-garrison brought #184 changes here. All ready for review. I also updated PR description with some extra details.

Copy link
Member

@harshad16 harshad16 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies for the delay in review.
While testing the deployment:
getting the mutate-webook error

make e2e-test-service-mesh -e K8S_NAMESPACE=opendatahub
go test ./e2e/ -run ^TestE2ENotebookController -v \
	--nb-namespace=opendatahub \
	--deploymentMode=service-mesh \
	"--skip-deletion=false" 
=== RUN   TestE2ENotebookController
=== RUN   TestE2ENotebookController/validate_controllers
I1017 01:57:15.802869   75749 request.go:601] Waited for 1.047326253s due to client-side throttling, not priority and fairness, request: GET:https://api.hnalla.dev.datahub.redhat.com:6443/apis/datasciencecluster.opendatahub.io/v1?timeout=32s
=== RUN   TestE2ENotebookController/validate_controllers/Validate_Kubeflow_notebook_controller
=== RUN   TestE2ENotebookController/validate_controllers/Validate_ODH_notebook_controller
=== RUN   TestE2ENotebookController/create
I1017 01:57:41.221572   75749 request.go:601] Waited for 1.046904354s due to client-side throttling, not priority and fairness, request: GET:https://api.hnalla.dev.datahub.redhat.com:6443/apis/extensions.istio.io/v1alpha1?timeout=32s
=== RUN   TestE2ENotebookController/create/thoth-minimal-service-mesh-notebook
=== RUN   TestE2ENotebookController/create/thoth-minimal-service-mesh-notebook/Creation_of_Notebook_instance
2023/10/17 01:58:04 Error creating Notebook resource thoth-minimal-service-mesh-notebook: Internal error occurred: failed calling webhook "notebooks.opendatahub.io": failed to call webhook: Post "https://odh-notebook-controller-webhook-service.opendatahub.svc:443/mutate-notebook-v1?timeout=10s": context deadline exceeded, trying again
2023/10/17 01:58:24 Error creating Notebook resource thoth-minimal-service-mesh-notebook: Internal error occurred: failed calling webhook "notebooks.opendatahub.io": failed to call webhook: Post "https://odh-notebook-controller-webhook-service.opendatahub.svc:443/mutate-notebook-v1?timeout=10s": context deadline exceeded, trying again

@harshad16
Copy link
Member

Seems like something similar was addressed here: #96 (comment)

@cam-garrison
Copy link

Apologies for the delay in review. While testing the deployment: getting the mutate-webook error

make e2e-test-service-mesh -e K8S_NAMESPACE=opendatahub
go test ./e2e/ -run ^TestE2ENotebookController -v \
	--nb-namespace=opendatahub \
	--deploymentMode=service-mesh \
	"--skip-deletion=false" 
=== RUN   TestE2ENotebookController
=== RUN   TestE2ENotebookController/validate_controllers
I1017 01:57:15.802869   75749 request.go:601] Waited for 1.047326253s due to client-side throttling, not priority and fairness, request: GET:https://api.hnalla.dev.datahub.redhat.com:6443/apis/datasciencecluster.opendatahub.io/v1?timeout=32s
=== RUN   TestE2ENotebookController/validate_controllers/Validate_Kubeflow_notebook_controller
=== RUN   TestE2ENotebookController/validate_controllers/Validate_ODH_notebook_controller
=== RUN   TestE2ENotebookController/create
I1017 01:57:41.221572   75749 request.go:601] Waited for 1.046904354s due to client-side throttling, not priority and fairness, request: GET:https://api.hnalla.dev.datahub.redhat.com:6443/apis/extensions.istio.io/v1alpha1?timeout=32s
=== RUN   TestE2ENotebookController/create/thoth-minimal-service-mesh-notebook
=== RUN   TestE2ENotebookController/create/thoth-minimal-service-mesh-notebook/Creation_of_Notebook_instance
2023/10/17 01:58:04 Error creating Notebook resource thoth-minimal-service-mesh-notebook: Internal error occurred: failed calling webhook "notebooks.opendatahub.io": failed to call webhook: Post "https://odh-notebook-controller-webhook-service.opendatahub.svc:443/mutate-notebook-v1?timeout=10s": context deadline exceeded, trying again
2023/10/17 01:58:24 Error creating Notebook resource thoth-minimal-service-mesh-notebook: Internal error occurred: failed calling webhook "notebooks.opendatahub.io": failed to call webhook: Post "https://odh-notebook-controller-webhook-service.opendatahub.svc:443/mutate-notebook-v1?timeout=10s": context deadline exceeded, trying again

The latest Service Mesh SMCP v2.3 (2.3.7) release has failures as it was released with an outdated image. This is likely the root of the issues you're facing since the install script installs v2.3 smcp. The fix for v2.3 SMCPs is coming ASAP, but as of now, newly created or updated SMCPs of this version are buggy.

I ran the tests again after changing the SMCP to be v2.4 and the tests succeeded. Please give that a try when you can.

@bartoszmajsak
Copy link
Author

@cam-garrison can we adjust the script to use 2.4 instead?

@bartoszmajsak
Copy link
Author

bartoszmajsak commented Oct 19, 2023

@harshad16 besides test/infra failures - is the code looking good for you?

@harshad16
Copy link
Member

yes, it does look good to me 💯

Copy link
Member

@harshad16 harshad16 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/approve

We just need to fix servicemesh issue with the webhook
which we would address slowly.
Thanks for the patience

@openshift-ci
Copy link

openshift-ci bot commented Oct 23, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: harshad16

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@shalberd
Copy link

/retest

@shalberd
Copy link

tests succeeded

@openshift-ci openshift-ci bot merged commit f0dbf5a into opendatahub-io:v1.7-branch Oct 24, 2023
@bartoszmajsak bartoszmajsak deleted the handle_mesh_for_nb branch October 24, 2023 12:11
@bartoszmajsak
Copy link
Author

Thanks for taking care of it @shalberd!

@shalberd
Copy link

shalberd commented Oct 24, 2023

Thanks goes to @harshad16 he did the heavy lifting. I just initiated retesting :-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

v1.7 branch is missing Service Mesh logic for handling notebooks
4 participants