-
Notifications
You must be signed in to change notification settings - Fork 37
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add logic on nbc-webhook to pickup internal or external registry #329
Add logic on nbc-webhook to pickup internal or external registry #329
Conversation
components/odh-notebook-controller/controllers/notebook_webhook.go
Outdated
Show resolved
Hide resolved
components/odh-notebook-controller/controllers/notebook_webhook.go
Outdated
Show resolved
Hide resolved
components/odh-notebook-controller/controllers/notebook_webhook.go
Outdated
Show resolved
Hide resolved
components/odh-notebook-controller/controllers/notebook_webhook.go
Outdated
Show resolved
Hide resolved
components/odh-notebook-controller/controllers/notebook_webhook.go
Outdated
Show resolved
Hide resolved
|
||
annotations := notebook.GetAnnotations() | ||
if annotations != nil { | ||
if imageSelection, exists := annotations["notebooks.opendatahub.io/last-image-selection"]; exists { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How do you check this? I can not see this on vs code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code coverage should be part of the go vscode plugin, https://github.com/golang/vscode-go/blob/master/docs%2Ffeatures.md
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rsc in his talk https://research.swtch.com/testing uses
go tool cover -html
that visualizes the coverage data as HTML
components/odh-notebook-controller/controllers/notebook_webhook.go
Outdated
Show resolved
Hide resolved
components/odh-notebook-controller/controllers/notebook_webhook.go
Outdated
Show resolved
Hide resolved
… that include the imagestream
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
Great approach , just one question:
In disconnected cluster, can there be a case
where there is a existing notebook in the cluster ?
Thank you, Harshad, for your review. 🙂 This approach doesn't affect already created notebooks; it only applies during their creation. In the disconnected cluster I used for testing, I didn't notice anything unusual with existing notebooks. However, since the cluster was shared, I couldn't perform extensive testing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tested in cluster, works for a new workbenches.
/lgtm
/approve
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: harshad16 The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
660c3ac
into
opendatahub-io:v1.7-branch
/cherrypick stable |
@harshad16: new pull request created: #332 In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
@harshad16 @atheo89 ah, I see, you are not using the image change trigger annotation here in odh notebook controller. I will have a look at it more today and tomorrow.
imagestream tag spec:
|
general thought: I think you are really replicating the task of the image content change annotation image.openshift.io/triggers with paused: true here. I get the reasoning, in that your sprint task is to just "make it work without internal openshift docker registry". What I am missing is a kind of architectural discussion, though your work in the PR is impressive. |
@atheo89 regarding working with @lucferbux on "When the internal registry is down, the dashboard displays a red !Deleted label (and no package info is available)": Also, I'd suggest not updating the JUPYTER_IMAGE env var to i.e. myregistry.com/bla/image@sha256:3d92929292 kind of digest notation. It was myregistryinternalopenshift:5000/imagestreamname:imagestreamtag before, I know. Instead, just put imagestreamname:imagestreamtag in there, it is fine, checked with upstream. |
I think it'd help if we were to arrange for a 30-45 minute Zoom call some time in the morning US time, in the afternoon or evening European time. |
}) | ||
imageHash := items[0].(map[string]interface{})["dockerImageReference"].(string) | ||
notebook.Spec.Template.Spec.Containers[0].Image = imageHash | ||
// Update the JUPYTER_IMAGE environment variable |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi all, definitely, putting an sha256 style name in for env var JUPYTER_IMAGE, especially for an image (docker.from.name origin), not an imagestream, is not ok.
What I can say for sure is, that, during my work on PR-800, I found out that JUPYTER_IMAGE env var is used during commercial RHOIA / upstream CI testing
https://github.com/red-hat-data-services/ods-ci/blob/master/ods_ci/tests/Resources/Page/ODH/JupyterHub/JupyterHubSpawner.robot#L374
to determine imagestream name and tag.
Meaning I checked with someone back then and also with @VaishnaviHire that it is ok to newly put in imagestreamname:imagestreamtag into that env var, as I did in my changes to dashboard PR-800.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, as I mentioned before, ODH notebook controller is the wrong place to handle this sort of stuff.
Much better in my view to start declaratively with odh dashboard and the image change trigger annotation.
There are ways to handle long-running notebooks, avoid container restarts, even with that, but we have to put in a slider or checkmark in dashboard GUI to make notebook long-running, set annotation field to paused: true after initial startup.
@lucferbux @harshad16 about long-running notebooks and avoiding changes to the image-field when the underlying digest of an imagestream tag / docker.image.from change: The way to do this is: setting paused: true right away does not work, understandably, the image-field of the container is not updated.
first, set paused: false
image field is resolved, we're letting image change admission plugin resolve the name of the container image
now, as mentioned, if we want to avoid any pod restarts to to image digest changes behind a floating tag, for long-running-notebooks, we need to enable in odh dashboard a GUI slider that sets |
log.Info("Imagestream not found in any of the specified namespaces", "imagestreamName", imagestreamName, "tag", tag) | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be nice from us to provide some log/warning for the else branches in these 2 cases (ifs on lines 473 and 474).
if annotations != nil { | ||
if imageSelection, exists := annotations["notebooks.opendatahub.io/last-image-selection"]; exists { | ||
// Check if the image selection has an internal registry, if so will pickup this. This value constructed on the initialization of the Notebook CR. | ||
if strings.Contains(notebook.Spec.Template.Spec.Containers[0].Image, "image-registry.openshift-image-registry.svc:5000") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we sure the container we're interested in is first always in all cases? Wouldn't it be better to check based on the container name which should match name of the Notebook? The other container - oauth-proxy
- contains direct quay link from the beginning.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is that way the kind: Notebooks with their podspec are assembled by odh dashboard, i.e. that the notebook container is always the first one, index 0. But it is not guaranteed.
Here, also, as you can see in my comments elsewhere, the admission plugin works with fieldPath-based lookups of the container by container name, which is a good idea.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"The other container - oauth-proxy - contains direct quay link from the beginning."
Correct, that is done by odh notebook controller. It is part of an oauth-argument in the odh-notebook-controller-manager pod container and evaluated by odh notebook controller.
if annotations != nil { | ||
if imageSelection, exists := annotations["notebooks.opendatahub.io/last-image-selection"]; exists { | ||
// Check if the image selection has an internal registry, if so will pickup this. This value constructed on the initialization of the Notebook CR. | ||
if strings.Contains(notebook.Spec.Template.Spec.Containers[0].Image, "image-registry.openshift-image-registry.svc:5000") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another question - are we sure that the value for the internal registry will always be this way? Is there a possibility that there could be some custom value?
Why don't we want to check presence of internal registry via ImageStream status.dockerImageRepository
value as empty (no internal registry enabled) vs non-empty (internal registry present)? This seem to me as a more proper approach than to check with what the Notebook CR was created 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agreed, that is what the image content change trigger admission plugin, which we are not using up to now (part of PR 800) is doing on the openshift side as well, openshift-internally via golang.
Related to: https://issues.redhat.com/browse/RHOAIENG-6617
Description
This PR implements a logic on the odh-notebook-webhook controller, that checks if there is an internal registry and takes the corresponding actions to set the container.image value. If an internal registry is detected, it uses the default values specified in the Notebook Custom Resource (CR). Otherwise, it checks the last-image-selection annotation to find the image stream and fetches the image from status.dockerImageReference, assigning it to the container.image value.
How Has This Been Tested?
It tested in a cluster with & without internal registry.
Replace the deployment image of the odh-notebook-controller with
quay.io/opendatahub/odh-notebook-controller:pr-329
Spin up a new notebook should be started as usual.
Internal registry found. Will pickup the default value from image field.
To test this without internal registry, you need to update the Registry CR spec with
spec.managementState
fromManaged
toRemoved
(For more look here)No Internal registry found, pick up imageHash from status.tag.dockerImageReference
Important Note: When the internal registry is down, the dashboard displays a red
!Deleted
label. For this we are working with @lucferbux to identify and fix it.Merge criteria: