upgrade: cleanup CreateWithRetry usage #1145
Skipping CI for Draft Pull Request.
36e1e4f to 4de8683
/onhold
pkg/upgrade/upgrade.go (Outdated)
		return err
	}

	switch {
	case len(instances.Items) > 1:
But by removing this part, when there is no webhook, we are actually able to create a 2nd instance, aren't we?
I understand you moved the "one already exists" and "none exists yet" logic into CreateWithRetry(),
but that does not really cover the case "one already exists, but now I want to create another one with a different name" if the webhook is not in place (say, downstream). So this code won't be able to sync to downstream.
apart from this ^ , i am fine with the change in CreateWithRetry()
Yes, you are totally right, I mentioned it in the commit message BTW.
Restored the already-exists logic for DSCI. Did not introduce it for DSC. Removed the message from CreateWithRetry().
		return fmt.Errorf("failed to create DataScienceCluster custom resource: %w", err)
	}
On line 122: "If there exists an instance already, it patches the DSCISpec with default values".
Can we first clarify what we want: if one exists, patch it back to the defaults or leave it as-is?
From the code (even before this PR) it looks like it is not going to patch to the defaults, because if one DSCI CR exists, it does nothing, regardless of the name or whether the spec is different.
Good eye! I missed the comment. The original patch 1ccbe05 ("Unset Tech Preview components by default (#708)") patched with the default values. But then f756e40 ("Update incubation with downstream changes (#783)") replaced that with the message I borrowed, without changing the comment.
(Sorry, but due to squashing it's not obvious where the change came from. I could trace it to a4788f3 ("fix(mm-monitoring): revert the code logic but set to disable as delete (#153)"), but that was squashed again and has only the line "Retain existing DSCI values", with no description or commit reference around.)
interesting. i will need to have a follow-up on this. probably talk with @VaishnaviHire when she is back.
4de8683 to c0540b9
/lgtm
/retest-required
The intention of the function was to ensure that the object is created, and to retry in case the webhook is temporarily not available. The original code lacks handling of an already existing object.

Handling configurations with and without the webhook makes things more complicated. Let's check the Create() errors for the different cases:

If webhook enabled:
- no error (err == nil)
- 500 InternalError likely if webhook is not available (yet)
- 403 Forbidden if webhook blocks creation (check of existence)
- some problem (real error)
else, if webhook disabled:
- no error (err == nil)
- 409 AlreadyExists if object exists
- some problem (real error)

Check for an already existing object after Create() with a call to Get(). It covers configurations both with and without the webhook.

Reuse the same object for Get() to avoid fetching Gvk for it. It does no harm, since the object structure is not used after creation.

500 InternalError is not always a webhook problem, but consider it a reason to retry.

Fixes: e26100e ("upgrade: retry if default DSCI creation fails (opendatahub-io#1008)")
Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>
CreateWithRetry() now checks the AlreadyExists condition inside, so skip its handling.

sleep() is not needed for DSCI with a properly working CreateWithRetry, since that is the very reason the function exists.

Keep checking the number of DSCI instances for the non-webhook configuration.

Basically, using CreateWithRetry for DSC is redundant from the webhook unavailability point of view, since DSC is created after DSCI, which guarantees a working webhook. But it handles an already existing object.

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>
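The reason for keeping the instance count check can be illustrated with a short Go sketch. It is only an illustration under assumptions: ensureDefault, its parameters, and the returned messages are hypothetical, and the real caller in pkg/upgrade/upgrade.go lists DSCI objects through the Kubernetes client instead of taking a slice of names.

```go
package main

import "fmt"

// ensureDefault is a hypothetical helper. It decides what to do based on how
// many instances already exist, before any Create() call is made. Without a
// webhook nothing else stops a second instance with a different name from
// being created, so the count check has to live in the caller.
func ensureDefault(existing []string, defaultName string, create func(name string) error) (string, error) {
	switch {
	case len(existing) > 1:
		return "only one instance is allowed, please delete the others", nil
	case len(existing) == 1:
		// Leave the existing instance as-is, whatever its name or spec.
		return fmt.Sprintf("%s already exists, it will not be updated with default values", existing[0]), nil
	default:
		if err := create(defaultName); err != nil {
			return "", fmt.Errorf("failed to create %s: %w", defaultName, err)
		}
		return fmt.Sprintf("created %s", defaultName), nil
	}
}

func main() {
	created := []string{}
	create := func(name string) error {
		created = append(created, name)
		return nil
	}

	// An instance with a different name exists: nothing new is created.
	msg, err := ensureDefault([]string{"custom-dsci"}, "default-dsci", create)
	fmt.Println(msg, err, len(created))

	// Nothing exists yet: the default instance is created.
	msg, err = ensureDefault(nil, "default-dsci", create)
	fmt.Println(msg, err, len(created))
}
```

In the first call the create callback is never invoked, which is the behaviour the review asked to preserve for configurations without a webhook.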
c0540b9 to 22297df
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: zdtsw
5c755e6 into opendatahub-io:incubation
* cluster: CreateWithRetry: make work for already existing object

The intention of the function was to ensure that the object is created, and to retry in case the webhook is temporarily not available. The original code lacks handling of an already existing object.

Handling configurations with and without the webhook makes things more complicated. Let's check the Create() errors for the different cases:

If webhook enabled:
- no error (err == nil)
- 500 InternalError likely if webhook is not available (yet)
- 403 Forbidden if webhook blocks creation (check of existence)
- some problem (real error)
else, if webhook disabled:
- no error (err == nil)
- 409 AlreadyExists if object exists
- some problem (real error)

Check for an already existing object after Create() with a call to Get(). It covers configurations both with and without the webhook.

Reuse the same object for Get() to avoid fetching Gvk for it. It does no harm, since the object structure is not used after creation.

500 InternalError is not always a webhook problem, but consider it a reason to retry.

Fixes: e26100e ("upgrade: retry if default DSCI creation fails (red-hat-data-services#1008)")
Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>

* upgrade: cleanup CreateWithRetry usage

CreateWithRetry() now checks the AlreadyExists condition inside, so skip its handling.

sleep() is not needed for DSCI with a properly working CreateWithRetry, since that is the very reason the function exists.

Keep checking the number of DSCI instances for the non-webhook configuration.

Basically, using CreateWithRetry for DSC is redundant from the webhook unavailability point of view, since DSC is created after DSCI, which guarantees a working webhook. But it handles an already existing object.

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>

---------

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>
(cherry picked from commit 5c755e6)
Description
Make CreateWithRetry work for an already existing object and remove the extra handling of AlreadyExists in the caller code, which does not work in the webhook case anyway.
How Has This Been Tested?
Tested webhook configuration:
Error creating object: Internal error occurred: failed calling webhook "operator.opendatahub.io": failed to call webhook: Post "https://opendatahub-operator-webhook-service.opendatahub-operator-system.svc:443/validate-opendatahub-io-v1?timeout=10s": no endpoints available for service "opendatahub-operator-webhook-service". Retrying...
oc get dsci shows default-dsci Ready
Resource default-dsci already exists. It will not be updated with default values
Screenshot or short clip
Merge criteria