feat: have pre-req retry upon check fail #574

Merged
HumairAK merged 1 commit into opendatahub-io:v1.6.x on Feb 21, 2024

Conversation

Contributor

@HumairAK HumairAK commented Feb 20, 2024

The issue resolved by this Pull Request:

Resolves https://issues.redhat.com/browse/RHOAIENG-2099

Same change as #571, for v1.6.

Signed-off-by: Humair Khan <HumairAK@users.noreply.github.com>
@dsp-developers
Contributor

A new image has been built to help with testing out this PR: quay.io/opendatahub/data-science-pipelines-operator:pr-574
An OCP cluster where you are logged in as cluster admin is required.

To use this image run the following:

cd $(mktemp -d)
git clone git@github.com:opendatahub-io/data-science-pipelines-operator.git
cd data-science-pipelines-operator/
git fetch origin pull/574/head
git checkout -b pullrequest fd3aae9110a4857e71e81f8ab9774e13186d389f
oc new-project opendatahub
make deploy IMG="quay.io/opendatahub/data-science-pipelines-operator:pr-574"

More instructions here on how to deploy and test a Data Science Pipelines Application.

Contributor

@amadhusu amadhusu left a comment

I didn't hit the issue described in the Jira ticket: the database became available within 1 second of the Database Health Check failing, as you can see in the screenshot.

Screenshot from 2024-02-21 16-45-27

My only question is about the aggressive 'RequeueAfter' duration of 20 seconds. Is the intent mainly end-user experience, so that the rest of the DSPA pods get provisioned soon after the Database Health Check passes? And could reconciling at such a short interval cause any performance hiccups? This can be taken offline; the code looks perfect and works fine in my sanity checks.

@gregsheremeta
Contributor

/lgtm

It's a little more common to do exponential backoff in controllers. That could be a future improvement (or not).

@gregsheremeta
Contributor

/approve

Contributor

openshift-ci bot commented Feb 21, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: amadhusu, gregsheremeta

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@HumairAK HumairAK merged commit ef6fd1e into opendatahub-io:v1.6.x Feb 21, 2024
5 of 7 checks passed