Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Ability to configure resources in Kserve #965

Conversation

VaishnaviHire
Copy link
Member

@VaishnaviHire VaishnaviHire commented Apr 11, 2024

Description

This is a very targeted change for Kserve component. The PR adds ability to configure resources field in Deployments objects of the component.

Eventually we will move to a generalize solution to whitelist fields for all components. See [wip]ADR

Jira Issue: https://issues.redhat.com/browse/RHOAIENG-280

How Has This Been Tested?

  1. Deploy custom operator image
  2. Set Kserve component to Managed in DSC
  3. Update resources field in kserve-controller-manager Deployment
  4. Verify values are not overwritten
  5. Update resources field in Deployment of any other component
  6. Verify values are overwritten

Merge criteria:

  • The commits are squashed in a cohesive manner and have meaningful messages.
  • Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious).
  • The developer has manually tested the changes and verified that the changes work

return nil
}
// do not patch resources field in Kserve deployment i.e allows users to update resources field
if found.GetKind() == "Deployment" {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Redundant check, it's done in the function

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1
can remove the in-function check if we know it is deployment already from the caller

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would rather leave it in the func and remove it from here :) the func says "deployment" in the name, so it needs to make sure whatever is passed is a deployment.

@ykaliuta
Copy link
Contributor

I really hope it is temporary :)

@ykaliuta
Copy link
Contributor

@bartoszmajsak @LaVLaS so, what's you opinion, is it "plugin" or "patching"? ;)

@zdtsw
Copy link
Member

zdtsw commented Apr 11, 2024

I have tested the build, i think the supported case for kserve will be:
user can remove the entire resource block from Kserve deployment, or set to proper value in resource block as they wish.
Operator will not reset it back to original value set in the manifests, As long as:

  • kserve deployment not get deleted when kserve component is still Managed
  • kserve component not flip to Remove then back to Managed again

@VaishnaviHire
Copy link
Member Author

I really hope it is temporary :)

It is. The reason for this being temporary fix is we do not yet have a requirement or list of accepted whitelisted fields for other components.

The long term solution I see is to introduce a custom plugin that takes up list of whitelisted fields and applies them to all the resources,

@VaishnaviHire VaishnaviHire force-pushed the update_resources_kserve branch from bb142a0 to cc54b6a Compare April 11, 2024 14:04
@VaishnaviHire
Copy link
Member Author

I have tested the build, i think the supported case for kserve will be: user can remove the entire resource block from Kserve deployment, or set to proper value in resource block as they wish. Operator will not reset it back to original value set in the manifests, As long as:

  • kserve deployment not get deleted when kserve component is still Managed
  • kserve component not flip to Remove then back to Managed again

@israel-hdez Can you confirm if this behavior for deployment updates look good to you?

@bartoszmajsak
Copy link
Contributor

I really hope it is temporary :)

There's nothing as permanent as a temporary solution ;) We have to be vigilant and pay this back sooner than later.

Copy link
Contributor

@bartoszmajsak bartoszmajsak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I approve this based on the assumption it's urgently needed ;)

Can we link here an issue which captures the desired solution?

return nil
}
// do not patch resources field in Kserve deployment i.e allows users to update resources field
if found.GetKind() == "Deployment" {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would rather leave it in the func and remove it from here :) the func says "deployment" in the name, so it needs to make sure whatever is passed is a deployment.

@bartoszmajsak
Copy link
Contributor

@bartoszmajsak @LaVLaS so, what's you opinion, is it "plugin" or "patching"? ;)

These are two different things :)

@VaishnaviHire VaishnaviHire force-pushed the update_resources_kserve branch from cc54b6a to 6e9ac29 Compare April 11, 2024 14:57
@openshift-ci openshift-ci bot removed the lgtm label Apr 11, 2024
@VaishnaviHire VaishnaviHire force-pushed the update_resources_kserve branch from 6e9ac29 to 0b7e6d3 Compare April 11, 2024 14:58
Copy link

openshift-ci bot commented Apr 11, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bartoszmajsak, zdtsw

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@VaishnaviHire
Copy link
Member Author

/test opendatahub-operator-e2e

@VaishnaviHire
Copy link
Member Author

@dhirajsb We see e2e failures when deploying Model registry controller image

Failed to pull image "quay.io/opendatahub/model-registry-operator:latest": reading manifest latest in quay.io/opendatahub/model-registry-operator: unknown: Tag latest was deleted or has expired. To pull, revive via time machine

Do we need to update the default branch or the tag needs to be updated??

@israel-hdez
Copy link
Contributor

I have tested the build, i think the supported case for kserve will be: user can remove the entire resource block from Kserve deployment, or set to proper value in resource block as they wish. Operator will not reset it back to original value set in the manifests, As long as:

  • kserve deployment not get deleted when kserve component is still Managed
  • kserve component not flip to Remove then back to Managed again

@israel-hdez Can you confirm if this behavior for deployment updates look good to you?

I just tried, and it is working fine.
Also, the described behavior sounds good to me to properly solve RHOAIENG-280.

@VaishnaviHire
Copy link
Member Author

/test opendatahub-operator-e2e

@openshift-merge-bot openshift-merge-bot bot merged commit 5b6147b into opendatahub-io:incubation Apr 12, 2024
8 checks passed
VaishnaviHire added a commit to VaishnaviHire/opendatahub-operator that referenced this pull request Apr 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

5 participants