chore: tuning resource usage for operator pod #1120
Is this related to https://issues.redhat.com/browse/RHOAIENG-9806, as some preliminary stop-gap?
Wouldn't data from small clusters be more relevant for the actual value we aim for? Do we need, for example, CPU ... And vice versa: do we need to lower the limit at all?
I would take this step by step: first see whether this makes the "large" cluster work, then fine-tune further to find the lower boundary for a "small" cluster.
Keeping a high "limit" (what we have now) would not, I'd say, do much harm, but it impacts k8s node selection. Of course, if we are talking about SNO, I guess there is no need for that consideration: a lower or higher "limit" is the same.
  requests:
-   cpu: 500m
+   cpu: 25m
I think we should only update requests instead of limits.
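As a minimal sketch of what "requests only" could look like, assuming the container's CPU limit stays at the old 500m figure (the thread does not confirm the limit value, and memory values are left out because they are not shown here):

```yaml
# Hypothetical resources block for the operator container.
# Only the 25m CPU request comes from the diff in this thread.
resources:
  requests:
    cpu: 25m    # lowered request, as proposed in the diff
  limits:
    cpu: 500m   # assumption: limit left untouched
```

With this shape, only the request is lowered, while the ceiling the container may burst to stays where it was.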
I am confused by this part. I've left some questions/comments in the Jira and on Slack; it would be good to understand these data first.
I agree with @adelton that we should use data from small clusters to set defaults. The linked Jira issue has data from the PSAP team.
Tbh, when I started this PR, I did not know about this Jira ticket. One thing on my mind after reading your comments:
For the benefit of folks who might not have access to the internal information, it might be useful to get the data from an ODH installation and share it here or in some other public place, so that the reasons for the numerical changes are documented. I would assume the numbers from ODH and downstream don't differ much, so whether we can use and publish the numbers we got for downstream really depends on whether they are considered internal-only or not.
Is this also related to https://issues.redhat.com/browse/RHOAIENG-494?
- reduce cpu and mem usage in requests from profiling data
Signed-off-by: Wen Zhou <wenzhou@redhat.com>
I don't think so; it is more for https://issues.redhat.com/browse/RHOAIENG-9806
And specifically https://issues.redhat.com/browse/RHOAIENG-10889, it seems.
LGTM.
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: adelton
Merged commit d6ae248 into opendatahub-io:incubation
…1120)
* chore: tuning resource usage for operator pod
  - reduce cpu and mem usage in requests from profiling data
  Signed-off-by: Wen Zhou <wenzhou@redhat.com>
* update: data from 2.11 with all default components up
  Signed-off-by: Wen Zhou <wenzhou@redhat.com>
---------
Signed-off-by: Wen Zhou <wenzhou@redhat.com>
(cherry picked from commit d6ae248)
Description
Related to https://issues.redhat.com/browse/RHOAIENG-9806