Update Component rules in Prometheus #1503
Dockerfiles/Dockerfile
Outdated
# Copy monitoring config
COPY config/monitoring/ /opt/manifests/monitoring
# Copy partners config
COPY config/partners/ /opt/manifests/partners
we can skip partners one, with #1504
we can skip partners one, with #1504
Why can't the build simply copy everything?
Is the question why we don't just use COPY config/ /opt/manifests/ in the Dockerfile?
If so, there is more under the config folder that does not need to go into the final image.
60d1f2c to ef677d0
9da2003 to b9d91bd
/test opendatahub-operator-e2e
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1503      +/-   ##
==========================================
- Coverage   20.04%   19.69%   -0.35%
==========================================
  Files         160      161       +1
  Lines       10900    11102     +202
==========================================
+ Hits         2185     2187       +2
- Misses       8483     8683     +200
  Partials      232      232
controllers/services/monitoring/monitoring_controller_actions.go
Outdated
	}),
)
err := cr.ForEach(func(ch cr.ComponentHandler) error {
	enabled := cr.IsManaged(ch, dsc) && dsc.Status.Phase == status.ReadySuffix
I am not sure about this part: "dsc.Status.Phase == status.ReadySuffix".
Do we need a special case for odh-model-controller?
The status belongs to the DSC, so we should not need any additional condition for odh-model-controller.
To get the actual component status, a better way would be:
ci := component.NewCRObject(instance)
// read the component instance to get the actual status
err = cli.Get(ctx, client.ObjectKeyFromObject(ci), ci)
switch {
case k8serr.IsNotFound(err):
enabled = false
case err != nil:
enabled = false
// TODO: error handling
default:
enabled = meta.IsStatusConditionTrue(ci.GetStatus().Conditions, status.ConditionTypeReady)
}
Thinking a little bit more about this comment, we probably want to distinguish between enabled and ready. If I recall the old implementation correctly, we were supposed to install the rules the first time the component becomes ready, and remove the rules if the component gets removed.
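One way to read the enabled/ready distinction is as a small per-component state machine: rules are installed the first time an enabled component reports ready, kept while it stays enabled (even if readiness flaps), and removed once it is no longer enabled. The sketch below assumes that reading; `ruleTracker`, `action` and `reconcile` are hypothetical names, not part of the operator.

```go
package main

import "fmt"

// action describes what to do with a component's Prometheus rules.
type action string

const (
	actionNone    action = "none"
	actionInstall action = "install"
	actionRemove  action = "remove"
)

// ruleTracker remembers which components already have their rules installed.
// Hypothetical helper for illustration only.
type ruleTracker struct {
	installed map[string]bool
}

func newRuleTracker() *ruleTracker {
	return &ruleTracker{installed: map[string]bool{}}
}

// reconcile decides the action for one component:
//   - not enabled and rules present        -> remove
//   - enabled, ready and rules not present -> install
//   - anything else                        -> nothing to do
func (t *ruleTracker) reconcile(name string, enabled, ready bool) action {
	switch {
	case !enabled && t.installed[name]:
		delete(t.installed, name)
		return actionRemove
	case enabled && ready && !t.installed[name]:
		t.installed[name] = true
		return actionInstall
	default:
		return actionNone
	}
}

func main() {
	tr := newRuleTracker()
	// "example-component" is a placeholder name.
	fmt.Println(tr.reconcile("example-component", true, false))
	fmt.Println(tr.reconcile("example-component", true, true))
	fmt.Println(tr.reconcile("example-component", true, false))
	fmt.Println(tr.reconcile("example-component", false, false))
}
```

With this split, a component that temporarily loses readiness keeps its alerts, which is usually when those alerts matter most.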
I would also like to mention issue red-hat-data-services#316, which in my opinion was hidden by the change in timings.
71a8618 to 9db60ba
9db60ba to bcdd7c8
/test opendatahub-operator-e2e
controllers/services/monitoring/monitoring_controller_actions.go
Outdated
	return nil
}
dsc := &dscList.Items[0]
componentRules := map[string]string{
nit: this map can probably be moved to a global var
updated
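For illustration, moving a function-local map to package level might look like the sketch below. The keys and values are hypothetical placeholders (the real entries of `componentRules` are not shown in this thread), and `ruleFileFor` is an assumed helper name.

```go
package main

import "fmt"

// componentRules maps a component name to the name of its Prometheus rule file.
// It is declared once at package level instead of being rebuilt on every call.
// All entries are hypothetical placeholders.
var componentRules = map[string]string{
	"component-a": "component-a-rules.yaml",
	"component-b": "component-b-rules.yaml",
}

// ruleFileFor looks up the rule file for a component.
// The boolean is false when the component has no rules registered.
func ruleFileFor(component string) (string, bool) {
	file, ok := componentRules[component]
	return file, ok
}

func main() {
	if file, ok := ruleFileFor("component-a"); ok {
		fmt.Println("rule file:", file)
	}
	if _, ok := ruleFileFor("unknown"); !ok {
		fmt.Println("no rules registered for unknown")
	}
}
```

A package-level map is shared mutable state, so it is safest to treat it as read-only after initialization.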
}

m.Status.Phase = "Ready"
m.Status.ObservedGeneration = m.GetObjectMeta().GetGeneration()
ideally we should also have a Ready condition
Added a NotReady case: the phase is only set to Ready when the Monitoring Prometheus deployment is ready.
I meant a ready Condition, phase is not that much in use
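To make the difference between a phase and a condition concrete, here is a minimal sketch that records readiness both ways. Everything in it is an assumption for illustration: `MonitoringStatus`, `Condition`, `setCondition` and `markReadiness` are hypothetical local types and helpers (a real controller would more likely use `metav1.Condition` and `meta.SetStatusCondition`), and the phase and reason strings are placeholders.

```go
package main

import "fmt"

// Condition is a hypothetical, trimmed-down stand-in for metav1.Condition.
type Condition struct {
	Type   string
	Status string // "True" or "False"
	Reason string
}

// MonitoringStatus is a hypothetical status struct carrying both a phase and conditions.
type MonitoringStatus struct {
	Phase              string
	ObservedGeneration int64
	Conditions         []Condition
}

// setCondition updates the condition with the same Type, or appends it if absent.
func setCondition(conds []Condition, c Condition) []Condition {
	for i := range conds {
		if conds[i].Type == c.Type {
			conds[i] = c
			return conds
		}
	}
	return append(conds, c)
}

// markReadiness records readiness as a phase and as a "Ready" condition.
// The phase values and reasons below are placeholders, not the operator's constants.
func markReadiness(s *MonitoringStatus, generation int64, deploymentReady bool) {
	s.ObservedGeneration = generation
	if deploymentReady {
		s.Phase = "Ready"
		s.Conditions = setCondition(s.Conditions,
			Condition{Type: "Ready", Status: "True", Reason: "DeploymentReady"})
		return
	}
	s.Phase = "Not Ready"
	s.Conditions = setCondition(s.Conditions,
		Condition{Type: "Ready", Status: "False", Reason: "DeploymentNotReady"})
}

func main() {
	var st MonitoringStatus
	markReadiness(&st, 1, false)
	markReadiness(&st, 2, true)
	fmt.Printf("%+v\n", st)
}
```

Conditions carry a reason alongside the status, which is one motivation for preferring them over a single phase string.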
sorry, did not see this, let me get a new one next week for this comment
Signed-off-by: Wen Zhou <wenzhou@redhat.com>
96b545c to cab3a3a
Signed-off-by: Wen Zhou <wenzhou@redhat.com>
switch {
case err != nil:
	enabled = false
	if !k8serr.IsNotFound(err) {
nit: this could have been a switch case
same as here, will have the new commit to follow
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: zdtsw
Description
The scope of this PR is reduced to only watching and updating the Prometheus ConfigMap for components.
How Has This Been Tested?
Deploy Managed Instance:
Expect alerts only for components that are in Managed state