fix(metric_alerts): Fix bug where transitioning from critical -> warning can still trigger a resolve action #29345
Was investigating #27161, although I suspect that it's
already resolved since it's from back in July and we pushed out a fix in August.
I happened to figure out the error we see on our own alert in #team-engineering though. Just to
cover this issue again:
threshold again, but actually it triggers a resolve.
There are two parts to this issue.
thresholds will literally never fire or do anything. I have fixed this.
work. In our case, since critical has an action and warning doesn't, we end up just using the
critical action, which is tied to the resolve. This results in us showing a resolve notification.
But this could also happen if the user set up a critical action on Slack and a warning action on email.
To fix this, when we transition from critical -> warning I only include the incident trigger
associated with the active warning trigger in `fired_incident_triggers`, and leave the critical
trigger out. This means that if someone configures an alert with both warning and critical triggers,
but fails to set up an action on the warning trigger, then the only time the alert will fire is when
the critical alert initially fires.
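
A minimal sketch of the selection rule described above, not the actual diff. Only `fired_incident_triggers` is a name from this PR; `IncidentTrigger`, `select_fired_incident_triggers`, `collect_actions`, and the `warning_active` flag are hypothetical stand-ins used for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class IncidentTrigger:
    """Hypothetical stand-in for an incident trigger and its configured actions."""
    label: str
    actions: List[str] = field(default_factory=list)


def select_fired_incident_triggers(
    critical: IncidentTrigger,
    warning: Optional[IncidentTrigger],
    warning_active: bool,
) -> List[IncidentTrigger]:
    """Pick the triggers to hand to the action handler when critical stops firing."""
    if warning is not None and warning_active:
        # critical -> warning: include only the warning trigger, so the
        # critical trigger's action is not run as a resolve.
        return [warning]
    # No active warning trigger: this is a real resolve, so the critical
    # trigger's action is the one that should run.
    return [critical]


def collect_actions(fired_incident_triggers: List[IncidentTrigger]) -> List[str]:
    """Flatten the actions attached to the selected triggers."""
    return [a for t in fired_incident_triggers for a in t.actions]


critical = IncidentTrigger("critical", actions=["slack"])
warning = IncidentTrigger("warning")  # no action configured, as in our alert

to_warning = select_fired_incident_triggers(critical, warning, warning_active=True)
full_resolve = select_fired_incident_triggers(critical, warning, warning_active=False)
```

With a warning trigger that has no action, `collect_actions(to_warning)` is empty, so nothing is sent on critical -> warning, which matches the behaviour described above.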
I think this is the most correct behaviour, and realistically it doesn't make sense to set up a
threshold without an action. We should probably prevent alerts from being created like this in the UI.