-
Notifications
You must be signed in to change notification settings - Fork 2.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Prevent alerts from inhibiting themselves #1017
Conversation
We've already discussed this elsewhere and decided that this is not worth making an exception for. The desired effect can be achieved by an appropriate label layout. |
Not trying to be obnoxious, but I don't understand why #666 is still open in this case. I.e. if this is the intended behavior and it is documented as such. |
It's still open pending documentation to clarify that this is the behaviour. |
Well, AFAICT https://prometheus.io/docs/alerting/configuration/#<inhibit_rule> already has this note. |
Since there seem to be no takers, I'm closing this. :o) |
IIRC the outcome was that the desired effect cannot be achieved by an appropriate label layout. The stated reason to not prevent alerts from inhibiting itself was that the implementation would be too tricky. From the user's perspective, it does never make sense for an alert to inhibit itself. I would at least like to have one of the Alertmanager people check if this attempt pulls it off. @stuartnelson3 @brancz @fabxc |
I don't see that outcome, merely that one specific way of doing it doesn't work. Have you tried something along the lines of severity=page and severity=page-nodedown? |
Our intention is to not bloat our severity levels. |
The proposal to change the inhibition bloats the alertmanger config with knowledge of alertnames, and I expect will be more fragile overall to keep in sync. The expected way to use inhibition is via severity labels and equivalents, I'm generally against changing things merely because a user doesn't like the recommended way of doing things. |
For context, I realized after I submitted this PR that it is not a complete fix. E.g. if you have 2 ALERT metrics that match |
Isn't that the intended behavior? Could you give a full example (with the full inhibition rule) so that I can understand what's the problem with that? |
Off the top of my head, say you had NodeDownReasonFoo and NodeDownReasonBar which you wanted either to inhibit all other alerts from a machine. Even if self-inhibition was stopped, if both were firing for a node then both would get inhibited and you'd get no notifications. My gut feeling is that this isn't solvable in general, though I haven't thought through it fully. |
How about this take: Any alert that is matched by the source matchers of a given inhibition rules is automatically prevented from matching the target side of the same inhibition rule. |
At a first pass that'd seem to do the right thing, I'm still not convinced the setups it enables are a good idea though. |
An alert inhibiting so far has been unexpected and unintuitive behavior 100% of the time so far. I see little reason to keep it around just for the sake of it. If there's an intuitive and correct solution, I'd be in favor of it. |
I've taken @beorn7's suggestion and came up with a one-line change that prevents alerts that match both the source and target filters from inhibiting themselves. You can still set up multiple inhibition rules (such as A inhibits B, B inhibits A) that will essentially end up inhibiting all alerts, but you can't do it with a single '<inhibit_rule>' anymore. Progress. :o) |
Seems like a sound solution to me, but I've not been involved in the discussion so far, so I leave it to the others to decide. The circular inhibition issue already exists, so 🤷♀️ from me. |
I'll have a look ASAP… |
I do not think it means what you think it means. Sorry, I just had to take that. Low hanging fruit and all... :o) Any chance of looking at it though? I've added a comment to get CircleCI to retry so it's now a 2 line change, but only one is code. |
I know exactly what it means, sadly… |
I think this semantics makes sense. 👍 from my side. |
@stuartnelson3 I'm not sure if your last comment here including that emoji (purple angel?) means approval or not. If you agree with this change, please simply merge. |
Sorry for the confusion, it was supposed to be the shrugging emoji. I'll go ahead and merge, thanks for reviewing this. |
Whee! Thanks! |
Signed-off-by: Dan Fredell <Dan.Fredell@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.
This resolves #535.
Simply checks for the fingerprint of the potentially inhibiting alert not being equal to the inhibited alert.