Prometheus stops firing but alerts stay active #815
Prometheus built from current master:
Alertmanager built from current master:
Prometheus config:
Alertmanager config:
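For illustration only, a minimal pair of configs matching this description might look like the sketch below; every name, port, and interval here is an assumption, not the actual configuration from this report:

```yaml
# prometheus.yml (sketch): point Prometheus at a local Alertmanager
# and load one alerting rule file.
alerting:
  alertmanagers:
    - static_configs:
        - targets: ['localhost:9093']
rule_files:
  - alerts.yml
---
# alertmanager.yml (sketch): route everything to a local webhook receiver.
route:
  receiver: webhook
  group_wait: 10s
  group_interval: 5m
  repeat_interval: 1h
receivers:
  - name: webhook
    webhook_configs:
      - url: 'http://localhost:5001/'
```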
After the last step, the alert clears in Prometheus, but stays visible on the Alertmanager UI even after 20 minutes. The webhook notification also keeps being re-sent endlessly, but with a resolved status.
The display issue is likely unrelated, as we simply forgot to filter out resolved alerts in the new API endpoint, so they remain visible until they are garbage collected. The re-sending might be related to the fixes we already made for 0.6. If it's only occurring in 0.7 I'd be surprised. Any chance you can verify that?
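In code terms, the missing step amounts to something like the sketch below; the Alert type and function here are hypothetical stand-ins, not Alertmanager's actual internals:

```go
package main

import (
	"fmt"
	"time"
)

// Alert is a hypothetical stand-in for Alertmanager's internal alert type.
type Alert struct {
	Name   string
	EndsAt time.Time
}

// filterActive drops alerts that have already resolved (EndsAt in the past)
// before returning them from an API endpoint, i.e. the filtering step
// described above as forgotten. Sketch only, not Alertmanager's code.
func filterActive(alerts []*Alert, now time.Time) []*Alert {
	active := make([]*Alert, 0, len(alerts))
	for _, a := range alerts {
		if a.EndsAt.IsZero() || a.EndsAt.After(now) {
			active = append(active, a)
		}
	}
	return active
}

func main() {
	now := time.Now()
	alerts := []*Alert{
		{Name: "firing"},                                  // zero EndsAt: still active
		{Name: "resolved", EndsAt: now.Add(-time.Minute)}, // resolved a minute ago
	}
	for _, a := range filterActive(alerts, now) {
		fmt.Println(a.Name) // prints only "firing"
	}
}
```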
Checking again now with 0.6.
So actually, part of the weird behavior I got earlier from master came from the tool I was using as a webhook receiver, which didn't respond properly. Now I'm using a small Go web server that just responds with an HTTP 200. This is from current master:
So in current master the only thing that's clearly a bug is the UI! And that is also the only behavioral difference in 0.6.2: there, the alert disappears from the UI immediately after being resolved in Prometheus.
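A receiver of that kind is only a few lines of Go. The following is a sketch under the assumption that it merely logs each payload and returns a 200; the port is made up, and this is not necessarily the exact server used above:

```go
package main

import (
	"io"
	"log"
	"net/http"
)

func main() {
	// Log each webhook payload from Alertmanager and acknowledge with 200 OK.
	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		body, err := io.ReadAll(r.Body)
		if err != nil {
			http.Error(w, err.Error(), http.StatusBadRequest)
			return
		}
		log.Printf("received notification: %s", body)
		w.WriteHeader(http.StatusOK)
	})
	log.Fatal(http.ListenAndServe(":5001", nil)) // port 5001 is an assumption
}
```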
That's correct, otherwise we'd spam users as each individual resolution came in. Resolved notifications need to obey group_interval.
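In the route configuration, that pacing comes from group_interval; here is a sketch with illustrative values (only the 5m figure is taken from the quoted question below):

```yaml
route:
  receiver: team-slack   # hypothetical receiver name
  group_by: ['alertname']
  group_wait: 30s        # delay before the first notification of a new group
  group_interval: 5m     # minimum gap between notifications for one group,
                         # including the batch announcing resolved alerts
  repeat_interval: 4h    # re-send still-firing alerts after this long
```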
Okay, what you describe is the expected behavior. Curious now, though, about the cases where people report resolved notifications being spammed endlessly into Slack.
For the UI, there's already a PR open with a fix.
…On Fri, May 26, 2017 at 3:20 PM Brian Brazil wrote:
(I assume that is even intentional because group_interval: 5m also applies
when all alerts in a group are resolved?)
That's correct, otherwise we'd spam users as each individual resolution
came in. Resolved notifications need to obey group_interval.
We did have bugs fixed in 0.6 around resolved notifications, so it could be those.
#820 indeed fixes the UI issues for me. So from my side, I cannot reproduce this issue anymore. Should we close it and see if someone else reports similar issues again?
Fine with me.
@juliusv I'm seeing the repeating "resolved" notification firing into Slack right now, with two different Prometheus installations. What information can I give you to try to nail this down? Once the alert resolves, I see nothing in either the Prometheus or Alertmanager UIs. The following is the Alertmanager log in debug mode for the repeating notifications.
I've just tried with an Alertmanager built from master and observe the same behaviour. The offending alert, once resolved, sends a "resolved" notification every repeat_interval. Route and receivers config:
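The config block itself is not reproduced here; a hypothetical route/receivers pair that would show the described cadence (all names, URLs, and intervals are assumptions) could look like:

```yaml
route:
  receiver: slack
  group_by: ['alertname']
  group_interval: 5m
  repeat_interval: 1h   # the resolved alert re-notifies on this cadence
receivers:
  - name: slack
    slack_configs:
      - api_url: 'https://hooks.slack.com/services/...'  # placeholder
        channel: '#alerts'
        send_resolved: true
```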
I want to get an alert in Slack or email when a Kubernetes pod goes down.
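For reference, the Prometheus side of that would be an alerting rule; the sketch below is purely illustrative (the metric, job label, and thresholds depend entirely on your setup), and routing to Slack or email is then configured through Alertmanager receivers:

```yaml
groups:
  - name: kubernetes
    rules:
      - alert: PodDown
        expr: up{job="kubernetes-pods"} == 0   # hypothetical job name
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: 'Target {{ $labels.instance }} has been down for 5 minutes'
```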
@bala0409 Thanks for your interest. It looks as if this is actually a question about usage and not development. To make your question, and all replies, easier to find, we suggest you move this over to our user mailing list, which you can also search. If you prefer more interactive help, join our IRC channel, #prometheus on irc.freenode.net. Please be aware that our IRC channel has no logs, is not searchable, and that people might not answer quickly if they are busy or asleep. If in doubt, you should choose the mailing list. If you think this is not purely a support question, feel free to comment here or take the underlying issues to our developer mailing list.
This issue was reported by Jack and Julius here: