Wrong receiver #1824
Comments
All data comes from the Alertmanager API.
Each alert will appear separately in every receiver it goes to, as each receiver can have a unique grouping configuration. There's nothing unexpected here.
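A hypothetical sketch of how that plays out (receiver names below are made up, not taken from this setup): when a route uses `continue: true`, the same alert is delivered to every matching receiver, and karma shows one copy per receiver, grouped according to that route's `group_by`.

```yaml
route:
  receiver: catch-all              # hypothetical default receiver
  routes:
    - match:
        team: Team_Kafka
      receiver: audit-log          # hypothetical; receives a copy of every Team_Kafka alert
      group_by: ['alertname']
      continue: true               # keep evaluating the routes below
    - match:
        team: Team_Kafka
      receiver: team-kafka         # the same alert is also delivered here
      group_by: ['job', 'location']
# In karma this single alert shows up twice: once under @receiver=audit-log
# (grouped by alertname) and once under @receiver=team-kafka (grouped by
# job and location).
```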
I see, however when I tried avoiding
I'm not sure if I follow. You just pasted
Ok. Sorry for the poor description.
Can you see those alerts with
Yes, the same alert with the same receiver should be visible in karma. Is there any mismatch between what the alertmanager UI shows and what the karma UI shows for that receiver?
Yes. I received this alert only over
Can you share the karma configuration and the alertmanager routing tree? Are there any errors in the karma logs? What alertmanager version are you using?
FYI, there were some bugs fixed in alertmanager 0.19, in case you are running an older version (#812).
Karma v0.63:
---
alertmanager:
servers:
- name: fr5
uri: http://server-1:9093
timeout: 5m
proxy: true
cors:
credentials: omit
- name: fr4
uri: http://server-2:9093
timeout: 5m
proxy: true
cors:
credentials: omit
alertAcknowledgement:
enabled: true
duration: 15m0s
author: Kafka Karma
commentPrefix: working on it
filters:
default:
- team="Team_Kafka"
karma:
name: Kafka Karma
annotations:
default:
hidden: false
order:
- alertname
- location
grid:
sorting:
order: startsAt
label: locationId
receivers:
keep:
# - team-kafka-wake-up
strip:
# - opsgenie
labels:
color:
static:
- alertname
unique:
- "@receiver"
- instance
ui:
alertsPerGroup: 10
collapseGroups: collapsed
multiGridLabel: location
log:
config: true
level: debug
format: text
timestamp: true
Alertmanager v0.17:
route:
group_wait: 30s
group_interval: 5m
repeat_interval: 1h
receiver: blackhole-slack
routes:
- group_by: ["..."]
match_re:
alertname: ".+"
match:
alerttype: "platform_host"
receiver: opsgenie-platform-host
- group_by: ["..."]
match_re:
alertname: ".+"
match:
alerttype: "platform_service"
receiver: opsgenie-platform-service
- group_by: ["..."]
match_re:
alertname: ".+"
receiver: opsgenie
continue: true
- match:
team: Team_1
group_by: ['service', 'location']
routes:
- match:
service_receiver: "true"
receiver: service-email
continue: false
- match:
severity: critical
receiver: platform-monitoring
repeat_interval: 4h
- match:
severity: warning
receiver: platform-notifs
- match:
team: Team_2
receiver: team-2
group_by: ['service', 'location']
- match:
team: Team_3
receiver: team-3
group_by: ['job', 'location']
- match:
team: Team_4
receiver: team-4
group_by: ['job', 'location']
- match:
team: Team_5
receiver: team-5
group_by: ['job', 'location']
- match:
team: Team_Kafka
receiver: team-kafka
group_by: ['job', 'location']
routes:
- match:
location: io
receiver: team-kafka-io
- match:
severity: page
receiver: team-kafka-wake-up
- match:
severity: critical
receiver: team-kafka-wake-up
- match:
team: Team_6
location: io
receiver: team-6
group_by: ['service', 'location']
- match:
team: Team_7
receiver: team-7
group_by: ['service', 'location']
- match:
team: Team_8
receiver: team_8
group_by: ['service', 'location']
- match:
team: Team_9
receiver: team_9
group_by: ['service', 'location']
- match:
team: Team_10
receiver: team-10
group_by: ['service', 'location']
- match:
team: Team_11
receiver: team-11
group_by: ['service', 'location']
- match:
team: Team_12
receiver: team-12
group_by: ['job']
routes:
- match:
alert_receiver: internal
receiver: team-12
- match:
alert_receiver: team-12-ss7
receiver: team-12-ss7
continue: true
- receiver: team-12-1
continue: true
- receiver: core-12-2
- match:
team: Team_13
receiver: team-13
group_by: ['job', 'location']
- match:
team: Team_14
type: warn
location: io
receiver: team-14-warning
group_by: ['job', 'location']
- match:
team: Team_14
type: suspend
location: io
receiver: team-14-suspend
group_by: ['job', 'location']
- match:
team: Team_14
type: warn
location: fr
receiver: team-14-fr-warning
group_by: ['job', 'location']
- match:
team: Team_14
type: suspend
location: fr
receiver: team-14-fr-suspend
group_by: ['job', 'location']
- match:
team: Team_14
receiver: team-14
group_by: ['job', 'location']
- match:
team: Team_15
group_wait: 1s
receiver: team-15
repeat_interval: 30m
group_by: ['job', 'location']
- match:
team: Team_16
receiver: team-16
group_by: ['service', 'location']
- match:
team: Team_17
receiver: team-17
repeat_interval: 30m
group_by: ['job', 'location']
- match:
team: Team_18
receiver: team-18
repeat_interval: 30m
group_by: ['service', 'location']
- match:
team: Team_19
receiver: team-19
repeat_interval: 30m
group_by: ['service', 'location']
- match:
team: Team_20
receiver: team-20
group_by: ['job', 'location']
- match:
team: Team_21
severity: critical
receiver: team-21-mail
group_by: ['service', 'location']
repeat_interval: 600m
- match:
team: Team_22
receiver: team-22
repeat_interval: 30m
group_by: ['service', 'location']
- match:
team: centili-support
receiver: centili-support
repeat_interval: 30m
group_by: ['job']
- match:
team: Team_23
receiver: team-23
repeat_interval: 30m
group_by: ['service', 'location']
- match:
alertname: prometheus_dead_mans_switch
receiver: prometheus-heartbeat
group_wait: 30s
group_interval: 30s
repeat_interval: 30s
group_by: ["alertname"]
- match:
team: Team_24
receiver: team-24
repeat_interval: 30m
group_by: ['service', 'location']
- match:
team: Team_25
receiver: team-25-mail
group_by: ['location']
- match:
team: Team_26
receiver: team-26
group_by: ['location']
- match:
team: Team_27
receiver: team-27
repeat_interval: 60m
group_by: ['alertname', 'instance', 'service', 'location']
- match:
team: Team_28
receiver: team-28
group_by: ['service', 'location']
- match:
team: Team_29
receiver: Team-29
repeat_interval: 60m
group_by: ['alertname', 'instance', 'service', 'location']
routes:
- match_re:
severity: warning|critical
receiver: core-performance-voice
continue: true
- match:
team: Team_30
receiver: team-30
repeat_interval: 8760h
group_by: ['location']
- match:
team: Team_31
receiver: team-31
repeat_interval: 60m
group_by: ['location']
- match:
team: Team_32
receiver: team-32
repeat_interval: 8737h
group_by: ['alertname','service', 'location', 'job', 'number']
- match:
team: Team_33
receiver: team-33
group_by: ['service', 'location']
repeat_interval: 60m
- match:
team: Team_34
receiver: team-34
group_by: ['service', 'location']
- match:
team: Team_35
receiver: team-35
group_by: ['service', 'location']
- match:
team: Team_36
receiver: team-36
repeat_interval: 30m
group_by: ['service', 'location']
- match:
team: Team_37
receiver: team-37
group_by: ['instance', 'service', 'location']
- match:
team: Team_38
receiver: team-38
group_by: ['service', 'client', 'location']
- match:
team: Team_39
receiver: team-39
group_by: ['location']
I don't see any errors in the log:
time="2020-06-11T13:30:16Z" level=info msg="[fr4] Upstream version: 0.17.0"
time="2020-06-11T13:30:16Z" level=warning msg="Alertmanager 0.17.0 might return incomplete list of alert groups in the API, please upgrade to >=0.19.0, see https://github.com/prymitive/karma/issues/812"
time="2020-06-11T13:30:16Z" level=info msg="[fr5] Got 66 silences(s) in 6.847856ms"
time="2020-06-11T13:30:16Z" level=info msg="[fr5] Detecting ticket links in silences (66)"
time="2020-06-11T13:30:16Z" level=info msg="[fr4] Got 66 silences(s) in 8.033665ms"
time="2020-06-11T13:30:16Z" level=info msg="[fr4] Detecting ticket links in silences (66)"
time="2020-06-11T13:30:16Z" level=info msg="[fr5] Got 1154 alert group(s) in 280.569968ms"
time="2020-06-11T13:30:16Z" level=info msg="[fr5] Deduplicating alert groups (1154)"
time="2020-06-11T13:30:16Z" level=info msg="[fr5] Processing unique alert groups (1139)"
time="2020-06-11T13:30:16Z" level=info msg="[fr5] Merging autocomplete data (4538)"
time="2020-06-11T13:30:17Z" level=info msg="[fr4] Got 1066 alert group(s) in 885.071458ms"
time="2020-06-11T13:30:17Z" level=info msg="[fr4] Deduplicating alert groups (1066)"
time="2020-06-11T13:30:17Z" level=info msg="[fr4] Processing unique alert groups (1052)"
time="2020-06-11T13:30:17Z" level=info msg="[fr4] Merging autocomplete data (4406)"
time="2020-06-11T13:30:17Z" level=info msg="Pull completed"
time="2020-06-11T13:30:17Z" level=info msg="Done, starting HTTP server"
time="2020-06-11T13:30:17Z" level=info msg="Listening on 0.0.0.0:80"
time="2020-06-11T13:30:22Z" level=debug msg="Compressed 9007 bytes to 2405 bytes (26.70%)"
time="2020-06-11T13:30:22Z" level=info msg="[10.0.0.1 MIS] <200> GET /alerts.json?&gridLabel=location&gridSortReverse=0&sortOrder=&sortLabel=&sortReverse=&q=%40receiver%3Dteam-kafka-wake-up took 26.304313ms"
time="2020-06-11T13:30:25Z" level=debug msg="Compressed 2185257 bytes to 126630 bytes (5.79%)"
time="2020-06-11T13:30:25Z" level=info msg="[10.0.0.1 MIS] <200> GET /alerts.json?&gridLabel=location&gridSortReverse=0&sortOrder=&sortLabel=&sortReverse=& took 405.209777ms"
time="2020-06-11T13:30:35Z" level=debug msg="Compressed 30889 bytes to 3491 bytes (11.30%)"
time="2020-06-11T13:30:35Z" level=info msg="[10.0.0.1 MIS] <200> GET /alerts.json?&gridLabel=location&gridSortReverse=0&sortOrder=&sortLabel=&sortReverse=&q=team%3DTeam_Kafka took 26.481914ms"
time="2020-06-11T13:31:17Z" level=info msg="Pulling latest alerts and silences from Alertmanager"
time="2020-06-11T13:31:17Z" level=info msg="[fr5] Collecting alerts and silences"
time="2020-06-11T13:31:17Z" level=info msg="GET http://server-1:9093/metrics timeout=5m0s"
time="2020-06-11T13:31:17Z" level=info msg="[fr4] Collecting alerts and silences"
time="2020-06-11T13:31:17Z" level=info msg="GET http://server-2:9093/metrics timeout=5m0s"
time="2020-06-11T13:31:17Z" level=info msg="[fr4] Upstream version: 0.17.0"
time="2020-06-11T13:31:17Z" level=warning msg="Alertmanager 0.17.0 might return incomplete list of alert groups in the API, please upgrade to >=0.19.0, see https://github.com/prymitive/karma/issues/812"
time="2020-06-11T13:31:17Z" level=info msg="[fr5] Upstream version: 0.17.0"
time="2020-06-11T13:31:17Z" level=warning msg="Alertmanager 0.17.0 might return incomplete list of alert groups in the API, please upgrade to >=0.19.0, see https://github.com/prymitive/karma/issues/812"
time="2020-06-11T13:31:17Z" level=info msg="[fr4] Got 66 silences(s) in 8.54807ms"
time="2020-06-11T13:31:17Z" level=info msg="[fr4] Detecting ticket links in silences (66)"
time="2020-06-11T13:31:17Z" level=info msg="[fr5] Got 66 silences(s) in 9.957681ms"
time="2020-06-11T13:31:17Z" level=info msg="[fr5] Detecting ticket links in silences (66)"
time="2020-06-11T13:31:17Z" level=info msg="[fr5] Got 1153 alert group(s) in 240.502656ms"
time="2020-06-11T13:31:17Z" level=info msg="[fr5] Deduplicating alert groups (1153)"
time="2020-06-11T13:31:17Z" level=info msg="[fr5] Processing unique alert groups (1138)"
time="2020-06-11T13:31:17Z" level=info msg="[fr5] Merging autocomplete data (4546)"
time="2020-06-11T13:31:18Z" level=info msg="[fr4] Got 1065 alert group(s) in 810.822695ms"
time="2020-06-11T13:31:18Z" level=info msg="[fr4] Deduplicating alert groups (1065)"
time="2020-06-11T13:31:18Z" level=info msg="[fr4] Processing unique alert groups (1051)"
time="2020-06-11T13:31:18Z" level=info msg="[fr4] Merging autocomplete data (4418)"
time="2020-06-11T13:31:18Z" level=info msg="Pull completed"
time="2020-06-11T13:32:17Z" level=info msg="Pulling latest alerts and silences from Alertmanager"
time="2020-06-11T13:32:17Z" level=info msg="[fr5] Collecting alerts and silences"
time="2020-06-11T13:32:17Z" level=info msg="GET http://server-1:9093/metrics timeout=5m0s"
time="2020-06-11T13:32:17Z" level=info msg="[fr4] Collecting alerts and silences"
time="2020-06-11T13:32:17Z" level=info msg="GET http://server-2:9093/metrics timeout=5m0s"
time="2020-06-11T13:32:17Z" level=info msg="[fr4] Upstream version: 0.17.0"
time="2020-06-11T13:32:17Z" level=warning msg="Alertmanager 0.17.0 might return incomplete list of alert groups in the API, please upgrade to >=0.19.0, see https://github.com/prymitive/karma/issues/812"
time="2020-06-11T13:32:17Z" level=info msg="[fr5] Upstream version: 0.17.0"
time="2020-06-11T13:32:17Z" level=warning msg="Alertmanager 0.17.0 might return incomplete list of alert groups in the API, please upgrade to >=0.19.0, see https://github.com/prymitive/karma/issues/812" |
Do you mean that my bug with receivers depends on the Alertmanager version?
There was a bug in Alertmanager <0.19.0 where the API wouldn't return all receivers, see prometheus/alertmanager#1959.
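If I read the routing tree above correctly, an alert with team=Team_Kafka and severity=critical should reach both opsgenie (via the `continue: true` catch-all route) and team-kafka-wake-up, so the groups API should list it under both receivers. Here is a rough sketch of what karma expects from GET /api/v2/alerts/groups under that assumption; the API returns JSON, shown here as YAML for readability, and the label values are invented.

```yaml
# One entry per (receiver, group) pair; field names follow the Alertmanager v2
# API, label values are invented.
- receiver:
    name: opsgenie
  labels:                          # group_by: ["..."] groups by every label,
    alertname: KafkaBrokerDown     # so the group carries the full label set
    team: Team_Kafka
    severity: critical
    job: kafka
    location: io
  alerts:
    - labels:
        alertname: KafkaBrokerDown
        team: Team_Kafka
        severity: critical
        job: kafka
        location: io
- receiver:
    name: team-kafka-wake-up
  labels:                          # this branch groups by ['job', 'location']
    job: kafka
    location: io
  alerts:
    - labels:
        alertname: KafkaBrokerDown
        team: Team_Kafka
        severity: critical
        job: kafka
        location: io
# With the pre-0.19.0 bug one of these entries could be missing from the
# response, which would explain the alert showing up under the wrong
# receiver in karma.
```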
I see. Since I don't have permission to update Alertmanager in our company, I'll have to wait =(
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
Hi. I've got a strange situation.
I see the alert in Karma with a receiver it shouldn't have; I expect to see team-kafka-wake-up there.