Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

#3121 breaks Slack integration targeting RocketChat servers #3546

Closed
ae-govau opened this issue Oct 5, 2023 · 6 comments
Closed

#3121 breaks Slack integration targeting RocketChat servers #3546

ae-govau opened this issue Oct 5, 2023 · 6 comments

Comments

@ae-govau
Copy link

ae-govau commented Oct 5, 2023

What did you do?

We use the Slack integration to send notifications to a RocketChat server, which is reasonably compatible with Slack for this purpose. This has worked well for 4+ years, however after our last update we keep getting an ever increasing list of "Resolved" alerts every 5 minutes.

What did you expect to see?

We expect to see an alert posted, then a resolved posted, then silence.

What did you see instead? Under which circumstances?

We see the alert posted, then the resolved posted over and over again (every 5 minutes) until the AlertManager instance is restarted.

Environment

We believe that change #3121 triggered this condition. That change checks a 200 status code to ensure if contains "ok": true (previously 200 meant OK without further checking).

Our RocketChat instances returns a different message on success:

$ curl ...
< HTTP/1.1 200 OK
< Content-Type: application/json
...
{"success":true}

We believe that this new code identifies this (incorrectly) as failed POST and thus doesn't remove the now resolved alert from its list, and dutifully tries again every 5 minutes for eternity.

(Credit to @bg-govau for pin-pointing the issue. PR forthcoming.)

@roidelapluie
Copy link
Member

This should be fixed in rocketchat, not alertmanager

@ae-govau
Copy link
Author

ae-govau commented Oct 5, 2023

I appreciate that view, but it's also worked just fine for the past 4+ years that we've been using it. It's the recent change to alertmanager which broke the behaviour, not the other way around (or I would be opening a PR there).

This is a very small change, we'd appreciate if you'd consider accepting it.

@grobinson-grafana
Copy link
Contributor

I'm afraid I agree with Julien here. Keep in mind this is a Slack integration. I appreciate that this has worked for you for the past 4 years, but that's purely coincidental rather than intentional.

I also disagree that Alertmanager should change its Slack integration to support both Slack and RocketChat. It's not meant to be a multi-vendor integration. If RocketChat want's to provide a Slack-compatible API then that's great, but it needs to be 100% compatible, including in how it sends responses.

@roidelapluie
Copy link
Member

I am also happy to support native rocketchat integration as we are using it internally at my company.

@TheMeier
Copy link
Contributor

TheMeier commented Nov 2, 2023

I am very keen on getting native rocketchat support, so I started some implementation. This is nowhere near finished. Especially documentation and tests are completely missing still. But it is functional.
But before I spend more time on this I wanted to ask some questions @roidelapluie

Is it ok to use the rocketchat SDK?
It looks quite different then most of the other integrations, bringing it's own rest-client, especially not supporting httpOpts
I needed to import the SDK models into config/notifiers.go, I wonder if that is acceptable.

Since the rocketchat SDK defines field.Short as bool instead of *bool there is no real nice way of checking if it was actually set.

Most integrations allow to configure secrets directly or via a file to load the secret from, I that a requirement?

This is what I currently have: https://github.com/prometheus/alertmanager/compare/main...TheMeier:alertmanager:rocketchat?expand=1
If you prefer I can open a draft pull request. But as I said this is nowhere near done.

@gotjosh
Copy link
Member

gotjosh commented Oct 23, 2024

Done with #3600

@gotjosh gotjosh closed this as completed Oct 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants