[Monitoring] Investigate/discuss the consequences of enabling alert management from within the actions UI #92128
Comments
Pinging @elastic/stack-monitoring (Team:Monitoring)
As discussed, we will only be focusing on the first issue (creating/modifying/notifying new alert instances). I will list the following things that we will need in order to make it work:
I think the UI/UX (no. 3) will be the most complex to figure out, and might take some discussion around what the flow/experience should be. We will also need "smart" logic to only show the most critical alerts to reduce dups/noise. 1 and 2 are pretty trivial and shouldn't take too long to implement/test. That said, I think we should get 1 & 2 merged in as soon as possible and then focus on the UI/UX part, since we can sneak it into other minors (if need be) as an enhancement. cc: @ravikesarwani @chrisronline @jasonrhodes feel free to discuss/add to the list/points mentioned
I have concluded that we should revert alert management (for now). I was testing this locally and it was working at one point; however, when I pointed it to the edge environment I got different results, mainly alerts going into active states and becoming sticky if modified/created from the Stack Management UI. There's also a race condition with This is too risky to try to merge, and still needs a little more investigation.
That is unfortunate. I guess we can revert but continue to work on it to see if we can fix the issues we have found (alerts going into active states and becoming sticky if modified/created from the Stack Management UI).
I'm interested in getting more info on this issue, since it does seem to be confusing to customers. I'm hoping we can find some functionality in the alerting framework to help you out. One capability we recently added is being able to do flattened (keyword) searches over parameters. This kinda solves an open issue we had to allow solutions to add some data to an alert they could later use for their own search purposes. SIEM needs to do this, and added their own generated entries to the What you can do now is create a new parameter, which isn't exposed in the UI, and should probably be Then at least you can distinguish between solution-created and alert-management-page-created (or HTTP API created) alerts. I'm not quite sure why you need to distinguish these though, especially since it appears you want to allow multiple alerts of the same alert type to co-exist. This bit seems really hard:
We don't have anything to help with coordinating actions between independent alerts. I have no idea how you'd easily be able to only show the "highest" trigger point here without a second layer of persistence that wrote all the values out, and then another alert that would read those values and only send the highest, based on some time window. Super-complicated. And there are certainly use cases where you'd want all the alerts to fire anyway: each individual alert might have an action that sends emails to different people, different Slack channels, etc. We do have a notion of action groups, so you could define 3 "levels" as different action groups, each with an associated, monotonically increasing threshold. Call them
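To make the action-group suggestion above concrete, here is a minimal sketch of picking the single highest level whose threshold a value has crossed. It is not the alerting framework's API: the `ActionGroup` shape, the group names (`low`, `medium`, `high`), and the threshold values are all assumptions made up for illustration.

```typescript
// Hypothetical shape: one action group per "level", each with its own threshold.
interface ActionGroup {
  id: string;
  threshold: number;
}

// Assumed names and values, purely illustrative.
const GROUPS: ActionGroup[] = [
  { id: 'low', threshold: 70 },
  { id: 'medium', threshold: 80 },
  { id: 'high', threshold: 90 },
];

// Return the id of the highest group whose threshold the value meets,
// or undefined when the value is below every threshold.
function pickActionGroup(value: number, groups: ActionGroup[]): string | undefined {
  // Sort a copy so the thresholds are visited in increasing order.
  const sorted = [...groups].sort((a, b) => a.threshold - b.threshold);
  let chosen: string | undefined;
  for (const group of sorted) {
    if (value >= group.threshold) {
      chosen = group.id;
    }
  }
  return chosen;
}
```

With one alert scheduling only the chosen group, a single notification goes out per check instead of one per threshold, which avoids needing a second layer of persistence to find the "highest" trigger.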
@igoristic let's schedule a chat so you can show me exactly what broke here and we can figure out a way forward -- I'll schedule
Alerts in Stack Monitoring weren't designed to be managed outside of the SM app. There are a few problems I noticed when enabling alert management from the actions menu in Stack Management. We had to do this in order to accommodate the use cases described here: #91145
Main issues are:
We search by alert.alertTypeId and only use the first item from api/alerts/_find, which means we ignore any other instances that might be created.
Possible solution: Go through all the alerts when searching by alertTypeId and create instances prefixed with the alert.id instead.
Possible solution: I think we should implement the ability to distinguish "manageable" alerts (created/managed in the actions UI) from "preconfigured" alerts (managed by the SM UI).
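The two possible solutions above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the Stack Monitoring implementation: the `FoundAlert` shape is a simplified stand-in for what a find call returns, and the hidden marker parameter name `createdBySM` is hypothetical.

```typescript
// Simplified, hypothetical shape of an alert returned by a find call.
interface FoundAlert {
  id: string;
  alertTypeId: string;
  params: Record<string, unknown>;
}

type Origin = 'preconfigured' | 'manageable';

// Assumption: SM stamps a hidden marker parameter on the alerts it creates.
const MARKER_PARAM = 'createdBySM'; // hypothetical name

// Classify an alert by whether it carries the marker parameter.
function originOf(alert: FoundAlert): Origin {
  return alert.params[MARKER_PARAM] === true ? 'preconfigured' : 'manageable';
}

// Keep every alert of each type instead of only the first search hit.
function groupByType(alerts: FoundAlert[]): Map<string, FoundAlert[]> {
  const byType = new Map<string, FoundAlert[]>();
  for (const alert of alerts) {
    const existing = byType.get(alert.alertTypeId) ?? [];
    existing.push(alert);
    byType.set(alert.alertTypeId, existing);
  }
  return byType;
}

// Prefix an instance name with the owning alert's id so that two alerts
// of the same type never share an instance key.
function instanceKey(alert: FoundAlert, instanceName: string): string {
  return `${alert.id}:${instanceName}`;
}
```

Grouping by type keeps the existing lookup by `alertTypeId` intact, while the per-alert prefix and the origin check let the UI treat each alert separately.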
I think we should solve these issues as soon as possible, so they can make it into one of the 7.12.x releases.