-
Notifications
You must be signed in to change notification settings - Fork 2.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: added routing tools to amtool #1511
Conversation
if we go that path, would a --debug option help too to see the routes taken by the labels? |
Hmm that sounds reasonable, but the issue is routes does not have any names so I'm not sure how should the output look like. |
also there is possibility to use something like https://github.com/xlab/treeprint |
Thanks for looking at this! I think the treeprint suggestion would be pretty cool, personally. I think there's a huge benefit to having this, and being able to do it locally is something a lot of people would like. Although the tree routing visualizer on the website does everything in the browser and doesn't store or transmit any of the entered data, people are still cautious about copying and pasting their information into a website. The functionality so far looks cool, but it would also be helpful to add something like |
I'm also thinking where this should live .. maybe it should be under |
Ok, no problem. I'll be happy to add the tree visualization option. I like the
and
? Thanks for comments |
for a rough UI how about:
We've been calling it a |
3764668
to
977b35d
Compare
Ok I have a new version according to your suggestion @stuartnelson3 I added to vendor the https://github.com/xlab/treeprint library The tree visualization could be better but the library is simple and works quite well. Output looks like this: $ ./amtool config routes --config.file test.yml
Routing tree:
.
└── {} continue: false receiver: blackhole
├── {service="api"} continue: true receiver: slack
│ ├── {severity="critical"} continue: true receiver: pagerduty
│ └── {severity="low"} continue: true receiver: mattermost
├── {service="database"} continue: true receiver: pagerduty
└── {service=~"^(?:.*)$"} continue: true receiver: email
└── {service=~"^(?:.*)$"} continue: true receiver: email
└── {service=~"^(?:.*)$"} continue: false receiver: email The tree output in test mode omits $ ./amtool config routes test --tree --config.file test.yml service=api severity=critical
Matching tree:
.
└── {}
├── {service="api"}
│ └── {severity="critical"} receiver: pagerduty
└── {service=~"^(?:.*)$"}
└── {service=~"^(?:.*)$"}
└── {service=~"^(?:.*)$"} receiver: email
pagerduty,email |
Cool, will take a look! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is SUPER cool, thanks for all the effort!
I wrote a few comments in line.
One thing that sticks out to me is how the default route will always just be {}
. Maybe we should make that into a special case and come up with something to print that indicates that this is the default, without the somewhat cryptic {}
that might confuse people who are unfamiliar with routing.
cli/routing.go
Outdated
|
||
func checkRoutingConfigInputFlags(alertmanagerURL *url.URL, configFile string) { | ||
if alertmanagerURL != nil && configFile != "" { | ||
kingpin.Fatalf("You can't use both --config.file and --alertmanager.url at the same time.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a config file stored locally which points to an alertmanager URL, which prevents me from looking at local changes via config file.
# ./amtool config routes --config.file=examples/ha/alertmanager.yml
amtool: error: You can't use both --config.file and --alertmanager.url at the same time.
I'm not sure how to check for it, but if both flags are provided then it makes sense to either warn or bail, otherwise I think the --config.file
flag should take precedence, since it can only be passed via command line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ouch.. I didn't think of this case.
to assume precedence of --config.file
seems reasonable but I'll take a look if it would be possible to check if the alertmanager.url
flag was set from config, than I could still bail on using both flags and just warn on overriding the config.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I looked at it more and I thing it's not worth the complexity to check if the flag was set from the config file. I'd than need to check if the value of alertmanager.url
changed from the default value and it would become hard to understand I think.
For now I'll stick to your suggestion and just print warning that --config.file
overrides the alertmanager-url
flag.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
cli/routing.go
Outdated
} | ||
|
||
func convertRouteToTree(route *dispatch.Route, tree treeprint.Tree) { | ||
routeDecription := fmt.Sprintf("%s continue: %t receiver: %s", route.Matchers.String(), route.Continue, route.RouteOpts.Receiver) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I run this against SoundCloud's config (which has many many routes), it ends up being fairly noisy. I'm willing to bet most people don't ever use continue
, so how about we only mention continue if its value is set to true?
Also, I'm thinking about maybe only showing the receiver when a user uses show
. When using test
, the matches get written out at the bottom so maybe it's not necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding the continue
, I definitely agree that the output gets bit messy, but I thought it changes behavior of the matching so much that I'd better state it there.
On the other hand the default is false
so we can omit the default value. Good point I'll suppress it.
About the test
and showing a receiver in the tree. I think it doesn't hurt and it's easier to see from which branch comes which receiver (even though it matches all leafs top-down to receivers left-to-right). But if you want I can remove it.
Could you possibly share how the output looks like for such complex configuration? Just out of curiosity I was lazy to write more complex routing tree :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
About the test and showing a receiver in the tree. I think it doesn't hurt and it's easier to see from which branch comes which receiver (even though it matches all leafs top-down to receivers left-to-right). But if you want I can remove it.
We can leave it in for now and think about it a bit more.
Could you possibly share how the output looks like for such complex configuration? Just out of curiosity I was lazy to write more complex routing tree :)
It's a lot to sanitize, I'll figure out a way to get it to you. I'm on vacation for the next few days so it'll be early next week.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Forget it if you'd need to sanitize it, I was just curious :)
cli/routing.go
Outdated
} | ||
|
||
func convertRouteToTree(route *dispatch.Route, tree treeprint.Tree) { | ||
routeDecription := fmt.Sprintf("%s continue: %t receiver: %s", route.Matchers.String(), route.Continue, route.RouteOpts.Receiver) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What was your motivation behind using two spaces between the struct members?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I wanted to separate it more. Should I shrink it to single spaces?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
playing around with it a bit. it's a lot of information so I understand wanting to give it some more space.
cli/test_routing.go
Outdated
func printMatchingTree(mainRoute *dispatch.Route, ls client.LabelSet) { | ||
tree := treeprint.New() | ||
getMatchingTree(mainRoute, tree, ls) | ||
fmt.Println("Matching tree:") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's try Matching routes
to stick with the subcommand name
977b35d
to
78e0ed2
Compare
Thanks for the review hopefully I covered all your comments. The $ ./amtool config routes --config.file=doc/examples/simple.yml
Routing tree:
.
└── default-route receiver: team-X-mails
├── {service=~"^(?:^(foo1|foo2|baz)$)$"} receiver: team-X-mails
│ └── {severity="critical"} receiver: team-X-pager
├── {service="files"} receiver: team-Y-mails
│ └── {severity="critical"} receiver: team-Y-pager
└── {service="database"} receiver: team-DB-pager
├── {owner="team-X"} continue: true receiver: team-X-pager
└── {owner="team-Y"} receiver: team-Y-pager When both $ ./amtool config routes --config.file=doc/examples/simple.yml --alertmanager.url=http://localhost:9090 1>/dev/null
Warning: --config.file flag overrides the --alertmanager.url. I added also changelog entry and usage notes to the README. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bunch of comments. Very close! I'm super excited about this PR.
One note: It looks like you updated vendor/vendor.json
, but you still need to vendor the actual files.
cli/routing.go
Outdated
|
||
Will print whole routing tree in form of ASCII tree view. | ||
|
||
Routing is loaded from configuration file or remote Alertmanager. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Routing is loaded from a local configuration file or a running Alertmanager configuration.
cli/routing.go
Outdated
|
||
Routing is loaded from configuration file or remote Alertmanager. | ||
Use --config.file or --alertmanager.url flags to specify where to look for the configuration. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Specifying --config.file takes precedence over --alertmanager.url
cli/test_routing.go
Outdated
const routingTestHelp = `Test alert routing | ||
|
||
Will return receiver names which the alert with given labels resolves to. | ||
(If resolves to multiple receivers, they are printed out joined using ',' in order as defined in the routing tree) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The parens ()
can be removed, and change If resolves
to If the labelset resolves
. I think joined using ','
can be removed, too.
cli/test_routing.go
Outdated
|
||
Routing is loaded from configuration file or remote Alertmanager. | ||
Use --config.file or --alertmanager.url flags to specify where to look for the configuration. | ||
Than specify labels defining the alert. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use the same help text for this configuration description as in routingHelp
cli/test_routing.go
Outdated
func configureRoutingTestCmd(cc *kingpin.CmdClause, c *routingShow) { | ||
var routingTestCmd = cc.Command("test", routingTestHelp) | ||
|
||
routingTestCmd.Flag("verify.receivers", "Checks if specified receivers matches resolved receivers. (Returns exit status 1 if not)").StringVar(&c.expectedReceivers) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the text in the parens can be changed to something like The command fails if the labelset does not route to the specified receivers.
. I think you don't need to have the parens, also.
cli/test_routing_test.go
Outdated
tests = append(tests, &routingTestDefinition{configFile: "testdata/conf.routing-reverted.yml", alert: client.LabelSet{"test": "2"}, expectedReceivers: []string{"test2", "test1"}}) | ||
tests = append(tests, &routingTestDefinition{configFile: "testdata/conf.routing.yml", alert: client.LabelSet{"test": "volovina"}, expectedReceivers: []string{"default"}}) | ||
|
||
fmt.Printf("Running routing-test testsuit:\n") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be removed
cli/test_routing_test.go
Outdated
tests = append(tests, &routingTestDefinition{configFile: "testdata/conf.routing.yml", alert: client.LabelSet{"test": "1"}, expectedReceivers: []string{"test1"}}) | ||
tests = append(tests, &routingTestDefinition{configFile: "testdata/conf.routing.yml", alert: client.LabelSet{"test": "2"}, expectedReceivers: []string{"test1", "test2"}}) | ||
tests = append(tests, &routingTestDefinition{configFile: "testdata/conf.routing-reverted.yml", alert: client.LabelSet{"test": "2"}, expectedReceivers: []string{"test2", "test1"}}) | ||
tests = append(tests, &routingTestDefinition{configFile: "testdata/conf.routing.yml", alert: client.LabelSet{"test": "volovina"}, expectedReceivers: []string{"default"}}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typically these would all be defined in the slice instead of appending them, so
tests := []*routingTestDefinition{
&routingTestDefinition{configFile: "testdata/conf.routing.yml", alert: client.LabelSet{"test": "1"}, expectedReceivers: []string{"test1"}},
&routingTestDefinition{configFile: "testdata/conf.routing.yml", alert: client.LabelSet{"test": "2"}, expectedReceivers: []string{"test1", "test2"}},
}
cli/test_routing_test.go
Outdated
} | ||
fmt.Println(" OK") | ||
} | ||
fmt.Println("") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this logging can be removed as well
cli/utils.go
Outdated
@@ -70,6 +73,57 @@ func parseMatchers(inputMatchers []string) ([]labels.Matcher, error) { | |||
return matchers, nil | |||
} | |||
|
|||
// GetRemoteAlertmanagerConfigStatus returns status responsecontaining configuration from remote Alertmanager | |||
func GetRemoteAlertmanagerConfigStatus(ctx context.Context, alertmanagerURL *url.URL) (*client.ServerStatus, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be lowercase, since it's not being used outside of this package. It can be exported later if the need arises.
cli/utils.go
Outdated
} | ||
|
||
// ConvertCientToCommonLabelSet converts client.LabelSet to model.Labelset | ||
func ConvertCientToCommonLabelSet(cls client.LabelSet) model.LabelSet { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
convertClientToCommonLabelSet
. There's a typo and I don't think this needs to be exported.
78e0ed2
to
f1f4e32
Compare
Signed-off-by: Martin Chodur <m.chodur@seznam.cz>
f1f4e32
to
ed4848f
Compare
Thanks once more for all the comments. |
Yay I'm glad you like this feature. |
Hi @stuartnelson3, we talked for a while at the PromCon about multi-tenancy of Alertmanager and increasing complexity of routing configuration.
So I thought some testing would be useful to avoid misconfiguration.
This PR adds
routing-test
command to theamtool
which accepts label definition of an alert and returns all receivers which the alert resolves to. It allows to load Alertmanager config from file or remotelyexample usage is
I also added flag
--verify.receivers
which also checks if resulting receivers match those specified and if not the return code is1
. I'm not sure if it should be implemented in the amtool but it makes testing easier.This should allow user to define tests for the routing rules.
Also you can run it against the new config and actual config using the remote config loading and display changes in routing introduced by the new updates.
I'd be glad for any comments or suggestions on this.