Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: added routing tools to amtool #1511

Merged
merged 1 commit into from
Aug 22, 2018

Conversation

FUSAKLA
Copy link
Contributor

@FUSAKLA FUSAKLA commented Aug 12, 2018

Hi @stuartnelson3, we talked for a while at the PromCon about multi-tenancy of Alertmanager and increasing complexity of routing configuration.

So I thought some testing would be useful to avoid misconfiguration.

This PR adds routing-test command to the amtool which accepts label definition of an alert and returns all receivers which the alert resolves to. It allows to load Alertmanager config from file or remotely

example usage is

$ ./amtool routing-test --config.file=doc/examples/simple.yml  service=database
team-DB-pager

I also added flag --verify.receivers which also checks if resulting receivers match those specified and if not the return code is 1. I'm not sure if it should be implemented in the amtool but it makes testing easier.

This should allow user to define tests for the routing rules.
Also you can run it against the new config and actual config using the remote config loading and display changes in routing introduced by the new updates.

I'd be glad for any comments or suggestions on this.

@roidelapluie
Copy link
Member

if we go that path, would a --debug option help too to see the routes taken by the labels?

@FUSAKLA
Copy link
Contributor Author

FUSAKLA commented Aug 12, 2018

Hmm that sounds reasonable, but the issue is routes does not have any names so I'm not sure how should the output look like.

@FUSAKLA
Copy link
Contributor Author

FUSAKLA commented Aug 12, 2018

also there is possibility to use something like https://github.com/xlab/treeprint
to also draw simple ASCII tree view. But that would be duplication of https://prometheus.io/webtools/alerting/routing-tree-editor/ functionality.

@stuartnelson3
Copy link
Contributor

Thanks for looking at this!

I think the treeprint suggestion would be pretty cool, personally. I think there's a huge benefit to having this, and being able to do it locally is something a lot of people would like. Although the tree routing visualizer on the website does everything in the browser and doesn't store or transmit any of the entered data, people are still cautious about copying and pasting their information into a website.

The functionality so far looks cool, but it would also be helpful to add something like --tree so that the route could be visualized as @roidelapluie said (or --debug, we can think about the naming)

@stuartnelson3
Copy link
Contributor

I'm also thinking where this should live .. maybe it should be under amtool config routing-tree

@FUSAKLA
Copy link
Contributor Author

FUSAKLA commented Aug 13, 2018

Ok, no problem. I'll be happy to add the tree visualization option.

I like the amtool config routing-tree naming, but I'm not sure how should the rest look like.

amtool config routing-tree test [--tree] lavel=value (test the alert routing, --tree would additionally print tree just for the specified alert)

and

amtool config routing-tree show (or without the show? this would print the tree of whole routing)

? Thanks for comments

@stuartnelson3
Copy link
Contributor

for a rough UI how about:

amtool config routes [show] # shows tree, default action with no subcommand.
amtool config routes test [--tree] label1=value1 label2=value2 labeln=valuen # see how alert creation implements this

We've been calling it a routing-tree for a while but I've been wondering about that name. Let's try just using routes as the subcommand and see how it feels for now, and see what others think about it.

@FUSAKLA FUSAKLA force-pushed the fus-add-routing-test branch from 3764668 to 977b35d Compare August 14, 2018 22:44
@FUSAKLA
Copy link
Contributor Author

FUSAKLA commented Aug 14, 2018

Ok I have a new version according to your suggestion @stuartnelson3

I added to vendor the https://github.com/xlab/treeprint library

The tree visualization could be better but the library is simple and works quite well.
But we can change it if you want.

Output looks like this:

$ ./amtool config routes --config.file test.yml 
Routing tree:
.
└── {}  continue: false  receiver: blackhole
    ├── {service="api"}  continue: true  receiver: slack
    │   ├── {severity="critical"}  continue: true  receiver: pagerduty
    │   └── {severity="low"}  continue: true  receiver: mattermost
    ├── {service="database"}  continue: true  receiver: pagerduty
    └── {service=~"^(?:.*)$"}  continue: true  receiver: email
        └── {service=~"^(?:.*)$"}  continue: true  receiver: email
            └── {service=~"^(?:.*)$"}  continue: false  receiver: email

The tree output in test mode omits continue and receivers of parent routes

$ ./amtool config routes test --tree --config.file test.yml service=api severity=critical 
Matching tree:
.
└── {}
    ├── {service="api"}
    │   └── {severity="critical"}  receiver: pagerduty
    └── {service=~"^(?:.*)$"}
        └── {service=~"^(?:.*)$"}
            └── {service=~"^(?:.*)$"}  receiver: email


pagerduty,email

@FUSAKLA FUSAKLA changed the title feat: added routing test to amtool feat: added routing tools to amtool Aug 14, 2018
@stuartnelson3
Copy link
Contributor

Cool, will take a look!

Copy link
Contributor

@stuartnelson3 stuartnelson3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is SUPER cool, thanks for all the effort!

I wrote a few comments in line.

One thing that sticks out to me is how the default route will always just be {}. Maybe we should make that into a special case and come up with something to print that indicates that this is the default, without the somewhat cryptic {} that might confuse people who are unfamiliar with routing.

cli/routing.go Outdated

func checkRoutingConfigInputFlags(alertmanagerURL *url.URL, configFile string) {
if alertmanagerURL != nil && configFile != "" {
kingpin.Fatalf("You can't use both --config.file and --alertmanager.url at the same time.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a config file stored locally which points to an alertmanager URL, which prevents me from looking at local changes via config file.

# ./amtool config routes --config.file=examples/ha/alertmanager.yml 
amtool: error: You can't use both --config.file and --alertmanager.url at the same time.

I'm not sure how to check for it, but if both flags are provided then it makes sense to either warn or bail, otherwise I think the --config.file flag should take precedence, since it can only be passed via command line.

Copy link
Contributor Author

@FUSAKLA FUSAKLA Aug 15, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ouch.. I didn't think of this case.

to assume precedence of --config.file seems reasonable but I'll take a look if it would be possible to check if the alertmanager.url flag was set from config, than I could still bail on using both flags and just warn on overriding the config.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked at it more and I thing it's not worth the complexity to check if the flag was set from the config file. I'd than need to check if the value of alertmanager.url changed from the default value and it would become hard to understand I think.

For now I'll stick to your suggestion and just print warning that --config.file overrides the alertmanager-url flag.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

cli/routing.go Outdated
}

func convertRouteToTree(route *dispatch.Route, tree treeprint.Tree) {
routeDecription := fmt.Sprintf("%s continue: %t receiver: %s", route.Matchers.String(), route.Continue, route.RouteOpts.Receiver)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I run this against SoundCloud's config (which has many many routes), it ends up being fairly noisy. I'm willing to bet most people don't ever use continue, so how about we only mention continue if its value is set to true?

Also, I'm thinking about maybe only showing the receiver when a user uses show. When using test, the matches get written out at the bottom so maybe it's not necessary?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding the continue, I definitely agree that the output gets bit messy, but I thought it changes behavior of the matching so much that I'd better state it there.

On the other hand the default is false so we can omit the default value. Good point I'll suppress it.

About the test and showing a receiver in the tree. I think it doesn't hurt and it's easier to see from which branch comes which receiver (even though it matches all leafs top-down to receivers left-to-right). But if you want I can remove it.

Could you possibly share how the output looks like for such complex configuration? Just out of curiosity I was lazy to write more complex routing tree :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

About the test and showing a receiver in the tree. I think it doesn't hurt and it's easier to see from which branch comes which receiver (even though it matches all leafs top-down to receivers left-to-right). But if you want I can remove it.

We can leave it in for now and think about it a bit more.

Could you possibly share how the output looks like for such complex configuration? Just out of curiosity I was lazy to write more complex routing tree :)

It's a lot to sanitize, I'll figure out a way to get it to you. I'm on vacation for the next few days so it'll be early next week.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Forget it if you'd need to sanitize it, I was just curious :)

cli/routing.go Outdated
}

func convertRouteToTree(route *dispatch.Route, tree treeprint.Tree) {
routeDecription := fmt.Sprintf("%s continue: %t receiver: %s", route.Matchers.String(), route.Continue, route.RouteOpts.Receiver)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What was your motivation behind using two spaces between the struct members?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I wanted to separate it more. Should I shrink it to single spaces?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

playing around with it a bit. it's a lot of information so I understand wanting to give it some more space.

func printMatchingTree(mainRoute *dispatch.Route, ls client.LabelSet) {
tree := treeprint.New()
getMatchingTree(mainRoute, tree, ls)
fmt.Println("Matching tree:")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's try Matching routes to stick with the subcommand name

@FUSAKLA FUSAKLA force-pushed the fus-add-routing-test branch from 977b35d to 78e0ed2 Compare August 19, 2018 14:14
@FUSAKLA
Copy link
Contributor Author

FUSAKLA commented Aug 19, 2018

Thanks for the review hopefully I covered all your comments.

The continue: true is printed only if set to true now

$ ./amtool config routes --config.file=doc/examples/simple.yml
Routing tree:
.
└── default-route  receiver: team-X-mails
    ├── {service=~"^(?:^(foo1|foo2|baz)$)$"}  receiver: team-X-mails
    │   └── {severity="critical"}  receiver: team-X-pager
    ├── {service="files"}  receiver: team-Y-mails
    │   └── {severity="critical"}  receiver: team-Y-pager
    └── {service="database"}  receiver: team-DB-pager
        ├── {owner="team-X"}  continue: true  receiver: team-X-pager
        └── {owner="team-Y"}  receiver: team-Y-pager

When both config.file and alertmanager.url are specified warning is printed out to stderr

$ ./amtool config routes --config.file=doc/examples/simple.yml --alertmanager.url=http://localhost:9090 1>/dev/null
Warning: --config.file flag overrides the --alertmanager.url.

I added also changelog entry and usage notes to the README.

Copy link
Contributor

@stuartnelson3 stuartnelson3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bunch of comments. Very close! I'm super excited about this PR.

One note: It looks like you updated vendor/vendor.json, but you still need to vendor the actual files.

cli/routing.go Outdated

Will print whole routing tree in form of ASCII tree view.

Routing is loaded from configuration file or remote Alertmanager.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Routing is loaded from a local configuration file or a running Alertmanager configuration.

cli/routing.go Outdated

Routing is loaded from configuration file or remote Alertmanager.
Use --config.file or --alertmanager.url flags to specify where to look for the configuration.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Specifying --config.file takes precedence over --alertmanager.url

const routingTestHelp = `Test alert routing

Will return receiver names which the alert with given labels resolves to.
(If resolves to multiple receivers, they are printed out joined using ',' in order as defined in the routing tree)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The parens () can be removed, and change If resolves to If the labelset resolves. I think joined using ',' can be removed, too.


Routing is loaded from configuration file or remote Alertmanager.
Use --config.file or --alertmanager.url flags to specify where to look for the configuration.
Than specify labels defining the alert.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use the same help text for this configuration description as in routingHelp

func configureRoutingTestCmd(cc *kingpin.CmdClause, c *routingShow) {
var routingTestCmd = cc.Command("test", routingTestHelp)

routingTestCmd.Flag("verify.receivers", "Checks if specified receivers matches resolved receivers. (Returns exit status 1 if not)").StringVar(&c.expectedReceivers)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the text in the parens can be changed to something like The command fails if the labelset does not route to the specified receivers.. I think you don't need to have the parens, also.

tests = append(tests, &routingTestDefinition{configFile: "testdata/conf.routing-reverted.yml", alert: client.LabelSet{"test": "2"}, expectedReceivers: []string{"test2", "test1"}})
tests = append(tests, &routingTestDefinition{configFile: "testdata/conf.routing.yml", alert: client.LabelSet{"test": "volovina"}, expectedReceivers: []string{"default"}})

fmt.Printf("Running routing-test testsuit:\n")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be removed

tests = append(tests, &routingTestDefinition{configFile: "testdata/conf.routing.yml", alert: client.LabelSet{"test": "1"}, expectedReceivers: []string{"test1"}})
tests = append(tests, &routingTestDefinition{configFile: "testdata/conf.routing.yml", alert: client.LabelSet{"test": "2"}, expectedReceivers: []string{"test1", "test2"}})
tests = append(tests, &routingTestDefinition{configFile: "testdata/conf.routing-reverted.yml", alert: client.LabelSet{"test": "2"}, expectedReceivers: []string{"test2", "test1"}})
tests = append(tests, &routingTestDefinition{configFile: "testdata/conf.routing.yml", alert: client.LabelSet{"test": "volovina"}, expectedReceivers: []string{"default"}})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typically these would all be defined in the slice instead of appending them, so

tests := []*routingTestDefinition{
        &routingTestDefinition{configFile: "testdata/conf.routing.yml", alert: client.LabelSet{"test": "1"}, expectedReceivers: []string{"test1"}},
        &routingTestDefinition{configFile: "testdata/conf.routing.yml", alert: client.LabelSet{"test": "2"}, expectedReceivers: []string{"test1", "test2"}},
}

}
fmt.Println(" OK")
}
fmt.Println("")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this logging can be removed as well

cli/utils.go Outdated
@@ -70,6 +73,57 @@ func parseMatchers(inputMatchers []string) ([]labels.Matcher, error) {
return matchers, nil
}

// GetRemoteAlertmanagerConfigStatus returns status responsecontaining configuration from remote Alertmanager
func GetRemoteAlertmanagerConfigStatus(ctx context.Context, alertmanagerURL *url.URL) (*client.ServerStatus, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be lowercase, since it's not being used outside of this package. It can be exported later if the need arises.

cli/utils.go Outdated
}

// ConvertCientToCommonLabelSet converts client.LabelSet to model.Labelset
func ConvertCientToCommonLabelSet(cls client.LabelSet) model.LabelSet {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

convertClientToCommonLabelSet. There's a typo and I don't think this needs to be exported.

@FUSAKLA FUSAKLA force-pushed the fus-add-routing-test branch from 78e0ed2 to f1f4e32 Compare August 22, 2018 07:43
Signed-off-by: Martin Chodur <m.chodur@seznam.cz>
@FUSAKLA FUSAKLA force-pushed the fus-add-routing-test branch from f1f4e32 to ed4848f Compare August 22, 2018 07:48
@FUSAKLA
Copy link
Contributor Author

FUSAKLA commented Aug 22, 2018

Thanks once more for all the comments.
I didn't notice the vendor, I had bad global .gitignore :/

@stuartnelson3 stuartnelson3 merged commit 5d222bc into prometheus:master Aug 22, 2018
@FUSAKLA
Copy link
Contributor Author

FUSAKLA commented Aug 22, 2018

Yay I'm glad you like this feature.
thanks for accepting it :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants