Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BREAKING] - Rework probe #6

Merged
merged 14 commits into from
Dec 8, 2020
Merged

Conversation

ashangit
Copy link

@ashangit ashangit commented Dec 3, 2020

Change the probe so it does:

  • some durability checks (ensure documents are still available in a specific index)
  • search on that index
  • set/get/delete from a latency index
  • cat health on node to check nodes response

Change require:

  • one service account authorize to create/write on all clusters in durability and latency index only

  • one kibana tag set only on kibana cls service: maintenance-kibana

n.fraison added 10 commits December 1, 2020 16:36
Also querying cat doesn't perform any search so changing metrics naming
to cat latency instead of search latency
Kibana and ES have 2 different probes per cluster each being in charge of it's own set of
nodes to monitor
Watcher is only in charge of discovering/starting new probes for ES/Kibana clusters and remove them
Each probe cluster is then in charge of dicovering it's attached node and doing the appropriate monitoring
This will permits to add per cluster search and durability check
Currently each call on discovery leads to a consul client being initialised
This permitts to run more often the latency test and to base it from a rate per min
instead running it only once every 30s
@ashangit ashangit requested review from geobeau and jfwm2 December 3, 2020 18:47
@ashangit ashangit changed the title [WIP] - Rework probe Rework probe Dec 4, 2020
@ashangit ashangit changed the title Rework probe [BREAKING] - Rework probe Dec 4, 2020
Copy link

@geobeau geobeau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be nice to update the readme

log.Warning("Log level not recognized, fallback to default level (INFO)")
lvl = log.InfoLevel
}
log.SetLevel(lvl)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add log.SetFlags(log.LstdFlags | log.Lshortfile) as well? It will display the file of the line logged, super useful for debug

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't exist with github.com/sirupsen/logrus log lib

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:'( and it doesn't seem possible to do it nicely with logrus

if consulServices[serviceName][i] == consulTag {
// Check cluster not already added
cluster := valueFromTags("cluster_name", consulServices[serviceName])
// TODO ensure we use https when available?
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO is outdated, we take the scheme (http/https) at L92

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not outdater we take first consul entry but first one can be the http definition then we don't override if we found the https one
Idea is to enforce usage of https

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok

@ashangit ashangit requested a review from geobeau December 8, 2020 08:34
log.Warning("Log level not recognized, fallback to default level (INFO)")
lvl = log.InfoLevel
}
log.SetLevel(lvl)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:'( and it doesn't seem possible to do it nicely with logrus

if consulServices[serviceName][i] == consulTag {
// Check cluster not already added
cluster := valueFromTags("cluster_name", consulServices[serviceName])
// TODO ensure we use https when available?
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok

@ashangit ashangit merged commit 77e701e into criteo-forks:master Dec 8, 2020
@ashangit ashangit deleted the rework_probe branch December 8, 2020 12:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants