Alert for failing nodes to join a cluster #3857

Open
njuettner opened this issue Jan 30, 2025 · 1 comment

njuettner commented Jan 30, 2025

Follow up on #3828

To detect this behaviour, it would be good to track errors in the bootstrap controller.

If the bootstrap token cannot be refreshed, we need an alert that counts the matching log entries. Using a counter would be the easiest approach:

sum by (name) (count_over_time({app="cluster-api", pod=~"capi-kubeadm-bootstrap-controller-.*"} |= `failed to refresh bootstrap token` | logfmt [5m]))
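
For illustration, the query above could be wrapped in a Loki ruler alerting rule roughly like the sketch below. Only the LogQL expression comes from this issue. The group name, alert name, `> 0` threshold, `for` duration, severity label, and annotation text are placeholder assumptions, and it assumes `logfmt` actually extracts a `name` label from these log lines.

```yaml
# Sketch only: everything except the LogQL expression is an assumed placeholder.
groups:
  - name: capi-bootstrap            # hypothetical group name
    rules:
      - alert: BootstrapTokenRefreshFailing   # hypothetical alert name
        # Count log lines reporting a failed token refresh over 5 minutes,
        # grouped by the `name` field parsed from the logfmt line.
        expr: |
          sum by (name) (
            count_over_time(
              {app="cluster-api", pod=~"capi-kubeadm-bootstrap-controller-.*"}
                |= `failed to refresh bootstrap token`
                | logfmt [5m]
            )
          ) > 0
        for: 10m                    # assumed: fire only if failures persist
        labels:
          severity: page            # assumed severity
        annotations:
          description: 'Bootstrap token refresh is failing for {{ $labels.name }}.'
```

The threshold and `for` window would need tuning against how often the controller retries, so that a single transient failure does not page.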

Currently on hold because Atlas is handling the implementation.

@github-project-automation github-project-automation bot moved this to Inbox 📥 in Roadmap Jan 30, 2025
@njuettner njuettner moved this from Inbox 📥 to Blocked / Waiting ⛔️ in Roadmap Jan 30, 2025
@njuettner njuettner self-assigned this Jan 30, 2025
@fiunchinho

Hey @njuettner. Isn't that issue already covered by this alert that we added back in November?
