-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add liveness probe for calico-kube-controllers #4407
Add liveness probe for calico-kube-controllers #4407
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for the PR! Added a comment and a couple questions.
- -r | ||
periodSeconds: 10 | ||
initialDelaySeconds: 10 | ||
failureThreshold: 6 | ||
readinessProbe: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it intentional that this readinessProbe doesn't match the one above? (initialDelaySeconds vs periodSeconds).
Could you comment on why we should make these changes to the readinessprobe as well? The PR seems to mostly be about the liveness probe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right, should be periodSeconds
. Will fix. Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For periodSeconds
, the default value is 10
. Other probes in the same manifest have this set anyway and wanted to keep the same style.
exec: | ||
command: | ||
- /usr/bin/check-status | ||
- -r |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Strictly speaking, the -r
flag was intended to mean "ready", and we intended to add a "-l" flag to mean "alive" - code is here: https://github.com/projectcalico/kube-controllers/blob/master/cmd/check-status/main.go#L37
I think we can probably get away with this as a quick fix, but should we want to have separate ready / live conditions we'll need to have separate flags, so longer term I'd definitely like to see that even if for now -l
== -r
.
Would you be interested in adding a new option?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, didn't know this. Would not mind adding a new flag for liveness in a separate PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
910bdee
to
552e5b5
Compare
552e5b5
to
4eb06ed
Compare
CC @tmjd - you may be interested in this if we want to include in the operator as well. |
@caseydavenport yeah I think we would certainly want to add this to this operator. |
Description
Due to kubernetes/client-go#374,
calico-kube-controllers
takes a very long time to get ready when upgrading a cluster, as described in #3751.The underlying issue was fixed by projectcalico/libcalico-go#1356, but only for v3.18.
This change is a workaround that can be cherry-picked to v3.17 and earlier, but it's also a good practice for liveness probe to be added, same as for other calico components.
Related issues/PRs
https://github.com/projectcalico/libcalico-go/issues/1267
Todos
Release Note