PreStop hook logic seems incorrect #4698
Comments
I have a suspicion that this PreStop script has been ineffective since the IPv6 PR (8cf990a#diff-f8806c3bf97eb22be2e2915e4fbe06d5a1e85f335e026cdc46a6e96f4d63da59R57-R58), which introduced a change that keeps endpoints in the service even if the corresponding Pod is no longer ready (which is the case when terminating). I think this change was somewhat accidental, as I was exploring switching to DNS-based discovery for Elasticsearch, an idea we later gave up on due to issues with DNS resolution and re-use of Pod IPs. The simplest fix could therefore be to just revert that change.
Things are a bit more complicated, unfortunately. We rely on unready Pods being published for node discovery from clients (aka sniffing), see #3182. If we were to go back to only publishing ready Pods in the headless service, we would break that capability. I am thinking we could have a simpler version of the pre-stop hook that just waits for
Bug Report
What did you do?
When terminating an ES Pod, the PreStop hook script pre-stop-hook-script.sh attempts to wait for the POD_IP to be removed from the list of Endpoints of the headless service.
I replicated what the PreStop hook did in an ubuntu pod in the same namespace with the following:
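The exact commands are not reproduced here. As a rough sketch of the kind of wait loop being described (not the actual ECK script, and not necessarily what was run), something along these lines polls DNS for the headless service until the Pod IP disappears or a timeout is reached. The function names `lookup_hosts` and `wait_for_ip_removal` are hypothetical; only `getent hosts`, `grep`, and the variable names come from the report.

```shell
#!/usr/bin/env bash
# Sketch only: hypothetical helpers illustrating the wait loop, not ECK's script.

# Resolve the headless service name to its published addresses.
# Kept as a separate function so it can be stubbed when experimenting.
lookup_hosts() {
  getent hosts "$1"
}

# Wait until pod_ip is no longer listed for the service.
# Returns 0 once the IP is gone, 1 if max_wait seconds pass first.
wait_for_ip_removal() {
  local service="$1" pod_ip="$2" max_wait="$3"
  local waited=0
  # -F: match the IP literally, -w: avoid 10.0.0.5 matching 10.0.0.50
  while lookup_hosts "$service" | grep -qwF -- "$pod_ip"; do
    if [ "$waited" -ge "$max_wait" ]; then
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 0
}

# Example invocation (assumes the variables are set in the environment):
#   wait_for_ip_removal "$HEADLESS_SERVICE_NAME" "$POD_IP" "$PRE_STOP_MAX_WAIT_SECONDS"
#   sleep "$PRE_STOP_ADDITIONAL_WAIT_SECONDS"
```

With the behavior reported in this issue, such a loop would be expected to hit the timeout branch every time, since the IP stays published while the Pod is terminating.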
When modifying PRE_STOP_MAX_WAIT_SECONDS and PRE_STOP_ADDITIONAL_WAIT_SECONDS, you can observe that the IP remains in the Service until PRE_STOP_MAX_WAIT_SECONDS has elapsed, at which point the PreStop hook terminates unsuccessfully and Kubernetes continues with the termination. This holds true even if you set PRE_STOP_MAX_WAIT_SECONDS to a substantial number. I went as high as 180s, which is more than enough time.
What did you expect to see?
I would expect to see the POD_IP get removed from the output of
getent hosts $HEADLESS_SERVICE_NAME | grep $POD_IP
before the Pod is terminated.
What did you see instead? Under which circumstances?
It appears (although documentation on headless services is sparse) that headless services do not remove the IP from the list of Endpoint IPs until after the Pod is terminated, not when it enters Terminating status. As a result, the PreStop hook simply waits PRE_STOP_MAX_WAIT_SECONDS and then continues terminating, while the IP remains in the service for some time afterwards. The logic here does not seem to work as intended, and I'm not sure how it could work given this behavior of headless services.
Environment
ECK version: 1.3.0
Kubernetes information: GKE 1.19