-
Notifications
You must be signed in to change notification settings - Fork 306
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
don't consider ANY pod that is not terminating #1957
Conversation
The neg syncer was doing some wrong assumptions: 1. Endpoints that are not Ready can have Pods with deletionTimestamp not nil. The Endpoint and EndpointSlice controller will remove the Pod from the corresponding object once it gets deleted (DeletionTimestamp != nil) 2. Any pod with deletionTimestamp == nil is valid, this is the most dangerous assumption, since a pod can be Unready and be used.
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: aojea The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
// Only consider endpoints that are Ready. | ||
// TODO: In order to consider Terminating endpoints, the AddressData type should add a new field to | ||
// carry this information based on the EndpointSlices Conditions Terminating = true and Serving = true. | ||
// Terminating endpoints should only be used with ExternalTrafficPolicy = Local and if there are any |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the L7 case we don't consider Local or Cluster as we always just program the pod endpoint. It also means there is no backup if the traffic is routed to that pod and the pod is removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to complete the existing one, I don't feel it fits too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have one nit on the comment because I don't think it completely applies to this part of the code.
The only issue with this is that Endpoints do not have the terminating status on the endpoint itself, which is why this logic was added. I will be creating a PR to remove the entire endpoints support, which would remove this code as well. However until we do that, I do not think we should remove this due to correctness of the Endpoints path. I will try to have that PR ready by the end of the week, so we can merge this one in too.
this logic is wrong, or I'm reading it wrong, the terminating status means pod.deletionTimestsamp != nil, and this logic does not consider those pods, the existing logic may try to do that, but is not doing it In addition, it doesn't check the phase of the pods, that means that can be adding notReady or failing pods |
This is a new EndpointData that Alan added to consolidate slices and Endpoints, we can add Terminating status there if we need it, but besides that, this is not handling terminating ... but I'd love if you can double and triple check my assumptions |
/hold talked offline, @swetharepakula has to land different changes around this code and she'll fix this during that refactor |
This has been added as part of #1973. |
The neg syncer was doing some wrong assumptions:
Endpoints that are not Ready can have Pods with deletionTimestamp not nil.
The Endpoint and EndpointSlice controller will remove the Pod from the corresponding object once it gets deleted (DeletionTimestamp != nil)
Any pod with deletionTimestamp == nil is valid, this is the most dangerous assumption, since a pod can be Unready and be used.
/kind bug
/assign @swetharepakula @bowei