feat: support local service in multiple standard load balancer mode #4450
Conversation
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: nilo19. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
/test pull-cloud-provider-azure-e2e-ccm-vmss-multi-slb-capz
Force-pushed from ef79b2e to 68fa2cd
/test pull-cloud-provider-azure-e2e-ccm-vmss-multi-slb-capz
Force-pushed from 68fa2cd to ccb8c6e
/test pull-cloud-provider-azure-e2e-ccm-vmss-multi-slb-capz
if strings.HasPrefix(strings.ToLower(pointer.StringDeref(bp.Name, "")), strings.ToLower(bpName)) {
	found = true
}
if len(*bp.LoadBalancerBackendAddresses) != expectedCount {
Do you mean all bps will have the same number of nodes?
Yeah, only one bp should be created.
_, err = cs.AppsV1().Deployments(ns.Name).Update(context.Background(), deployment, metav1.UpdateOptions{})
Expect(err).NotTo(HaveOccurred())
err = wait.PollUntilContextTimeout(context.Background(), 5*time.Second, 5*time.Minute, false, func(ctx context.Context) (bool, error) {
	if err := checkNodeCountInBackendPoolByServiceIPs(tc, clusterName, expectedBPName, ips, len(nodes)); err != nil {
so len(nodes) <= 5?
It can be any number. I have a wrong log in L192.
Force-pushed from ccb8c6e to 8513bdf
@@ -264,6 +264,11 @@ type Config struct {

	// DisableAPICallCache disables the cache for Azure API calls. It is for ARG support and not all resources will be disabled.
	DisableAPICallCache bool `json:"disableAPICallCache,omitempty" yaml:"disableAPICallCache,omitempty"`

	// RouteUpdateIntervalInSeconds is the interval for updating routes. Default is 30 seconds.
Thanks for making those intervals configurable.
@@ -247,10 +238,77 @@ func TestEnsureHostsInPoolNodeIP(t *testing.T) {
			},
		},
	},
	{
		desc: "local service",
Could you make the desc for those tests clearer? e.g. something like "local service should xxxx".
}

func (op *loadBalancerBackendPoolUpdateOperation) wait() batchOperationResult {
	return <-op.result
	return batchOperationResult{}
Is this still used somewhere? Seems like it should be deleted.
Yes, it is used in the route updater.
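For context, the channel-backed wait pattern under discussion looks roughly like the sketch below; the type and field names are illustrative stand-ins, not the PR's actual definitions.

```go
package localservicesketch

// batchOperationResult and backendPoolUpdateOp are illustrative stand-ins for the
// PR's types; only the shape of the pattern matters here.
type batchOperationResult struct {
	success bool
	err     error
}

type backendPoolUpdateOp struct {
	// result is written by the batch updater once the merged update completes.
	result chan batchOperationResult
}

// wait blocks until the batch updater reports the outcome of this operation.
func (op *backendPoolUpdateOp) wait() batchOperationResult {
	return <-op.result
}
```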
pkg/provider/azure_local_services.go (outdated)
"service name", op.serviceName, | ||
"load balancer name", op.loadBalancerName, | ||
"backend pool name", op.backendPoolName, | ||
"related IPs", strings.Join(op.nodeIPs, ",")) |
s/related IPs/node IPs/
pkg/provider/azure_local_services.go (outdated)
"service name", op.serviceName, | ||
"load balancer name", op.loadBalancerName, | ||
"backend pool name", op.backendPoolName, | ||
"related IPs", strings.Join(op.nodeIPs, ",")) |
node IPs
pkg/provider/azure_local_services.go (outdated)

// getLocalServiceEndpointsNodeNames gets the node names that host all endpoints of the local service.
func (az *Cloud) getLocalServiceEndpointsNodeNames(service *v1.Service) (sets.Set[string], error) {
	eps, err := az.KubeClient.DiscoveryV1().EndpointSlices(service.Namespace).List(context.Background(), metav1.ListOptions{})
Could we get this from a local watch cache instead of a list API?
Sure, will add a cache.
Added a cache, please check the latest commit.
BTW, there is a theoretical problem: if the EndpointSlice hasn't finished updating when we ensure hosts in the pool, there could be a discrepancy between the number of node IPs in the pool and the actual state, because the EndpointSlice update may be slower than the pod updates. After testing multiple times I haven't observed this behavior, but I think it could happen.
Yes, that's a valid issue for large-scale workloads. Could we also trigger the backend pool updates on EndpointSlice add/delete events?
Synced offline. There is no need to support backend pool updates for add/delete events. The latency issue will be fixed in a patch version of v1.28.
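As background for the getLocalServiceEndpointsNodeNames discussion in this thread, here is a minimal sketch of deriving the hosting node names from a service's EndpointSlices. The real function fetches the slices itself (via the list API, later a cache); here they are simply passed in, and the function name is illustrative.

```go
package localservicesketch

import (
	discoveryv1 "k8s.io/api/discovery/v1"
	"k8s.io/apimachinery/pkg/util/sets"
)

// endpointNodeNames collects the names of the nodes that host the endpoints in the
// given EndpointSlices. Endpoints without a NodeName are skipped.
func endpointNodeNames(slices []*discoveryv1.EndpointSlice) sets.Set[string] {
	nodes := sets.New[string]()
	for _, es := range slices {
		for _, ep := range es.Endpoints {
			if ep.NodeName != nil && *ep.NodeName != "" {
				nodes.Insert(*ep.NodeName)
			}
		}
	}
	return nodes
}
```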
Force-pushed from 745c706 to 4e3536d
Force-pushed from 4e3536d to 9f9d85c
// setUpEndpointSlicesInformer creates an informer for EndpointSlices of local services.
// It watches the update events and sends backend pool update operations to the batch updater.
// TODO (niqi): the update of endpointslice may be slower than the update of endpoint pods. Need to fix this.
Added a TODO.
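For readers following the thread, a minimal sketch of the idea: an EndpointSlice informer whose update handler forwards work to a batch updater. This assumes a plain client-go shared informer and a simple channel standing in for the PR's batch updater; it is not the PR's actual wiring.

```go
package localservicesketch

import (
	"fmt"

	discoveryv1 "k8s.io/api/discovery/v1"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/cache"
)

// setUpEndpointSlicesInformerSketch wires an informer that reacts to EndpointSlice
// updates and forwards a key identifying the owning service to a channel.
func setUpEndpointSlicesInformerSketch(client kubernetes.Interface, ops chan<- string) cache.SharedIndexInformer {
	factory := informers.NewSharedInformerFactory(client, 0)
	informer := factory.Discovery().V1().EndpointSlices().Informer()
	informer.AddEventHandler(cache.ResourceEventHandlerFuncs{
		UpdateFunc: func(oldObj, newObj interface{}) {
			newES, ok := newObj.(*discoveryv1.EndpointSlice)
			if !ok {
				return
			}
			// The owning service is recorded in the well-known label on the slice.
			svcName := newES.Labels[discoveryv1.LabelServiceName]
			if svcName == "" {
				return
			}
			// Enqueue an update keyed by namespace/service; the real batch updater
			// merges operations for the same backend pool before flushing them.
			ops <- fmt.Sprintf("%s/%s", newES.Namespace, svcName)
		},
	})
	return informer
}
```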
pkg/provider/azure_local_services.go (outdated)

var ep *discovery_v1.EndpointSlice
az.endpointSlicesCache.Range(func(key, value interface{}) bool {
	endpointSlice := value.(*discovery_v1.EndpointSlice)
	if strings.EqualFold(getServiceNameOfEndpointSlice(endpointSlice), service.Name) {
Need to check the service namespace also.
Thanks for catching this, please check the new commit.
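For illustration, a sketch of the lookup with the namespace check included. The sync.Map cache mirrors the endpointSlicesCache shown above, and matching on the standard kubernetes.io/service-name label stands in for the getServiceNameOfEndpointSlice helper; this is not the PR's actual code.

```go
package localservicesketch

import (
	"strings"
	"sync"

	v1 "k8s.io/api/core/v1"
	discoveryv1 "k8s.io/api/discovery/v1"
)

// findEndpointSliceForService ranges over a sync.Map of *discoveryv1.EndpointSlice
// values and returns the slice owned by the given service, matching both namespace
// and name as suggested in the review.
func findEndpointSliceForService(cache *sync.Map, service *v1.Service) *discoveryv1.EndpointSlice {
	var found *discoveryv1.EndpointSlice
	cache.Range(func(_, value interface{}) bool {
		es, ok := value.(*discoveryv1.EndpointSlice)
		if !ok {
			return true
		}
		if strings.EqualFold(es.Namespace, service.Namespace) &&
			strings.EqualFold(es.Labels[discoveryv1.LabelServiceName], service.Name) {
			found = es
			return false // stop iterating once a match is found
		}
		return true
	})
	return found
}
```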
/test pull-cloud-provider-azure-e2e-ccm-vmss-capz
} else {
	key := strings.ToLower(getServiceName(service))
	si, found := bi.getLocalServiceInfo(key)
	if found && !strings.EqualFold(si.lbName, lbName) {
What would happen if the cluster was originally using a single LB and a local service was created (so this map does not contain the service), and then the cluster is updated to multiple LBs (and the service may need to be moved to another LB)? Want to make sure it still works.
We reconcile all managed LBs' backend pools in the outer caller of ensureHostsInPool.
It's OK if the map doesn't contain the service here; we only block the case where the LB name in the map is incorrect.
az.endpointSlicesCache.Store(strings.ToLower(fmt.Sprintf("%s/%s", newES.Namespace, newES.Name)), newES)

key := strings.ToLower(fmt.Sprintf("%s/%s", newES.Namespace, svcName))
si, found := az.getLocalServiceInfo(key)
When a user migrates from a single LB to multiple LBs, a local svc may not be found, right? Will that cause any problem?
No, it won't; in this case the main loop will take over the process.
/lgtm
/test pull-cloud-provider-azure-e2e-ccm-vmssflex-capz
What type of PR is this?
/kind feature
What this PR does / why we need it:
In multiple SLB mode, each local service owns a backend pool named after the service. The backend pool is created in the service reconciliation loop when the service is created or is updated from externalTrafficPolicy=Cluster. It is deleted in the service reconciliation loop when: (1) the service is deleted; (2) the service is changed to eTP=Cluster; (3) the cluster is migrated from multi-SLB to single-SLB; or (4) the service is moved to another load balancer.
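As a rough illustration of the ownership rule described above, a sketch of deciding whether a service gets its own backend pool in multi-SLB mode. The exact naming and checks in the provider may differ; the "namespace-name" format and function name are assumptions for illustration only.

```go
package localservicesketch

import (
	"fmt"
	"strings"

	v1 "k8s.io/api/core/v1"
)

// dedicatedBackendPoolName returns the per-service backend pool name when the
// service is eTP=Local and the cluster runs in multiple-SLB mode; otherwise the
// service shares the cluster-wide pool.
func dedicatedBackendPoolName(svc *v1.Service, multiSLBEnabled bool) (string, bool) {
	if !multiSLBEnabled || svc.Spec.ExternalTrafficPolicy != v1.ServiceExternalTrafficPolicyTypeLocal {
		return "", false
	}
	return strings.ToLower(fmt.Sprintf("%s-%s", svc.Namespace, svc.Name)), true
}
```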
Besides the service reconciliation loop, this PR adds an EndpointSlice informer. It watches all EndpointSlices of local services, monitors update events, and updates the corresponding backend pools. Since local services may churn quickly, the informer sends backend pool update operations to a buffer queue. The queue merges operations targeting the same backend pool and applies them every 30s.
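A minimal sketch of the buffer-queue behavior described above: operations for the same backend pool are merged (the latest wins) and flushed on a fixed interval. All names and the flush callback are illustrative, not the PR's actual types.

```go
package localservicesketch

import (
	"sync"
	"time"
)

// poolUpdateBatcher merges pending backend pool updates keyed by pool name and
// flushes them periodically.
type poolUpdateBatcher struct {
	mu      sync.Mutex
	pending map[string][]string // backend pool name -> node IPs from the latest operation
	flush   func(pool string, nodeIPs []string)
}

func newPoolUpdateBatcher(flush func(string, []string)) *poolUpdateBatcher {
	return &poolUpdateBatcher{pending: make(map[string][]string), flush: flush}
}

// enqueue records an update; a later operation for the same pool replaces the earlier one.
func (b *poolUpdateBatcher) enqueue(pool string, nodeIPs []string) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.pending[pool] = nodeIPs
}

// run flushes the merged operations on every tick (30s in the PR description)
// until stop is closed.
func (b *poolUpdateBatcher) run(interval time.Duration, stop <-chan struct{}) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-stop:
			return
		case <-ticker.C:
			b.mu.Lock()
			batch := b.pending
			b.pending = make(map[string][]string)
			b.mu.Unlock()
			for pool, ips := range batch {
				b.flush(pool, ips)
			}
		}
	}
}
```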
Which issue(s) this PR fixes:
Fixes #
Related: #4013
Special notes for your reviewer:
Doc: #4451
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: