-
Notifications
You must be signed in to change notification settings - Fork 386
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cleanup ResourceExport if exported Service has no available Endpoints #4056
Cleanup ResourceExport if exported Service has no available Endpoints #4056
Conversation
/test-multicluster-e2e |
Codecov Report
@@ Coverage Diff @@
## main #4056 +/- ##
==========================================
+ Coverage 61.26% 65.97% +4.70%
==========================================
Files 293 307 +14
Lines 43686 43961 +275
==========================================
+ Hits 26765 29003 +2238
+ Misses 14693 12630 -2063
- Partials 2228 2328 +100
*This pull request uses carry forward flags. Click here to find out more.
|
35beb4a
to
bafbd66
Compare
/test-multicluster-e2e |
multicluster/controllers/multicluster/serviceexport_controller.go
Outdated
Show resolved
Hide resolved
} | ||
return ctrl.Result{}, nil | ||
} | ||
|
||
// if corresponding Service doesn't exist, update ServiceExport's status reason to not_found_service, | ||
// If corresponding Service doesn't exist, update ServiceExport's status reason to not_found_service, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not_found_service -> service_not_found?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the corresponding Service
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
multicluster/controllers/multicluster/serviceexport_controller.go
Outdated
Show resolved
Hide resolved
multicluster/controllers/multicluster/serviceexport_controller.go
Outdated
Show resolved
Hide resolved
@@ -313,15 +348,18 @@ func (r *ServiceExportReconciler) updateSvcExportStatus(ctx context.Context, req | |||
now := metav1.Now() | |||
var res, message *string | |||
switch cause { | |||
case notFound: | |||
case serviceNotFound: | |||
res = getStringPointer("not_found_service") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should change to "service_not_found".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
res = getStringPointer("not_found_service") | ||
message = getStringPointer("the Service does not exist") | ||
case importedService: | ||
case serviceWithoutEndpoints: | ||
res = getStringPointer("no_endpoints_service") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"no_endpoints" or "service_without_endpoints"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to "service_without_endpoints"
multicluster/controllers/multicluster/serviceexport_controller.go
Outdated
Show resolved
Hide resolved
@@ -138,15 +141,15 @@ func (r *ServiceExportReconciler) Reconcile(ctx context.Context, req ctrl.Reques | |||
epResExportName := getResourceExportName(r.localClusterID, req, "endpoints") | |||
|
|||
cleanup := func() error { | |||
err = r.handleServiceDeleteEvent(ctx, req, commonArea) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add some comments to explain why we do not check svcInstalled?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comment added
I leave it to caller to check the svcInstalled. In some rare cases, it might be better to call clean up anyway.
@@ -172,29 +177,54 @@ func (r *ServiceExportReconciler) Reconcile(ctx context.Context, req ctrl.Reques | |||
if err := cleanup(); err != nil { | |||
return ctrl.Result{}, err | |||
} | |||
err = r.updateSvcExportStatus(ctx, req, notFound) | |||
err = r.updateSvcExportStatus(ctx, req, serviceNotFound) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can update Status repeatedly if there are other Service/Endpoint changes happen, even they do not impact the Status value?
Maybe not a big deal, or we should save Status to installedSvcs? The same apply to cleanup() - should we keep the Service in installedSvcs, as long it has been processed, so we do not call cleanup() multiple times unnecessarily?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left this part unchanged considering there is no harm to update the status since we already skip watching ServiceExport's status change event, but I added a ResourceVersionChangedPredicate
for Service and Endpoint to limit the mapping events. It will only get notified when Service or Endpoints resource version changes.
When the ServiceExport is not found, it will check the 'svcInstalled' before call cleanup, but for Service and Endpoint, I left it to call cleanup()
anyway considering most events will be skipped if there is no existing ServiceExport.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no big "harm", but it is unnecessary and in some cases may look strange that we keep updating Status for no reason or keep deleting non-existing resources (will it generate strange logs by kube-apiserver or our code?).
Anyway, at least we need not to fix it in this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a step to compare status difference before update the ServiceExport, so it can help to avoid unnecessary update.
bafbd66
to
78cfd11
Compare
/test-multicluster-e2e |
@@ -138,15 +141,17 @@ func (r *ServiceExportReconciler) Reconcile(ctx context.Context, req ctrl.Reques | |||
epResExportName := getResourceExportName(r.localClusterID, req, "endpoints") | |||
|
|||
cleanup := func() error { | |||
// There is no side effect to call deletion directly, so leave it to the caller |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is not the reason. The reason is when controller restarts, the Service is not in cache, but it is still possible we need to remove ResourceExports.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated
@@ -172,29 +177,54 @@ func (r *ServiceExportReconciler) Reconcile(ctx context.Context, req ctrl.Reques | |||
if err := cleanup(); err != nil { | |||
return ctrl.Result{}, err | |||
} | |||
err = r.updateSvcExportStatus(ctx, req, notFound) | |||
err = r.updateSvcExportStatus(ctx, req, serviceNotFound) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no big "harm", but it is unnecessary and in some cases may look strange that we keep updating Status for no reason or keep deleting non-existing resources (will it generate strange logs by kube-apiserver or our code?).
Anyway, at least we need not to fix it in this PR.
78cfd11
to
9b27aa1
Compare
|
||
if existingCondition != (k8smcsv1alpha1.ServiceExportCondition{}) { | ||
if *existingCondition.Reason == *newCondition.Reason { | ||
// No need to update the ServiceExport when there is no status change. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As existingCondition is from cached client, this check may not be safe? But maybe not a big deal for status update.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This reminds me why we need to get and check existing ResourceExport in updateOrCreateResourceExport()? It may get stale state from client cache too right? We should know it is update or not, from installedSvcs? Just in the restart case, we do not know as installedSvcs is empty, but we can check if ResourceExport exits or not, only for the Create case to cover that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new updated ResourceExport will be set to use the same ResourceVersion
from existing ResourceExport, if the cache is out of date, controller will fail to update the ResourceExport and retry.
We check the stored installed Service info to compare the information to determine if it needs to update or just skip updating ResourceExport. If controller restart, the update action will happen once without comparing due to emtpy installedSvcs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I saw ResourceVersion
, but still I am asking is it necessary to get ResourceExport and compare, or in most cases we can just decide based on installedSvcs, but only when installedSvcs does not contain the Service we get ResourceExport?
If the Service is saved in installedSvcs, we know for sure an update is required, right? In that case, can we just update ResourceExport without get it first (and we will not get update failure due to stale cache/ResourceVersion)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We indeed compare installedSvcs instead of ResourceExport, the ResourceVersion is required when there is an update. It means we either store it in a cache or get it before update.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel saving ResourceVersion to avoid possible update failure sounds better, but I am ok to keep the current approach too.
Refine ServiceExport controller to watch Endpoints events to ensure that Service kind of ResourceExport can be removed when the exported Service no longer has available Endpoints, or skip writing ResourceExport if there is no Endpoints at the beginning. Signed-off-by: Lan Luo <luola@vmware.com>
9b27aa1
to
2fd6842
Compare
/test-multicluster-e2e |
/test-integration |
/test-integration |
Refine ServiceExport controller to watch Endpoints events to
ensure that Service kind of ResourceExport can be removed
when the exported Service no longer has available Endpoints,
or skip writing ResourceExport if there is no Endpoints at the beginning.
Fixes #4055
Signed-off-by: Lan Luo luola@vmware.com