Fix: do not fail TaskRun for concurrent modification errors #7467
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -280,6 +280,12 @@ func (c *Reconciler) stopSidecars(ctx context.Context, tr *v1.TaskRun) error { | |
// it has probably evicted. We can return the error, but we consider it a permanent one. | ||
return controller.NewPermanentError(err) | ||
} else if err != nil { | ||
// It is admissible for Pods to fail with concurrentModification errors | ||
// when stopping sideCars. Instead of failing the TaskRun, we shall just | ||
// let the reconciler requeue. | ||
if isConcurrentModificationError(err) { | ||
return controller.NewRequeueAfter(time.Second) | ||
} | ||
logger.Errorf("Error stopping sidecars for TaskRun %q: %v", tr.Name, err) | ||
tr.Status.MarkResourceFailed(v1.TaskRunReasonStopSidecarFailed, err) | ||
} | ||
|
@@ -1014,6 +1020,28 @@ func isResourceQuotaConflictError(err error) bool { | |
return k8ErrStatus.Details != nil && k8ErrStatus.Details.Kind == "resourcequotas" | ||
} | ||
|
||
const ( | ||
// TODO(#7466): This is currently a local constant because the upstream dependency bump is blocked. | ||
// It should reference k8s.io/apiserver/pkg/registry/generic/registry.OptimisticLockErrorMsg | ||
// once #7464 is unblocked. | ||
optimisticLockErrorMsg = "the object has been modified; please apply your changes to the latest version and try again" | ||
) | ||
|
||
// isConcurrentModificationError determines whether it is a concurrent | ||
// modification error depending on its error type and error message. | ||
func isConcurrentModificationError(err error) bool { | ||
if !k8serrors.IsConflict(err) { | ||
Is there a reason why just checking k8serrors.IsConflict() isn't sufficient and we have to check the exact optimisticLockErrorMsg?
I think there are other cases where NewConflict() errors are thrown, e.g. an invalid storage error or a UID mismatch, where an object could be unexpectedly missing. It might be safer to retry only on concurrent modifications in this case, IIUC.
Got it, makes sense. |
||
return false | ||
} | ||
|
||
var se *k8serrors.StatusError | ||
if !errors.As(err, &se) { | ||
return false | ||
} | ||
|
||
return strings.Contains(err.Error(), optimisticLockErrorMsg) | ||
} | ||
|
||
// retryTaskRun archives taskRun.Status to taskRun.Status.RetriesStatus, and set | ||
// taskRun status to Unknown with Reason v1.TaskRunReasonToBeRetried. | ||
func retryTaskRun(tr *v1.TaskRun, message string) { | ||
|
~is 7466 a pr?~
Oh I see, #7466 this one