
TaskRun retries create extra Pods #1976

Closed
imjasonh opened this issue Jan 29, 2020 · 7 comments
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@imjasonh
Member

imjasonh commented Jan 29, 2020

Expected Behavior

Retrying a TaskRun N times within a PipelineRun should create at most N Pods to execute each attempt.

Actual Behavior

The TaskRunStatus reports the statuses of N retries, but in reality more than N Pods are created.

The e2e test added in #1975 reports this:

    retry_test.go:111: Found 7 Pods, want 5
    retry_test.go:115: BUG: TaskRunStatus.RetriesStatus did not report pod name "retry-pipeline-retry-me-2cpn8-pod-cjfnn"
    retry_test.go:118: BUG: Pod "retry-pipeline-retry-me-2cpn8-pod-cjfnn" is not failed: Running
    retry_test.go:115: BUG: TaskRunStatus.RetriesStatus did not report pod name "retry-pipeline-retry-me-2cpn8-pod-gpjf4"

The TaskRun was configured to retry 5 times, but created 7 pods (sometimes it's only 6), one of which was still Running at the time the test listed Pods.

Steps to Reproduce the Problem

Run go test -tags=e2e ./test -run=Retry and observe logs like those above. Change t.Log to t.Error to make the test fail and see a full K8s object dump.

@vdemeester
Member

/kind bug

@tekton-robot tekton-robot added the kind/bug Categorizes issue or PR as related to a bug. label Jan 29, 2020
@vincent-pli
Member

@imjasonh
I tried the case, and the issue sometimes occurs.
The direct cause is in the TaskRun reconcile: in the retry case, after creating a new Pod, the reconciler tries to update the TaskRun at the end of reconcile but fails with

    Failed to update taskRun status, the object has been modified; please apply your changes to the latest version and try again

The TaskRun then stays in the workqueue and is reconciled again after an increased delay; since the pod name is still empty (the update failed last time), a new Pod is created.

I have not figured out why the status update error occurs, but I think the reconcile logic should be enhanced to avoid this problem.
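A minimal sketch of that failure pattern, using assumed stub types and helper names (Reconciler, createPod, updateStatus) rather than the actual Tekton reconciler code:

    package main

    import (
        "context"
        "errors"
        "fmt"
    )

    // Stub types standing in for the real Tekton/Kubernetes objects.
    type TaskRunStatus struct{ PodName string }
    type TaskRun struct{ Status TaskRunStatus }
    type Pod struct{ Name string }

    type Reconciler struct {
        createPod    func(context.Context, *TaskRun) (*Pod, error)
        updateStatus func(context.Context, *TaskRun) error
    }

    // reconcile creates a Pod whenever the status has no Pod name recorded,
    // then persists the status. If the status update fails with a conflict,
    // the key is requeued and the next reconcile still sees an empty PodName,
    // so it creates another Pod.
    func (r *Reconciler) reconcile(ctx context.Context, tr *TaskRun) error {
        if tr.Status.PodName == "" {
            pod, err := r.createPod(ctx, tr)
            if err != nil {
                return err
            }
            tr.Status.PodName = pod.Name
        }
        return r.updateStatus(ctx, tr) // a conflict here means PodName is never persisted
    }

    func main() {
        calls := 0
        r := &Reconciler{
            createPod: func(context.Context, *TaskRun) (*Pod, error) {
                calls++
                return &Pod{Name: fmt.Sprintf("retry-me-pod-%d", calls)}, nil
            },
            updateStatus: func(context.Context, *TaskRun) error {
                return errors.New("the object has been modified; please apply your changes to the latest version and try again")
            },
        }
        // First reconcile: Pod created, status update fails, key is requeued.
        _ = r.reconcile(context.Background(), &TaskRun{})
        // The requeued reconcile reads a stale TaskRun (PodName was never saved),
        // so a second Pod is created for the same attempt.
        _ = r.reconcile(context.Background(), &TaskRun{})
        fmt.Println("pods created:", calls) // pods created: 2
    }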

BTW, the test case is not correct; please check the PR.

@bobcatfish
Collaborator

Fixed by #1996

@ghost

ghost commented Feb 6, 2020

Hey, sorry @bobcatfish, that PR only updates the test to more accurately reflect the incorrect behavior described by the bug. The bug still exists :S

@vincent-pli
Member

This issue could be closed.

@vdemeester
Member

Indeed #2022 fixed it
/close

@tekton-robot
Collaborator

@vdemeester: Closing this issue.

In response to this:

Indeed #2022 fixed it
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
