`timeouts` is not working as expected. #6137

VeereshAradhya · 2023-02-09T15:15:09Z

Expected Behavior

When provided timeouts for pipeline and task in timeout section of pipelinerun, both pipelinerun and taskrun timeout should get updated

Actual Behavior

When provided timeouts for pipeline and task in timeouts section of pipelinerun, only the pipelinerun timeout is updated and the taskrun gets created with default timeout

Steps to Reproduce the Problem

Create a task and pipeline
Create pipelierun with timeouts section defined

Additional Info

Kubernetes version:

Output of kubectl version:

 Client Version: version.Info{Major:"1", Minor:"25", GitVersion:"v1.25.4", GitCommit:"872a965c6c6526caa949f0c6ac028ef7aff3fb78", GitTreeState:"archive", BuildDate:"2022-11-11T00:00:00Z", GoVersion:"go1.19.3", Compiler:"gc", Platform:"linux/amd64"}
Kustomize Version: v4.5.7
 Server Version: version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.6+263df15", GitCommit:"eddac29feb4bb46b99fb570999324e582d761a66", GitTreeState:"clean", BuildDate:"2023-01-23T21:00:20Z", GoVersion:"go1.18.7", Compiler:"gc", Platform:"linux/amd64"}

Tekton Pipeline version:

Output of tkn version or kubectl get pods -n tekton-pipelines -l app=tekton-pipelines-controller -o=jsonpath='{.items[0].metadata.labels.version}'

Client version: 0.28.0
Pipeline version: v0.41.0
Triggers version: v0.22.0
Operator version: v0.63.0

[varadhya@fedora plumbing]$ oc get tektonconfig config -o yaml | grep timeout 
    default-timeout-minutes: 5
[varadhya@fedora plumbing]$ cat pr.yaml
apiVersion: tekton.dev/v1beta1
kind: PipelineRun
metadata:
  generateName: test-timeout-run-
spec:
  timeouts:
    pipeline: 2h
    tasks: 1h
  pipelineRef:
    name: test-timeout
[varadhya@fedora plumbing]$ oc create -f pr.yaml
pipelinerun.tekton.dev/test-timeout-run-n64qf created
[varadhya@fedora plumbing]$ tkn pr describe -L
Name:              test-timeout-run-n64qf
Namespace:         test-timeout
Pipeline Ref:      test-timeout
Service Account:   pipeline
Labels:
 tekton.dev/pipeline=test-timeout

🌡️  Status

STARTED         DURATION   STATUS
6 seconds ago   ---        Running

⏱  Timeouts
 Pipeline:   2h0m0s
 Tasks:      1h0m0s

🗂  Taskruns

 NAME                                    TASK NAME      STARTED         DURATION   STATUS
 ∙ test-timeout-run-n64qf-example-task   example-task   6 seconds ago   ---        Running
[varadhya@fedora plumbing]$ tkn tr describe -L
Name:              test-timeout-run-n64qf-example-task
Namespace:         test-timeout
Task Ref:          example-task
Service Account:   pipeline
Timeout:           5m0s
Labels:
 app.kubernetes.io/managed-by=tekton-pipelines
 tekton.dev/memberOf=tasks
 tekton.dev/pipeline=test-timeout
 tekton.dev/pipelineRun=test-timeout-run-n64qf
 tekton.dev/pipelineTask=example-task
 tekton.dev/task=example-task
Annotations:
 pipeline.tekton.dev/release=e38d112

🌡️  Status

STARTED          DURATION    STATUS
25 seconds ago   ---         Running

⚓ Params

 NAME        VALUE
 ∙ appName   123

🦶 Steps

 NAME          STATUS
 ∙ unnamed-0   Running

The text was updated successfully, but these errors were encountered:

lbernick · 2023-02-09T15:34:03Z

thanks for filing @VeereshAradhya! The "tasks" timeout for a PipelineRun is for the cumulative time taken by all of its tasks. If you'd like to set a timeout for an individual task, you can use pipeline.spec.tasks[].timeout. I'm going to close out this issue but please feel free to reopen if you have follow-up questions.

VeereshAradhya · 2023-02-09T16:00:16Z

The pipeline that I used had only one task. In the below example, I have specified task timeout as 2hours (cumulative) so the taskrun should have timeout of 2hours right?

cat pr.yaml
apiVersion: tekton.dev/v1beta1
kind: PipelineRun
metadata:
  generateName: timeout-pipeline-run-
spec:
  timeouts:
    pipeline: 3h
    tasks: 2h
  pipelineRef:
    name: timeout-pipeline
status: {}
[varadhya@192 plumbing]$ tkn pr describe -L
Name:              timeout-pipeline-run-hml2q
Namespace:         test-timeout
Pipeline Ref:      timeout-pipeline
Service Account:   pipeline
Labels:
 tekton.dev/pipeline=timeout-pipeline

🌡️  Status

STARTED          DURATION   STATUS
20 seconds ago   ---        Running

⏱  Timeouts
 Pipeline:   3h0m0s
 Tasks:      2h0m0s

🗂  Taskruns

 NAME                                        TASK NAME      STARTED          DURATION   STATUS
 ∙ timeout-pipeline-run-hml2q-example-task   example-task   20 seconds ago   ---        Running
[varadhya@192 plumbing]$ tkn tr describe -L
Name:              timeout-pipeline-run-hml2q-example-task
Namespace:         test-timeout
Task Ref:          example-task
Service Account:   pipeline
Timeout:           1h0m0s
Labels:
 app.kubernetes.io/managed-by=tekton-pipelines
 tekton.dev/memberOf=tasks
 tekton.dev/pipeline=timeout-pipeline
 tekton.dev/pipelineRun=timeout-pipeline-run-hml2q
 tekton.dev/pipelineTask=example-task
 tekton.dev/task=example-task
Annotations:
 pipeline.tekton.dev/release=e38d112

🌡️  Status

STARTED          DURATION    STATUS
26 seconds ago   ---         Running

⚓ Params

 NAME        VALUE
 ∙ appName   123

🦶 Steps

 NAME          STATUS
 ∙ unnamed-0   Running

VeereshAradhya · 2023-02-09T16:03:30Z

I tested the same thing on 0.37.5 and the behaviour is different. The taskrun timeout is 2h (1h59m59.990998163s)

$ cat pr.yaml
apiVersion: tekton.dev/v1beta1
kind: PipelineRun
metadata:
  generateName: test-timeout-run-
spec:
  timeouts:
    pipeline: 3h
    tasks: 2h
  pipelineRef:
    name: test-timeout
[varadhya@192 plumbing]$ 
[varadhya@192 plumbing]$ tkn pr describe -L
Name:              test-timeout-run-2x9bd
Namespace:         test-timeout
Pipeline Ref:      test-timeout
Service Account:   pipeline
Labels:
 tekton.dev/pipeline=test-timeout

🌡️  Status

STARTED          DURATION   STATUS
21 seconds ago   ---        Running

⏱  Timeouts
 Pipeline:   3h0m0s
 Tasks:      2h0m0s

🗂  Taskruns

 NAME                                    TASK NAME      STARTED          DURATION   STATUS
 ∙ test-timeout-run-2x9bd-example-task   example-task   21 seconds ago   ---        Running
[varadhya@192 plumbing]$ 
[varadhya@192 plumbing]$ tkn tr describe -L
Name:              test-timeout-run-2x9bd-example-task
Namespace:         test-timeout
Task Ref:          example-task
Service Account:   pipeline
Timeout:           1h59m59.990998163s
Labels:
 app.kubernetes.io/managed-by=tekton-pipelines
 tekton.dev/memberOf=tasks
 tekton.dev/pipeline=test-timeout
 tekton.dev/pipelineRun=test-timeout-run-2x9bd
 tekton.dev/pipelineTask=example-task
 tekton.dev/task=example-task
Annotations:
 pipeline.tekton.dev/release=1759eab

🌡️  Status

STARTED          DURATION    STATUS
29 seconds ago   ---         Running

🦶 Steps

 NAME          STATUS
 ∙ unnamed-0   Running
[varadhya@192 plumbing]$ 
[varadhya@192 plumbing]$ 
[varadhya@192 plumbing]$ tkn version
Client version: 0.28.0
Pipeline version: v0.37.5
Triggers version: v0.20.2
Operator version: v0.60.1

VeereshAradhya · 2023-02-09T16:04:40Z

/reopen

tekton-robot · 2023-02-09T16:04:43Z

@VeereshAradhya: Reopened this issue.

In response to this:

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

lbernick · 2023-02-09T18:32:10Z

Yes, we changed the behavior of timeouts between the versions you've listed here. We previously applied the time remaining from timeouts.tasks as the timeout to each taskrun. Now, the PipelineRun controller will cancel any running tasks after timeouts.tasks has elapsed. (This helped fix a race condition plus issues with retried taskruns.) The tasks timeout isn't applied to the individual taskrun timeout (even if there's only one) but the taskRuns will still be canceled after the tasks timeout has elapsed.

vdemeester · 2023-02-10T09:31:39Z

Oh this is actually confusing now.
When I naively look at timeouts.pipeline and timeouts.tasks, I really thing the one on tasks is for "each" tasks. I asked around a bit (friends, customers, …) and they all thought the same. In addition, what is the difference between timeouts.tasks and timeouts.pipelines now ? If timeout.tasks is "for the cumulative time taken by all of its tasks" and, a pipeline is made out of Tasks, what is the difference with timeouts.pipeline (which is for the whole pipeline, which is the "cumulation" of Tasks) ?

Yes, we changed the behavior of timeouts between the versions you've listed here.

Was there any deprecation or feature-flags here (like when we changed the when behavior) ? It is breaking some of our workloads (and possible our customers as well) because of that behavior change.

AlanGreene · 2023-02-10T09:59:20Z

timeouts.tasks is for non-finally tasks, timeouts.pipeline includes both regular tasks and finally tasks
https://tekton.dev/docs/pipelines/pipelineruns/#configuring-a-failure-timeout

The documentation could probably make this clearer that tasks is the cumulative timeout for all non-finally tasks, not each individual task.

lbernick · 2023-02-10T13:04:51Z

@vdemeester can you give more detail on what workloads were breaking due to this change? It was introduced in #5134

vdemeester · 2023-02-10T15:11:09Z

@lbernick On our downstream pipelines, we used to have a given amount for tasks to timeout (using timeouts.tasks) that was different from the default TaskRun timeout. With the upgrade, now the timeout is the default (which is lower than the one we used to have with timeout.tasks), and thus we have our pipeline failed because of this. It is definitely fixable in the Pipeline/PipelineRun definitions, it's just that it did break some existing behaviour on upgrade.

lbernick · 2023-02-10T21:23:30Z

Not sure I understand-- I would expect the pipelinerun controller to time out the tasks after timeouts.tasks elapses, even if that's shorter than the default taskrun timeout. Is there an example PipelineRun that passed prior to these changes that fails now?

vdemeester · 2023-02-14T16:50:18Z

Not sure I understand-- I would expect the pipelinerun controller to time out the tasks after timeouts.tasks elapses, even if that's shorter than the default taskrun timeout. Is there an example PipelineRun that passed prior to these changes that fails now?

This is what @VeereshAradhya wrote above to be fair. With timeouts.tasks = 2h, in 0.37.x it would set the TaskRun timeout to 2h, now it sets it to 1h (because it uses the default timeout for all tasks). This breaks because we relied on that timeouts.tasks to set that timeout for 2h for each TaskRun (well here it's only one but still). Now it changed 🙃

lbernick · 2023-02-14T17:00:36Z

Ah I see, sorry I didn't quite understand this before! I doubt we'd want to revert these changes; is there anything you'd suggest doing to mitigate the impact? We could definitely edit the release notes to make the impact more clear.

vdemeester · 2023-02-14T17:08:42Z

Ah I see, sorry I didn't quite understand this before! I doubt we'd want to revert these changes; is there anything you'd suggest doing to mitigate the impact?

Oh yeah, no really asking to revert the change (yet ? 😈). It's more "for next time we need to be more careful". The "naming" is still a bit confusing to me (timeouts.alltasks or something would have made more sense to me) but, I also don't think that's a huge deal.

We could definitely edit the release notes to make the impact more clear.

👍🏼

lbernick · 2023-02-15T14:12:05Z

@VeereshAradhya PTAL at #6171 and let me know if these updated docs are clearer!

lbernick · 2023-02-15T14:18:53Z

Also, release notes have been updated.

VeereshAradhya added the kind/bug Categorizes issue or PR as related to a bug. label Feb 9, 2023

VeereshAradhya changed the title ~~timeouts field not working as expected.~~ timeouts field is not working as expected. Feb 9, 2023

VeereshAradhya changed the title ~~timeouts field is not working as expected.~~ timeouts is not working as expected. Feb 9, 2023

lbernick closed this as completed Feb 9, 2023

tekton-robot reopened this Feb 9, 2023

EmmaMunley mentioned this issue Feb 9, 2023

Added Skip Logic: Matrix parameters cannot be empty arrays #6140

Merged

5 tasks

lbernick mentioned this issue Feb 15, 2023

Clarify docs on pipelinerun timeouts #6171

Merged

4 tasks

tekton-robot closed this as completed in #6171 Mar 1, 2023

lbernick mentioned this issue Apr 12, 2023

TaskRun timeout is always set to 1h0m0s after v0.40 #6520

Closed

vdemeester mentioned this issue Mar 22, 2024

Timeout not working on Pipeline run #7793

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

`timeouts` is not working as expected. #6137

`timeouts` is not working as expected. #6137

VeereshAradhya commented Feb 9, 2023 •

edited

Loading

lbernick commented Feb 9, 2023

VeereshAradhya commented Feb 9, 2023

VeereshAradhya commented Feb 9, 2023 •

edited

Loading

VeereshAradhya commented Feb 9, 2023

tekton-robot commented Feb 9, 2023

lbernick commented Feb 9, 2023

vdemeester commented Feb 10, 2023

AlanGreene commented Feb 10, 2023

lbernick commented Feb 10, 2023

vdemeester commented Feb 10, 2023

lbernick commented Feb 10, 2023

vdemeester commented Feb 14, 2023

lbernick commented Feb 14, 2023

vdemeester commented Feb 14, 2023

lbernick commented Feb 15, 2023

lbernick commented Feb 15, 2023

timeouts is not working as expected. #6137

timeouts is not working as expected. #6137

Comments

VeereshAradhya commented Feb 9, 2023 • edited Loading

Expected Behavior

Actual Behavior

Steps to Reproduce the Problem

Additional Info

lbernick commented Feb 9, 2023

VeereshAradhya commented Feb 9, 2023

VeereshAradhya commented Feb 9, 2023 • edited Loading

VeereshAradhya commented Feb 9, 2023

tekton-robot commented Feb 9, 2023

lbernick commented Feb 9, 2023

vdemeester commented Feb 10, 2023

AlanGreene commented Feb 10, 2023

lbernick commented Feb 10, 2023

vdemeester commented Feb 10, 2023

lbernick commented Feb 10, 2023

vdemeester commented Feb 14, 2023

lbernick commented Feb 14, 2023

vdemeester commented Feb 14, 2023

lbernick commented Feb 15, 2023

lbernick commented Feb 15, 2023

`timeouts` is not working as expected. #6137

`timeouts` is not working as expected. #6137

VeereshAradhya commented Feb 9, 2023 •

edited

Loading

VeereshAradhya commented Feb 9, 2023 •

edited

Loading