Fail fast on invalid image #6982

Merged Jul 31, 2023 (1 commit)

afrittoli
Member

Changes

Kubernetes treats an invalid image reference in a Pod as a potentially ephemeral error: even if the image reference is not syntactically valid, users may update the image without recreating the Pod.

Tekton, however, uses Pods to run workloads to completion, and users are not allowed to change the specification of steps during execution.

This commit changes the handling of the InvalidImageName Pod reason so that the TaskRun is marked as failed and the Pod is deleted.

Fixes: #6105
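
The decision described above can be sketched in a few lines of Go. This is a minimal illustration, not the code from this PR: the `waitingState` type, the `permanentReasons` table and the `shouldFailFast` helper are hypothetical names, and the sketch assumes only that the reconciler can read the waiting reason that Kubernetes reports for a step container. `ContainerCreating` is used as an example of a transient reason.

```go
package main

import "fmt"

// waitingState is a hypothetical, trimmed-down stand-in for the waiting
// state that Kubernetes reports for a container.
type waitingState struct {
	Reason  string
	Message string
}

// permanentReasons lists the waiting reasons this sketch treats as
// unrecoverable. Only InvalidImageName is taken from the PR description.
var permanentReasons = map[string]bool{
	"InvalidImageName": true,
}

// shouldFailFast reports whether a step stuck in the given waiting state
// should cause the run to be failed immediately instead of retried.
func shouldFailFast(w *waitingState) bool {
	return w != nil && permanentReasons[w.Reason]
}

func main() {
	states := []*waitingState{
		{Reason: "InvalidImageName", Message: "invalid reference format"},
		{Reason: "ContainerCreating"},
		nil, // container is not waiting at all
	}
	for _, s := range states {
		fmt.Println(shouldFailFast(s))
	}
}
```

Keeping the permanent reasons in a lookup table means that a new reason can be added without touching the decision logic.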

Submitter Checklist

As the author of this PR, please check off the items in this checklist:

  • Has Docs if any changes are user facing, including updates to minimum requirements e.g. Kubernetes version bumps
  • Has Tests included if any functionality added or changed
  • Follows the commit message standard
  • Meets the Tekton contributor standards (including functionality, content, code)
  • Has a kind label. You can add one by adding a comment on this PR that contains /kind <type>. Valid types are bug, cleanup, design, documentation, feature, flake, misc, question, tep
  • Release notes block below has been updated with any user facing changes (API changes, bug fixes, changes requiring upgrade notices or deprecation warnings). See some examples of good release notes.
  • Release notes contains the string "action required" if the change requires additional action from users switching to the new release

Release Notes

The Pod reason InvalidImageName is now treated as a permanent error, so TaskRuns that include a step with an invalid image reference fail immediately and the corresponding Pod is deleted.

/kind bug

@tekton-robot added labels Jul 27, 2023: kind/bug (Categorizes issue or PR as related to a bug), release-note (Denotes a PR that will be considered when it comes time to generate release notes), size/L (Denotes a PR that changes 100-499 lines, ignoring generated files)
@afrittoli added this to the Pipelines v0.51 milestone Jul 27, 2023
@tekton-robot requested review from jerop and wlynch July 27, 2023 12:27
@@ -2314,65 +2314,36 @@ status:
}
}

func TestReconcilePodFailuresStepImagePullFailed(t *testing.T) {
Member Author


I turned the next test into a matrix test, and the two cases in here are included in the next test now.
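
For readers unfamiliar with the pattern, a matrix (table-driven) test keeps each scenario as one row of a table and runs a single loop over the rows. The sketch below shows the shape only. The `matrixCase` type, the `failsTaskRun` stand-in and the two rows are hypothetical and are not the cases from this PR's test file.

```go
package main

import "fmt"

// matrixCase is one row of the table: an input and the expected outcome.
type matrixCase struct {
	name       string
	reason     string // container waiting reason reported by the Pod
	wantFailed bool   // whether the run is expected to be marked failed
}

// failsTaskRun is a hypothetical stand-in for the code under test.
func failsTaskRun(reason string) bool {
	return reason == "InvalidImageName"
}

// runMatrix evaluates every row and returns a description of each mismatch.
func runMatrix(cases []matrixCase) []string {
	var mismatches []string
	for _, tc := range cases {
		if got := failsTaskRun(tc.reason); got != tc.wantFailed {
			mismatches = append(mismatches,
				fmt.Sprintf("%s: got %t, want %t", tc.name, got, tc.wantFailed))
		}
	}
	return mismatches
}

func main() {
	cases := []matrixCase{
		{name: "invalid image name", reason: "InvalidImageName", wantFailed: true},
		{name: "container still creating", reason: "ContainerCreating", wantFailed: false},
	}
	fmt.Println(len(runMatrix(cases)), "mismatches")
}
```

In a real Go test file, the loop body would normally be wrapped in `t.Run(tc.name, ...)` from the standard `testing` package so that each row is reported as its own subtest.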

@afrittoli
Member Author

@chmouel FYI

@tekton-robot
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
pkg/reconciler/taskrun/taskrun.go 85.2% 85.3% 0.1

@tekton-robot
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage-df to re-run this coverage report

File Old Coverage New Coverage Delta
pkg/reconciler/taskrun/taskrun.go 85.2% 85.3% 0.1

@afrittoli
Member Author

Example TaskRun with the fix in:

➜ tkn tr describe step-script-svs9s
Name:              step-script-svs9s
Namespace:         default
Service Account:   default
Timeout:           1h0m0s
Labels:
 app.kubernetes.io/managed-by=tekton-pipelines

🌡️  Status

STARTED         DURATION    STATUS
3 minutes ago   42s         Failed(TaskRunImagePullFailed)

Message

The step "bash" in TaskRun "step-script-svs9s" failed to pull the image "". The pod errored with the message: "Failed to apply default image tag "ubuntu@1234:v45@12fres": couldn't parse image reference "ubuntu@1234:v45@12fres": invalid reference format."

🦶 Steps

 NAME                          STATUS
 ∙ bash                        TaskRunImagePullFailed

The Kubernetes Pod treats invalid image failures as potentially
ephemeral errors, because even if the format of the image reference
is not syntactically correct, users may update the image without
recreating the Pod.

Tekton, however, uses Pods to provide workloads that run to completion,
and users are not allowed to change the specification of steps
during execution.

This commit changes the handling of the InvalidImageName Pod reason,
so that the TaskRun is marked as failed and the Pod is deleted.

Fixes: tektoncd#6105

Signed-off-by: Andrea Frittoli <andrea.frittoli@uk.ibm.com>
@tekton-robot added the approved label (Indicates a PR has been approved by an approver from all required OWNERS files) Jul 27, 2023
@QuanZhang-William left a comment
Member

Thanks! Changes look good to me

@JeromeJu left a comment
Member

Thanks @afrittoli, just a very small nit on "kubernetes" in the commit message.

@tekton-robot
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: JeromeJu, vdemeester

@afrittoli
Member Author

@JeromeJu @vdemeester can I get an lgtm?

@JeromeJu
Member

JeromeJu commented Jul 31, 2023

/lgtm
Sorry Andrea, I wanted to have more eyes on this earlier. This looks good to me.

@tekton-robot added the lgtm label (Indicates that a PR is ready to be merged) Jul 31, 2023
@tekton-robot merged commit 09c7594 into tektoncd:main Jul 31, 2023
Linked issue: TaskRun with InvalidImageName runs forever