Display init container error message #5645

Open
cugykw opened this issue Oct 17, 2022 · 10 comments
Assignees
Labels
kind/bug Categorizes issue or PR as related to a bug. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@cugykw
Contributor

cugykw commented Oct 17, 2022

Expected Behavior

When an init container in the pod associated with a TaskRun fails, the TaskRun status should clearly display the error message.

Actual Behavior

An init container injected by a webhook failed, causing the pod to terminate. The TaskRun
status shows only the message "build failed for unspecified reasons."

taskrun status:

status:
  completionTime: "2022-10-17T02:04:35Z"
  conditions:
  - lastTransitionTime: "2022-10-17T02:04:35Z"
    message: build failed for unspecified reasons.
    reason: Failed
    status: "False"
    type: Succeeded

pod status:

status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2022-10-17T02:04:30Z"
    message: 'containers with unready status: [xxx]'
    reason: ContainersNotReady
    status: "False"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: "2022-10-17T02:04:30Z"
    status: "True"
    type: PodScheduled
  containerStatuses:
  - image: xxx
    imageID: ""
    lastState: {}
    name: xxx
    ready: false
    restartCount: 0
    started: false
    state:
      waiting:
        reason: PodInitializing
  initContainerStatuses:
  - containerID: docker://xxx
    image: xxx
    imageID: docker-pullable://xxx
    lastState: {}
    name: xxx
    ready: false
    restartCount: 0
    state:
      terminated:
        containerID: docker://xxx
        exitCode: 1
        finishedAt: "2022-10-17T02:04:34Z"
        reason: Error
        startedAt: "2022-10-17T02:04:34Z"
  phase: Failed
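
The pod status above already carries what the TaskRun message is missing: the init container's name, exit code, and reason. As a rough sketch of how that could be pulled out by hand (for example from the `.status` of `kubectl get pod <name> -o json`), something like the following might work. This is not Tekton's implementation; the function name `init_container_failures` is hypothetical, and the field names are the ones shown in the pod status above plus the optional `message` field of a terminated container state.

```python
from typing import Any, Dict, List


def init_container_failures(pod_status: Dict[str, Any]) -> List[str]:
    """Return one readable line per init container that exited non-zero.

    Hypothetical helper: expects the `status` object of a pod as a dict.
    """
    failures: List[str] = []
    for cs in pod_status.get("initContainerStatuses") or []:
        terminated = (cs.get("state") or {}).get("terminated")
        if not terminated:
            continue  # still waiting or running, nothing to report
        exit_code = terminated.get("exitCode", 0)
        if exit_code == 0:
            continue  # init container completed successfully
        name = cs.get("name", "<unknown>")
        reason = terminated.get("reason", "Unknown")
        line = f'init container "{name}" exited with code {exit_code} ({reason})'
        detail = terminated.get("message")  # not always set by the runtime
        if detail:
            line += f": {detail}"
        failures.append(line)
    return failures


# Pod status from the report above, trimmed to the relevant fields.
pod_status = {
    "phase": "Failed",
    "initContainerStatuses": [
        {"name": "xxx", "state": {"terminated": {"exitCode": 1, "reason": "Error"}}}
    ],
}
print(init_container_failures(pod_status))
```

For the status in this report, that yields a line naming init container `xxx`, exit code 1, and reason `Error`, which is the kind of detail the expected behavior asks the TaskRun condition to surface.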

Steps to Reproduce the Problem

1. Inject a failing init container into the pod associated with a TaskRun.

Additional Info

  • Kubernetes version:

kubernetes: 1.21.3

  • Tekton Pipeline version:

pipeline version: 0.37.3

@cugykw cugykw added the kind/bug Categorizes issue or PR as related to a bug. label Oct 17, 2022
@cugykw
Contributor Author

cugykw commented Oct 17, 2022

/assign @cugykw

@tekton-robot
Collaborator

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale with a justification.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/lifecycle stale

Send feedback to tektoncd/plumbing.

@tekton-robot tekton-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 15, 2023
@tekton-robot
Collaborator

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten with a justification.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/lifecycle rotten

Send feedback to tektoncd/plumbing.

@tekton-robot tekton-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Feb 14, 2023
@tekton-robot
Collaborator

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen with a justification.
Mark the issue as fresh with /remove-lifecycle rotten with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/close

Send feedback to tektoncd/plumbing.

@tekton-robot
Collaborator

@tekton-robot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen with a justification.
Mark the issue as fresh with /remove-lifecycle rotten with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/close

Send feedback to tektoncd/plumbing.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@AeroNotix

I'm having this issue and I can't figure out how to resolve it. Sometimes my jobs fail because something happened in the init container steps, and the only output I get is "build failed for unspecified reasons".

Is this fixed in newer versions, or is there a workaround?

@afrittoli
Member

@AeroNotix I'm sorry to hear you're experiencing this issue.
This issue was closed by our bot because it was old and had no activity on it for a long time.
Which version of Tekton are you using? If you're experiencing this feel free to reopen the issue and share more details if applicable.

@AeroNotix

Using v0.33.2.

I'm aware this isn't the absolute latest, but I want to avoid blindly updating things unless I know it is actually going to improve anything.

@afrittoli
Member

Sure, I understand. Do you have a reproducer for the issue that could be tried on a more recent version, to check if the issue still exists? We would not be able to produce a fix for that version, but if the issue still exists, it could be fixed on a newer version. See which versions we support: https://github.com/tektoncd/pipeline/blob/main/releases.md

@afrittoli afrittoli reopened this Oct 7, 2024
@github-project-automation github-project-automation bot moved this from Done to In Progress in Tekton Community Roadmap Oct 7, 2024
@AeroNotix

I don't have a task spec I can hand over that reproduces this. The job runs successfully 999 times out of 1000, except when we attempt to launch several tasks concurrently. Typically this task is run on demand by humans; however, we also have automation that launches the same task with varying parameters without human intervention. For whatever reason, only these automated tasks fail. When I launch the task again manually with the exact same parameters, it fails with the above error.
