The failure reason of wait container is confusing #982

Closed
dtaniwaki opened this issue Sep 5, 2018 · 1 comment

Comments

@dtaniwaki
Member

Is this a BUG REPORT or FEATURE REQUEST?:

BUG REPORT

What happened:

The failure reason reported for the wait container, "failed to save outputs: verify serviceaccount default:default has necessary privileges", is confusing. It reads as if I tried to save outputs, even though I did not define any.
https://github.com/argoproj/argo/blob/master/workflow/controller/operator.go#L723

It took me a while to solve this issue because I thought I had to explicitly declare in the manifest that there were no outputs.

What you expected to happen:

The failure reason should not be confusing.

How to reproduce it (as minimally and precisely as possible):

Run an Argo workflow without first granting permissions to the default service account, that is, without running kubectl create rolebinding default-admin --clusterrole=admin --serviceaccount=default:default.

Anything else we need to know?:

I started minikube with the following options.

$ minikube start --vm-driver=hyperkit -b kubeadm --kubernetes-version v1.11.1 --cpus 2 --memory 4096

Environment:

  • Argo version:
$ argo version
argo: v2.2.0
  BuildDate: 2018-08-30T08:51:40Z
  GitCommit: af636ddd8455660f307d835814d3112b90815dfd
  GitTreeState: clean
  GitTag: v2.2.0
  GoVersion: go1.10.3
  Compiler: gc
  Platform: darwin/amd64
  • Kubernetes version:
$ kubectl version -o yaml
clientVersion:
  buildDate: 2018-04-26T04:07:23Z
  compiler: gc
  gitCommit: 9e4010d067ee4d799330980e62830b06a07296bd
  gitTreeState: clean
  gitVersion: v0.0.0-master+$Format:%h$
  goVersion: go1.10.1
  major: ""
  minor: ""
  platform: darwin/amd64
serverVersion:
  buildDate: 2018-07-17T18:43:26Z
  compiler: gc
  gitCommit: b1b29978270dc22fecc592ac55d903350454310a
  gitTreeState: clean
  gitVersion: v1.11.1
  goVersion: go1.10.3
  major: "1"
  minor: "11"
  platform: linux/amd64

Other debugging information (if applicable):

  • workflow result:
$ argo get <workflowname>
Name:                argo-chainer-mnist-tmmsh
Namespace:           default
ServiceAccount:      default
Status:              Failed
Message:             child 'argo-chainer-mnist-tmmsh-1836238827' failed
Created:             Tue Sep 04 22:24:14 +0900 (12 hours ago)
Started:             Tue Sep 04 22:24:14 +0900 (12 hours ago)
Finished:            Tue Sep 04 22:31:28 +0900 (12 hours ago)
Duration:            7 minutes 14 seconds
Parameters:
  gpu:               -1
  items:             [
  { "name": "test-1", "batchsize": 50 },
  { "name": "test-2", "batchsize": 100 }
]


STEP                                                       PODNAME                              DURATION  MESSAGE
 ✖ argo-chainer-mnist-tmmsh                                                                               child 'argo-chainer-mnist-tmmsh-1836238827' failed
 ├-·-✔ download-dataset                                    argo-chainer-mnist-tmmsh-3057814570  2m
 | └-✔ download-script                                     argo-chainer-mnist-tmmsh-1282354929  7s
 ├-·-✔ training(0:batchsize:%!s(float64=50),name:test-1)   argo-chainer-mnist-tmmsh-4234717127  3m
 | └-✔ training(1:batchsize:%!s(float64=100),name:test-2)  argo-chainer-mnist-tmmsh-731144559   2m
 └---⚠ evaluation                                          argo-chainer-mnist-tmmsh-1836238827  17s       failed to save outputs: verify serviceaccount default:default has necessary privileges
  • executor logs:
$ kubectl logs <failedpodname> -c wait
time="2018-09-04T13:31:28Z" level=info msg="Creating a docker executor"
time="2018-09-04T13:31:28Z" level=info msg="Executor (version: v2.2.0, build_date: 2018-08-30T08:52:54Z) initialized with template:\narchiveLocation: {}\ninputs: {}\nmetadata:\n  labels:\n    app: argo-chainer-mnist\nname: evaluation\noutputs: {}\nscript:\n  command:\n  - python3\n  image: chainer/chainer:v4.4.0-python3\n  name: \"\"\n  resources: {}\n  source: |\n    import json, os, operator, sys\n    results = {}\n    d = \"/mnt/vol\"\n    for p in [p for p in os.listdir(d) if os.path.isdir(os.path.join(d, p))]:\n        with open(\"%s/%s/log\" % (d, p)) as fp:\n            array = json.load(fp)\n            results[p] = array[-1][\"main/accuracy\"]\n    name = max(results.items(), key=operator.itemgetter(1))[0]\n    print(\"%s performed the most!\" % name)\n    sys.exit(0)\n  volumeMounts:\n  - mountPath: /mnt/vol\n    name: workdir\n"
time="2018-09-04T13:31:28Z" level=info msg="Waiting on main container"
time="2018-09-04T13:31:28Z" level=warning msg="Failed to get pod 'argo-chainer-mnist-tmmsh-1836238827': pods \"argo-chainer-mnist-tmmsh-1836238827\" is forbidden: User \"system:serviceaccount:default:default\" cannot get pods in the namespace \"default\""
time="2018-09-04T13:31:28Z" level=info msg="No sidecars"
time="2018-09-04T13:31:28Z" level=info msg="No output artifacts"
time="2018-09-04T13:31:28Z" level=info msg="No output parameters"
time="2018-09-04T13:31:28Z" level=info msg="Capturing script output"
time="2018-09-04T13:31:28Z" level=warning msg="Failed to get pod 'argo-chainer-mnist-tmmsh-1836238827': pods \"argo-chainer-mnist-tmmsh-1836238827\" is forbidden: User \"system:serviceaccount:default:default\" cannot get pods in the namespace \"default\""
time="2018-09-04T13:31:28Z" level=info msg="Alloc=3248 TotalAlloc=9597 Sys=8774 NumGC=3 Goroutines=5"
time="2018-09-04T13:31:28Z" level=fatal msg="pods \"argo-chainer-mnist-tmmsh-1836238827\" is forbidden: User \"system:serviceaccount:default:default\" cannot get pods in the namespace \"default\"\ngithub.com/argoproj/argo/errors.Wrap\n\t/root/go/src/github.com/argoproj/argo/errors/errors.go:87\ngithub.com/argoproj/argo/errors.InternalWrapError\n\t/root/go/src/github.com/argoproj/argo/errors/errors.go:70\ngithub.com/argoproj/argo/workflow/executor.(*WorkflowExecutor).getPod\n\t/root/go/src/github.com/argoproj/argo/workflow/executor/executor.go:504\ngithub.com/argoproj/argo/workflow/executor.(*WorkflowExecutor).GetMainContainerStatus\n\t/root/go/src/github.com/argoproj/argo/workflow/executor/executor.go:546\ngithub.com/argoproj/argo/workflow/executor.(*WorkflowExecutor).GetMainContainerID\n\t/root/go/src/github.com/argoproj/argo/workflow/executor/executor.go:563\ngithub.com/argoproj/argo/workflow/executor.(*WorkflowExecutor).CaptureScriptResult\n\t/root/go/src/github.com/argoproj/argo/workflow/executor/executor.go:580\ngithub.com/argoproj/argo/cmd/argoexec/commands.waitContainer\n\t/root/go/src/github.com/argoproj/argo/cmd/argoexec/commands/wait.go:55\ngithub.com/argoproj/argo/cmd/argoexec/commands.glob..func4\n\t/root/go/src/github.com/argoproj/argo/cmd/argoexec/commands/wait.go:19\ngithub.com/argoproj/argo/vendor/github.com/spf13/cobra.(*Command).execute\n\t/root/go/src/github.com/argoproj/argo/vendor/github.com/spf13/cobra/command.go:766\ngithub.com/argoproj/argo/vendor/github.com/spf13/cobra.(*Command).ExecuteC\n\t/root/go/src/github.com/argoproj/argo/vendor/github.com/spf13/cobra/command.go:852\ngithub.com/argoproj/argo/vendor/github.com/spf13/cobra.(*Command).Execute\n\t/root/go/src/github.com/argoproj/argo/vendor/github.com/spf13/cobra/command.go:800\nmain.main\n\t/root/go/src/github.com/argoproj/argo/cmd/argoexec/main.go:15\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:198\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:2361"
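The actual cause is visible only in the wait container logs above, not in the workflow message. As a rough illustration, the denial can be pulled out of those logs with a small script. This is only a sketch: the names FORBIDDEN, Denial and find_denial are made up for this example and are not part of Argo or kubectl, and the pattern assumes the denial is worded as in the log lines shown here.

```python
import re
from typing import NamedTuple, Optional

# Matches the RBAC denial text as it appears in the wait container logs above.
# Quotes may or may not be backslash-escaped, depending on how the logs were copied.
FORBIDDEN = re.compile(
    r'User \\?"system:serviceaccount:(?P<sa_ns>[^:"\\]+):(?P<sa_name>[^"\\]+)\\?"'
    r' cannot (?P<verb>\w+) (?P<resource>\S+)'
    r' in the namespace \\?"(?P<ns>[^"\\]+)\\?"'
)


class Denial(NamedTuple):
    """Hypothetical container for the parts of one denial message."""
    service_account: str  # "<namespace>:<name>"
    verb: str
    resource: str
    namespace: str


def find_denial(log_text: str) -> Optional[Denial]:
    """Return the first RBAC denial found in the log text, or None."""
    m = FORBIDDEN.search(log_text)
    if m is None:
        return None
    return Denial(
        service_account=f"{m['sa_ns']}:{m['sa_name']}",
        verb=m["verb"],
        resource=m["resource"],
        namespace=m["ns"],
    )


if __name__ == "__main__":
    # One warning line copied from the executor logs in this report.
    sample = (
        r'time="2018-09-04T13:31:28Z" level=warning msg="Failed to get pod '
        r"'argo-chainer-mnist-tmmsh-1836238827': pods "
        r'\"argo-chainer-mnist-tmmsh-1836238827\" is forbidden: User '
        r'\"system:serviceaccount:default:default\" cannot get pods in the '
        r'namespace \"default\""'
    )
    print(find_denial(sample))
```

For the logs in this report, the extracted fields say that service account default:default cannot get pods in namespace default, which points to the missing role binding rather than to anything about outputs.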
@stale

stale bot commented Jul 2, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix label Jul 2, 2020
@stale stale bot closed this as completed Jul 9, 2020