docker-build test task is failing #481

Closed
chmouel opened this issue Aug 10, 2020 · 13 comments
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@chmouel
Member

chmouel commented Aug 10, 2020

Expected Behavior

No failure :)

Actual Behavior

The test sometimes fails:

FAILED: docker-build task has failed to comeback properly

The main issue seems to be this:

+ docker build --no-cache -f ./Dockerfile -t localhost:5000/nocode .
unable to resolve docker endpoint: open /certs/client/ca.pem: no such file or directory

Steps to Reproduce the Problem

  1. Check the log here from this issue

Additional Info

@chmouel
Member Author

chmouel commented Aug 10, 2020

/kind bug

@tekton-robot tekton-robot added the kind/bug Categorizes issue or PR as related to a bug. label Aug 10, 2020
@chmouel
Member Author

chmouel commented Aug 10, 2020

cc @popcor255 @PuneetPunamiya @imjasonh @vdemeester

(people in the owners file for docker-build task)

FYI, it is breaking the catalog CI.

@imjasonh
Member

This is unexpected, since the Task's sidecar shouldn't report as ready until the certs are generated and available:

readinessProbe:
  periodSeconds: 1
  exec:
    command: ['ls', '/certs/client/ca.pem']
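
To make the probe's semantics concrete, here is a minimal Python sketch of the behaviour it describes: run the check once per period and report ready only when the file exists. The function name `wait_until_ready` and the timeout parameter are hypothetical and exist only for illustration; this is not how Kubernetes or Tekton implement probes.

```python
import os
import tempfile
import time


def wait_until_ready(path, period_seconds=1.0, timeout_seconds=30.0):
    """Poll until `path` exists, mimicking an exec probe that runs `ls <path>`.

    Returns True once the file exists, False if the timeout expires first.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        if os.path.exists(path):  # `ls <path>` exits 0 only when the path exists
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(period_seconds)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        ca = os.path.join(tmp, "ca.pem")
        # Not ready yet: the certificate has not been written.
        print(wait_until_ready(ca, period_seconds=0.01, timeout_seconds=0.05))
        open(ca, "w").close()
        # Ready: the file now exists.
        print(wait_until_ready(ca, period_seconds=0.01, timeout_seconds=0.05))
```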

@popcor255
Member

popcor255 commented Aug 10, 2020

I cannot reproduce the error. @chmouel, could you reproduce it?

@vinamra28
Member

vinamra28 commented Aug 11, 2020

@popcor255 could you please clone #485 and then try reproducing the problem? For now I have skipped the tests in the CI e2e.

@chmouel
Member Author

chmouel commented Aug 11, 2020

This is weird; maybe there is a race going on. I can definitely reproduce it every time when running the test manually:

Manual run output:
         name: output
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
+ echo '--- Container Logs'
--- Container Logs
++ kubectl get pod -o name -n docker-build-0-1
+ for pod in $(kubectl get pod -o name -n ${tns})
+ kubectl logs --all-containers -n docker-build-0-1 pod/affinity-assistant-f5130924a6-0
Error from server (BadRequest): container "affinity-assistant" in pod "affinity-assistant-f5130924a6-0" is waiting to start: ContainerCreating
+ true
+ for pod in $(kubectl get pod -o name -n ${tns})
+ kubectl logs --all-containers -n docker-build-0-1 pod/docker-test-pipeline-run-docker-build-ckphn-pod-ws67h
+ docker build --no-cache -f ./Dockerfile -t localhost:5000/nocode .
unable to resolve docker endpoint: open /certs/client/ca.pem: no such file or directory
2020/08/11 13:33:08 Skipping step because a previous step failed
2020/08/11 13:33:14 Exiting...
+ for pod in $(kubectl get pod -o name -n ${tns})
+ kubectl logs --all-containers -n docker-build-0-1 pod/docker-test-pipeline-run-fetch-repository-9nsnz-pod-9b4jw
+ CHECKOUT_DIR=/workspace/output/
+ '[[' true '==' true ]]
+ cleandir
+ '[[' -d /workspace/output/ ]]
+ rm -rf /workspace/output//lost+found
+ rm -rf '/workspace/output//.[!.]*'
+ rm -rf '/workspace/output//..?*'
+ test -z
+ test -z
+ test -z
+ /ko-app/git-init -url https://github.com/popcor255/nocode -revision master -refspec  -path /workspace/output/ '-sslVerify=true' '-submodules=true' -depth 1
{"level":"info","ts":1597152766.533512,"caller":"git/git.go:139","msg":"Successfully cloned https://github.com/popcor255/nocode @ ef1a65b2f8e9a0cbf15c24c03ff8d202620512c5 (grafted, HEAD, origin/master) in path /workspace/output/"}
{"level":"info","ts":1597152766.5688412,"caller":"git/git.go:180","msg":"Successfully initialized and updated submodules in path /workspace/output/"}
+ cd /workspace/output/
+ git rev-parse HEAD
+ tr -d '\n'
+ RESULT_SHA=ef1a65b2f8e9a0cbf15c24c03ff8d202620512c5
+ EXIT_CODE=0
+ '[' 0 '!=' 0 ]
+ echo -n ef1a65b2f8e9a0cbf15c24c03ff8d202620512c5
+ exit 1
+ clean
+ rm -f /tmp/.mm.ZYnuZH

Can you try getting the latest catalog/master, running this on your cluster, and seeing if it reproduces for you?

% ./test/run-test.sh docker-build 0.1

It seems to show up only randomly in CI.

@piyush-garg
Contributor

piyush-garg commented Aug 11, 2020

It just happened in CI for me:

unable to retrieve container logs for docker://ef392ddf1e7d9f858c50d4aebbaf77fe24e0edb4fbd8455c5eabc4dbf7ffbff4
+ for pod in $(kubectl get pod -o name -n ${tns})
+ kubectl logs --all-containers -n docker-build-0-1 pod/docker-test-pipeline-run-docker-build-pcqrz-pod-n6h28
+ docker build --no-cache -f ./Dockerfile -t localhost:5000/nocode .
unable to resolve docker endpoint: open /certs/client/ca.pem: no such file or directory
2020/08/11 13:50:50 Skipping step because a previous step failed
time="2020-08-11T13:50:49.643974068Z" level=warning msg="No HTTP secret provided - generated random secret. This may cause problems with uploads if multiple registries are behind a load-balancer. To provide a shared secret, fill in http.secret in the configuration file or set the REGISTRY_HTTP_SECRET environment variable." go.version=go1.11.2 instance.id=cfa8d05d-3511-40c4-8a7c-d547472d67a9 service=registry version=v2.7.1 
time="2020-08-11T13:50:49.644166722Z" level=info msg="Starting upload purge in 51m0s" go.version=go1.11.2 instance.id=cfa8d05d-3511-40c4-8a7c-d547472d67a9 service=registry version=v2.7.1 
time="2020-08-11T13:50:49.644181322Z" level=info msg="redis not configured" go.version=go1.11.2 instance.id=cfa8d05d-3511-40c4-8a7c-d547472d67a9 service=registry version=v2.7.1 
time="2020-08-11T13:50:49.656257876Z" level=info msg="using inmemory blob descriptor cache" go.version=go1.11.2 instance.id=cfa8d05d-3511-40c4-8a7c-d547472d67a9 service=registry version=v2.7.1 
time="2020-08-11T13:50:49.656498312Z" level=info msg="listening on [::]:5000" go.version=go1.11.2 instance.id=cfa8d05d-3511-40c4-8a7c-d547472d67a9 service=registry version=v2.7.1

CI Job Run Link

@chmouel
Member Author

chmouel commented Aug 11, 2020

We are going to disable the test in CI until we figure it out.

@piyush-garg
Contributor

Raised PR #487 to disable the test in CI.

@popcor255
Member

popcor255 commented Aug 11, 2020

Hey @chmouel @piyush-garg @vinamra28. Thanks for getting back to me. I would not have been able to recreate the error without your help. 🎉 I did some debugging and found something really interesting.

The difference between running the script manually and running it through the run-test.sh script:

error: a container name must be specified for pod docker-test-pipeline-run-docker-build-zrbgw-pod-gvbqv, choose one of: 
+ [step-docker-build step-docker-push sidecar-registry] 
- [step-docker-build step-docker-push sidecar-server]
 or one of the init containers: [place-scripts working-dir-initializer place-tools]

If you look closely at the containers, sidecar-server is now sidecar-registry.
The pre-apply script in the test calls add_sidecar_registry.
It looks like the function add_sidecar_registry hijacks the sidecar.
The pre-apply script ends up calling line 31 of e2e-common.sh, shown here prettified:

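import yaml

# Read the Task manifest from stdin (file descriptor 0).
f = open(0, encoding="utf-8")
data = yaml.load(f.read(), Loader=yaml.FullLoader)
# Bug: this assignment replaces every sidecar the Task already defines.
data["spec"]["sidecars"] = [{
    "image": "registry",
    "name": "registry",
}]
print(yaml.dump(data, default_flow_style=False))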

This script does not just add the registry sidecar; it overwrites the Task's existing sidecars.
This one-liner can be adjusted to fix this bug. #491

- cat ${TMPF}.read | python3 -c 'import yaml;f=open(0, encoding="utf-8"); data=yaml.load(f.read(), Loader=yaml.FullLoader);data["spec"]["sidecars"]=[{"image":"registry", "name": "registry"}];print(yaml.dump(data, default_flow_style=False));' > ${TMPF}
+ cat ${TMPF}.read | python3 -c 'import yaml;f=open(0, encoding="utf-8"); data=yaml.load(f.read(), Loader=yaml.FullLoader);sidecars=data["spec"].get("sidecars"); sidecars = [[], data["spec"].get("sidecars")][sidecars != None]; sidecars.append({"image":"registry", "name": "registry"}); data["spec"]["sidecars"]=sidecars;print(yaml.dump(data, default_flow_style=False));' > ${TMPF}
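
For readability, the append logic in the `+` line can be sketched as a small function. This is only a sketch that works on an already-parsed Task dict (for example the result of `yaml.load`), not the code from #491. The function name mirrors the shell function `add_sidecar_registry` mentioned above, and the sample Task, including its image tag, is made up for illustration.

```python
import copy


def add_sidecar_registry(task):
    """Return a copy of `task` with a registry sidecar appended.

    Existing sidecars (such as the dind `server` sidecar) are preserved.
    """
    result = copy.deepcopy(task)
    spec = result.setdefault("spec", {})
    # `or []` covers both a missing key and an explicit `sidecars: null`.
    sidecars = list(spec.get("sidecars") or [])
    sidecars.append({"image": "registry", "name": "registry"})
    spec["sidecars"] = sidecars
    return result


if __name__ == "__main__":
    # Hypothetical, heavily trimmed Task; the image tag is illustrative.
    task = {"spec": {"sidecars": [{"name": "server", "image": "docker:dind"}]}}
    patched = add_sidecar_registry(task)
    print([s["name"] for s in patched["spec"]["sidecars"]])  # ['server', 'registry']
```

Working on a copy keeps the input untouched, which makes the behaviour easy to compare against the overwriting version.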

As @imjasonh mentioned, the sidecar-server generates the cert.

This is unexpected, since the Task's sidecar shouldn't report as ready until the certs are generated and available:

readinessProbe:
  periodSeconds: 1
  exec:
    command: ['ls', '/certs/client/ca.pem']

TL;DR: the dind sidecar container is never run when the Task goes through the test script.
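
One way to catch this class of problem early would be a sanity check that compares sidecar names before and after the test harness mutates a Task. The sketch below is hypothetical: nothing in this thread suggests such a check exists in the catalog scripts, and `sidecar_names`, `dropped_sidecars`, and the sample dicts are illustrative names only.

```python
def sidecar_names(task):
    """Return the sidecar names declared in a parsed Task dict, in order."""
    return [s.get("name") for s in (task.get("spec", {}).get("sidecars") or [])]


def dropped_sidecars(before, after):
    """Return names of sidecars present in `before` but missing from `after`."""
    remaining = set(sidecar_names(after))
    return [name for name in sidecar_names(before) if name not in remaining]


if __name__ == "__main__":
    original = {"spec": {"sidecars": [{"name": "server"}]}}
    # What the overwriting one-liner produces: only the registry is left.
    overwritten = {"spec": {"sidecars": [{"name": "registry"}]}}
    print(dropped_sidecars(original, overwritten))  # ['server']
```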

popcor255 added a commit to popcor255/catalog that referenced this issue Aug 12, 2020
The end to end test attaches a sidecar with a image registry for tasks to push into to. This task is optional and is encouraged to be used for testing. However, the function add_sidecar_registry sets the sidecar registry instead of appending it. Allowing the sidecar-registry to run alongside with other tasks will fix this bug. tektoncd#481
popcor255 added a commit to popcor255/catalog that referenced this issue Aug 12, 2020
…ecar

The end to end test attaches a sidecar with a image registry for tasks to push into to. This task is optional and is encouraged to be used for testing. However, the function add_sidecar_registry sets the sidecar registry instead of appending it. Allowing the sidecar-registry to run alongside with other tasks will fix this bug. Also, refactored the python script that manipulate json and yaml payloads in order to increase readability and improve local dev environment. tektoncd#481
@popcor255
Member

/assign

@afrittoli
Member

@vdemeester FYI
@popcor255 would you mind proposing a fix to avoid overriding existing sidecars?

popcor255 added a commit to popcor255/catalog that referenced this issue Aug 14, 2020
The end to end test attaches a sidecar with a image registry for tasks to push into to. However, the function add_sidecar_registry sets the sidecar registry instead of appending it. Removing the function from test to remove this bug. There is a local registry being added to the test that is deployed with deployment and svc. The svc is referenced during the test instead of the sidecar. tektoncd#481
@popcor255
Member

popcor255 commented Aug 14, 2020

@afrittoli I wrote a PR to fix this bug specifically. There is some discussion about having one shared local repo (#491). However, that will need some discussion on image names/tags, because if they are not consistent across all tasks, the tests will break.

popcor255 added a commit to popcor255/catalog that referenced this issue Aug 24, 2020
The end to end test attaches a sidecar with a image registry for tasks to push into to. However, the function add_sidecar_registry sets the sidecar registry instead of appending it. Removing the function from test to remove this bug. There is a local registry being added to the test that is deployed with deployment and svc. The svc is referenced during the test instead of the sidecar. tektoncd#481
7 participants