Libvirt test e2e failing due to lack of node-role.kubernetes.io/worker label #1107

Closed
wainersm opened this issue Jun 21, 2023 · 6 comments · Fixed by #1152

Comments

@wainersm
Member

Today I ran the libvirt e2e test at the latest commit (79d309d), but it failed with:

<OMITTED>

Wait at least one worker be Ready
Wait system pods be running or completed

time="2023-06-21T15:37:22-03:00" level=info msg="Deploy the Cloud API Adaptor"
time="2023-06-21T15:37:22-03:00" level=info msg="Install the controller manager"
Wait for the cc-operator-controller-manager deployment be available
time="2023-06-21T15:38:29-03:00" level=info msg="Customize the overlay yaml file"
time="2023-06-21T15:38:32-03:00" level=info msg="Install the cloud-api-adaptor"
Wait for the cc-operator-daemon-install DaemonSet be available
F0621 15:39:32.565344   78821 env.go:369] Setup failure: timed out waiting for the condition
FAIL    github.com/confidential-containers/cloud-api-adaptor/test/e2e   795.612s
FAIL
make: *** [Makefile:91: test-e2e] Error 1

It turned out that the cc-operator-daemon-install DaemonSet never became available because the worker node did not have the node-role.kubernetes.io/worker label:

$ kubectl get nodes
NAME                   STATUS   ROLES           AGE    VERSION
peer-pods-ctlplane-0   Ready    control-plane   129m   v1.27.3
peer-pods-worker-0     Ready    <none>          127m   v1.27.3
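
For context, the ROLES column printed by kubectl is derived from `node-role.kubernetes.io/<role>` labels, so `<none>` means the node carries no such label. A minimal Go sketch of spotting such nodes in that output could look like the following. The `unlabeledNodes` helper is hypothetical (it is not part of the repository), and it assumes the default column layout shown above.

```go
package main

import (
	"fmt"
	"strings"
)

// unlabeledNodes is a hypothetical helper: it scans the text printed by
// `kubectl get nodes` and returns the names of nodes whose ROLES column
// is "<none>". It assumes the default columns NAME, STATUS, ROLES, ...
func unlabeledNodes(output string) []string {
	var names []string
	for i, line := range strings.Split(strings.TrimSpace(output), "\n") {
		fields := strings.Fields(line)
		if i == 0 || len(fields) < 3 {
			continue // skip the header row and malformed lines
		}
		if fields[2] == "<none>" {
			names = append(names, fields[0])
		}
	}
	return names
}

func main() {
	sample := `NAME                   STATUS   ROLES           AGE    VERSION
peer-pods-ctlplane-0   Ready    control-plane   129m   v1.27.3
peer-pods-worker-0     Ready    <none>          127m   v1.27.3`
	fmt.Println(unlabeledNodes(sample)) // prints [peer-pods-worker-0]
}
```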

To solve this, the node label needs to be applied after the cluster is created (see https://github.com/confidential-containers/cloud-api-adaptor/blob/main/test/provisioner/provision_libvirt.go#L86).

The Azure implementation also applies the label to the node (see https://github.com/confidential-containers/cloud-api-adaptor/blob/main/test/provisioner/provision_azure.go#L308), but it uses kubectl. It would be nice to have a Go implementation instead. That new method could be hosted in https://github.com/confidential-containers/cloud-api-adaptor/blob/main/test/e2e/common.go
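
As a rough illustration of what a Go-side replacement for the kubectl call might involve, the sketch below only builds the JSON merge patch body that would add the label to a node. It is not the code that was eventually merged. The `workerLabelPatch` name is hypothetical, and the empty label value is an assumption (it mirrors how role labels are commonly set). With client-go, such a payload could be sent through `clientset.CoreV1().Nodes().Patch` using `types.MergePatchType`; that call is left out here so the example stays standard-library only.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Label the cc-operator DaemonSet expects on worker nodes (from the issue).
const workerRoleLabel = "node-role.kubernetes.io/worker"

// workerLabelPatch is a hypothetical helper that returns a JSON merge patch
// adding the worker role label to a node's metadata. The empty label value
// is an assumption, not something stated in the issue.
func workerLabelPatch() ([]byte, error) {
	patch := map[string]interface{}{
		"metadata": map[string]interface{}{
			"labels": map[string]string{workerRoleLabel: ""},
		},
	}
	return json.Marshal(patch)
}

func main() {
	data, err := workerLabelPatch()
	if err != nil {
		panic(err)
	}
	// prints {"metadata":{"labels":{"node-role.kubernetes.io/worker":""}}}
	fmt.Println(string(data))
}
```

A merge patch leaves any labels already present on the node untouched, which is why it is used here rather than replacing the whole label map.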

How to reproduce the problem:

$ git clone https://github.com/confidential-containers/cloud-api-adaptor
$ cd cloud-api-adaptor
$ TEST_PROVISION=yes CLOUD_PROVIDER=libvirt make test-e2e
@bpradipt
Member

bpradipt commented Jun 23, 2023

Related issues
#780
confidential-containers/operator#195

@bookinabox
Contributor

I'll take a look

@bookinabox
Contributor

I'm still getting:

F0705 14:29:45.303677 52645 env.go:369] Setup failure: timed out waiting for the condition
FAIL github.com/confidential-containers/cloud-api-adaptor/test/e2e 995.619s
FAIL

after adding the worker label in createCluster. Unsure if this is an issue on my end, but will investigate further.

@wainersm
Member Author

wainersm commented Jul 5, 2023

It is not on your end; I also get this error. I actually have a fix that I am testing right now, and I will post it today.

@wainersm
Member Author

wainersm commented Jul 6, 2023

@bookinabox please have a look at #1147

@bookinabox
Contributor

bookinabox commented Jul 6, 2023

I changed the timing and it still seems to time out. How did you diagnose the problem initially? Is there more verbose logging I can dig through?

edit: looks like for me even setting the timeout to 10+ minutes still times out. I suspect I might have a different issue.

bookinabox pushed a commit to bookinabox/cloud-api-adaptor that referenced this issue Jul 18, 2023
libvirt e2e tests did not properly label worker nodes with the worker
label.

This creates a function in test/provisioner/common.go that adds the
worker label to the libvirt e2e createCluster.

Fixes: confidential-containers#1107

Signed-off-by: Derek Lee <derlee@redhat.com>
wainersm pushed a commit that referenced this issue Jul 26, 2023
bpradipt pushed a commit to bpradipt/cloud-api-adaptor that referenced this issue Aug 12, 2023
wainersm pushed a commit to wainersm/cc-cloud-api-adaptor that referenced this issue Sep 5, 2023
lysliu pushed a commit to lysliu/cloud-api-adaptor that referenced this issue Nov 9, 2023