
delete csi node didn't remove the csi plugin socket file #32

Closed
Colstuwjx opened this issue Sep 29, 2020 · 1 comment

Comments

@Colstuwjx
Contributor

After applying csi-node.yaml, the kubelet CSI plugin socket is created at this path: /var/lib/kubelet/plugins/com.tencent.cloud.csi.cbs/csi.sock.

However, I found that running kubectl delete -f csi-node.yaml does not delete this socket file. We do have a pre-stop hook:

...
          lifecycle:
            preStop:
              exec:
                command: [
                  "/bin/sh", "-c",
                  "rm -rf /registration/com.tencent.cloud.csi.cbs \
                  /registration/com.tencent.cloud.csi.cbs-reg.sock"
                ]
...

But this hook only targets /registration, so it does not remove the CSI plugin socket file, which is deployed at /var/lib/kubelet/plugins/com.tencent.cloud.csi.cbs/csi.sock and mounted as /csi inside the container, not under /registration.

We may need to confirm this issue. BTW, when I deleted the CSI node, I saw the following error logs:

I0928 18:02:13.538955    1848 handlers.go:62] Exec lifecycle hook ([/bin/sh -c rm -rf /registration/com.tencent.cloud.csi.cbs /registration/com.tencent.cloud.csi.cbs-reg.sock]) for Container "node-driver-registrar" in Pod "csi-tencentcloud-ht5td_kube-system(459b3f7b-a51d-42fc-b21f-e854ad278e13)" failed - error: command '/bin/sh -c rm -rf /registration/com.tencent.cloud.csi.cbs /registration/com.tencent.cloud.csi.cbs-reg.sock' exited with 126: , message: "OCI runtime exec failed: exec failed: container_linux.go:346: starting container process caused \"exec: \\\"/bin/sh\\\": stat /bin/sh: no such file or directory\": unknown\r\n"
W0928 20:25:03.036341    9114 clientconn.go:1251] grpc: addrConn.createTransport failed to connect to {/var/lib/kubelet/plugins/com.tencent.cloud.csi.cbs/csi.sock 0  <nil>}. Err :connection error: desc = "transport: Error while dialing dial unix /var/lib/kubelet/plugins/com.tencent.cloud.csi.cbs/csi.sock: connect: no such file or directory". Reconnecting...
W0928 20:25:03.036491    9114 clientconn.go:1251] grpc: addrConn.createTransport failed to connect to {/var/lib/kubelet/plugins/com.tencent.cloud.csi.cbs/csi.sock 0  <nil>}. Err :connection error: desc = "transport: Error while dialing dial unix /var/lib/kubelet/plugins/com.tencent.cloud.csi.cbs/csi.sock: connect: no such file or directory". Reconnecting...
W0928 20:25:03.036526    9114 asm_amd64.s:1337] Failed to dial /var/lib/kubelet/plugins/com.tencent.cloud.csi.cbs/csi.sock: context canceled; please retry.
E0928 18:34:35.285880    9361 goroutinemap.go:150] Operation for "/var/lib/kubelet/plugins/com.tencent.cloud.csi.cbs/csi.sock" failed. No retries permitted until 2020-09-28 18:36:37.285855605 +0800 CST m=+10675.294610487 (durationBeforeRetry 2m2s). Error: "RegisterPlugin error -- dial failed at socket /var/lib/kubelet/plugins/com.tencent.cloud.csi.cbs/csi.sock, err: failed to dial socket /var/lib/kubelet/plugins/com.tencent.cloud.csi.cbs/csi.sock, err: context deadline exceeded"

PTAL, thanks!

@Colstuwjx
Contributor Author

Ref: k-csi/node-driver-registrar issue 81. There is already a fix PR: k-csi/node-driver-registrar PR 61.
