dockershim.sock symlink should be relative #4074

Open
Nuru opened this issue Jun 26, 2024 · 4 comments
Labels: area/kubernetes (K8s including EKS, EKS-A, and including VMW), type/bug (Something isn't working)

Comments

Nuru commented Jun 26, 2024

Image I'm using:

Bottlerocket OS 1.20.2 (aws-k8s-1.29)

What I expected to happen:

I expected /run/dockershim.sock to be a valid socket.

What actually happened:

The Datadog Agent Pod mounts the host filesystem under /host and then expects to connect to the container runtime socket via /host/run/dockershim.sock. Unfortunately, /run/dockershim.sock is an absolute link to /run/containerd/containerd.sock (see #2173). Inside the Pod that target is resolved against the container's own root rather than /host, so the link is broken in the mounted filesystem.

Proposed Solution:

Make /run/dockershim.sock a relative link to ./containerd/containerd.sock instead of an absolute link.

Note that /var/run/dockershim.sock is already a relative link: ./containerd/containerd.sock
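
A minimal sketch of why the relative link survives the /host mount while the absolute one does not. Everything here is illustrative: a temporary directory stands in for the host root as seen from the Pod, an empty file stands in for containerd.sock, and the uniquely named `run-...` directory is a hypothetical substitute for /run so the absolute target cannot happen to exist on the machine running the sketch. It assumes a POSIX system.

```python
import os
import tempfile
import uuid


def compare_link_styles():
    """Return whether an absolute and a relative symlink resolve under a relocated root."""
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical stand-in for the host root as the Pod sees it (e.g. /host).
        host_root = os.path.join(tmp, "host")
        # A unique directory name plays the role of /run, so the absolute
        # target below cannot accidentally exist on the real filesystem.
        run_name = "run-" + uuid.uuid4().hex
        run_dir = os.path.join(host_root, run_name)
        os.makedirs(os.path.join(run_dir, "containerd"))

        # An empty regular file stands in for containerd.sock.
        open(os.path.join(run_dir, "containerd", "containerd.sock"), "w").close()

        # Absolute link: correct on the host itself, where run_dir is /<run_name>.
        abs_link = os.path.join(run_dir, "absolute.sock")
        os.symlink("/" + run_name + "/containerd/containerd.sock", abs_link)

        # Relative link: resolved against the directory that contains the link.
        rel_link = os.path.join(run_dir, "relative.sock")
        os.symlink("./containerd/containerd.sock", rel_link)

        # os.path.exists follows symlinks and returns False for dangling ones.
        return {
            "absolute": os.path.exists(abs_link),
            "relative": os.path.exists(rel_link),
        }


print(compare_link_styles())  # {'absolute': False, 'relative': True}
```

The absolute link is looked up from the real root, which has no such path, while the relative link is looked up from wherever its parent directory currently lives.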

How to reproduce the problem:

Deploy Datadog Helm chart 3.66.0 to EKS running Bottlerocket and configure it according to the Datadog docs with:

criSocketPath: /run/dockershim.sock

View the logs of the agent container in the datadog DaemonSet Pod and see:

CORE | ERROR | (pkg/util/containerd/containerd_util.go:109 in NewContainerdUtil) | Containerd init error: temporary failure in containerdutil, will retry later: failed to dial "/host/run/dockershim.sock": context deadline exceeded

Alternatively, use kubectl exec to run file /host/run/dockershim.sock in the agent container and see the error:

/host/run/dockershim.sock: broken symbolic link to /run/containerd/containerd.sock
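
If `file` is not available in the container, roughly the same check can be sketched in Python. The helper name `describe_link` is hypothetical, and the messages only imitate the wording of the output above; this is not how `file` itself is implemented.

```python
import os


def describe_link(path):
    """Describe a path roughly the way `file` reports symlinks."""
    if not os.path.islink(path):
        return f"{path}: not a symbolic link"
    target = os.readlink(path)
    # os.path.exists follows the link, so False here means the link dangles.
    if os.path.exists(path):
        return f"{path}: symbolic link to {target}"
    return f"{path}: broken symbolic link to {target}"


if __name__ == "__main__":
    # Path taken from the report above; run this inside the agent container.
    print(describe_link("/host/run/dockershim.sock"))
```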

Nuru added the status/needs-triage (Pending triage or re-evaluation) and type/bug (Something isn't working) labels on Jun 26, 2024.
Contributor

yeazelm commented Jun 27, 2024

Thanks for cutting this, @Nuru. Do you know if this worked in a previous version of the Helm chart? I noticed they made a recent change (DataDog/helm-charts#1352), but it probably didn't impact this. Nonetheless, I think making this link relative should work. I'll give this a shot to see if it helps and report back!

Author

Nuru commented Jun 27, 2024

> Do you know if this worked in a previous version of the helm chart?

This setting is not in the Datadog Helm chart; it is in their documentation. The relevant part of their Helm chart has not changed in 3 years.

Contributor

yeazelm commented Jun 27, 2024

I was able to try out a change that fixes the symlink issue. I don't have a working Datadog setup to confirm that this fully fixes the problem, but I can confirm the link works now:

# file /host/run/dockershim.sock
/host/run/dockershim.sock: symbolic link to ./containerd/containerd.sock

And the nodes with this relative link don't have the error message:

CORE | ERROR | (pkg/util/containerd/containerd_util.go:109 in NewContainerdUtil) | Containerd init error: temporary failure in containerdutil, will retry later: failed to dial "/host/run/dockershim.sock": context deadline exceeded

I'll get a PR cut shortly with this proposed fix.
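
For illustration only (this is not the change in the PR, and `relative_target` is a hypothetical helper), the relative target for such a link can be derived from the two absolute paths with `os.path.relpath`, assuming POSIX paths:

```python
import os


def relative_target(link_path, target_path):
    """Compute a link target relative to the directory that holds the link."""
    return os.path.relpath(target_path, start=os.path.dirname(link_path))


print(relative_target("/run/dockershim.sock", "/run/containerd/containerd.sock"))
# containerd/containerd.sock
```

The result is equivalent to the ./containerd/containerd.sock target shown above, since a leading ./ does not change how a relative link resolves.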

Contributor

yeazelm commented Jun 27, 2024

bottlerocket-os/bottlerocket-core-kit#18 should hopefully fix this issue once it is released!

yeazelm self-assigned this on Jun 27, 2024.
yeazelm added the area/kubernetes (K8s including EKS, EKS-A, and including VMW) label and removed the status/needs-triage (Pending triage or re-evaluation) label on Jun 27, 2024.