
kubernetes ignores /root/.docker/config.json #45487

Closed
jeroenjacobs79 opened this issue May 8, 2017 · 33 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature. sig/node Categorizes an issue or PR as relevant to SIG Node.

Comments

@jeroenjacobs79

I have Docker set up with a private registry; the credentials are stored in /root/.docker/config.json.

Pulling images manually with "docker pull" works just fine, no issues there.

However, when images are pulled by Kubernetes, it complains that it is unable to authenticate.

Judging from the documentation here (https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/), this should work as my setup is the first one that is mentioned in the doc.

If any more steps are necessary, please make that clear in the docs. Talking to private registries is a must in any real enterprise deployment.

Running Kubernetes 1.6.2 on CentOS 7, configured via kubeadm, btw.

@jeroenjacobs79
Author

Some logs:

May  8 14:04:08 build01-k8s-001 dockerd-current: time="2017-05-08T14:04:08.803436042+02:00" level=error msg="Handler for GET /v1.24/images/ubuntu:latest/json returned error: No such image: ubuntu:latest"
May  8 14:04:08 build01-k8s-001 dockerd-current: time="2017-05-08T14:04:08.806226268+02:00" level=info msg="{Action=create, LoginUID=4294967295, PID=721}"
May  8 14:04:08 build01-k8s-001 dockerd-current: time="2017-05-08T14:04:08.897475377+02:00" level=error msg="Attempting next endpoint for pull after error: Get https://nexus-docker.mydomain.be/v2/ubuntu/manifests/latest: no basic auth credentials"
May  8 14:04:09 build01-k8s-001 kubelet: E0508 14:04:09.089424     721 remote_image.go:102] PullImage "ubuntu:latest" from image service failed: rpc error: code = 2 desc = unauthorized: authentication required
May  8 14:04:09 build01-k8s-001 kubelet: E0508 14:04:09.091168     721 kuberuntime_image.go:50] Pull image "ubuntu:latest" failed: rpc error: code = 2 desc = unauthorized: authentication required
May  8 14:04:09 build01-k8s-001 kubelet: E0508 14:04:09.091252     721 kuberuntime_manager.go:719] container start failed: ErrImagePull: rpc error: code = 2 desc = unauthorized: authentication required
May  8 14:04:09 build01-k8s-001 dockerd-current: time="2017-05-08T14:04:09.088683280+02:00" level=error msg="Not continuing with pull after error: unauthorized: authentication required"
May  8 14:04:09 build01-k8s-001 kubelet: W0508 14:04:09.099724     721 kuberuntime_container.go:150] Non-root verification doesn't support non-numeric user (jenkins)

What is that "Non-root verification doesn't support non-numeric user (jenkins)" stuff ??

@jeroenjacobs79
Author

Okay, this is madness... The documentation suggests that this file should be in $HOME/.docker/config.json. Since Docker runs as root, I assume this means /root/.docker/config.json.

Then I stumbled on this issue: #12835. That suggests /var/lib/kubelet as the parent directory, so I assumed this means /var/lib/kubelet/.docker/config.json.

However, it seems the kubelet set up by kubeadm simply has root (/) as its working dir, so I tried putting it in /.docker/config.json as well.

Restarted kubelet multiple times, still no luck.

@krousey krousey added the sig/node Categorizes an issue or PR as relevant to SIG Node. label May 8, 2017
@jeroenjacobs79
Author

Okay, I did some more troubleshooting, and I think part of the problem is my particular setup. Still, I would like to know whether this behaviour is expected.

Some background info: I'm using Nexus as my Docker registry, which is also configured as a proxy for the official Docker repository.

On my K8s node (running CentOS 7), I have configured Docker like this:

# If you want to add your own registry to be used for docker search and docker
# pull use the ADD_REGISTRY option to list a set of registries, each prepended
# with --add-registry flag. The first registry added will be the first registry
# searched.
# ADD_REGISTRY='--add-registry registry.access.redhat.com'
ADD_REGISTRY='--add-registry nexus-docker.mydomain.be'

# If you want to block registries from being used, uncomment the BLOCK_REGISTRY
# option and give it a set of registries, each prepended with --block-registry
# flag. For example adding docker.io will stop users from downloading images
# from docker.io
BLOCK_REGISTRY='--block-registry docker.io'

This blocks the official repo, and forces docker to use the nexus proxy. Credentials are configured, and this works correctly for "docker pull" commands. Example: when I do "docker pull ubuntu:latest", I see it connects to my nexus proxy and pulls the image, using the credentials supplied in the config.json file.

However, when I specify "ubuntu:latest" in my yml for kubelet, it connects to nexus-docker.mydomain.be but the pull fails and it complains about authentication being required. When I specify "nexus-docker.mydomain.be/ubuntu:latest" kubelet pulls the image just fine and uses the configured docker credentials.

So it seems kubelet connects to the correct registry, but just ignores the credentials when no server is specified in the image-name.

@djmckinley

I am having the same issue with Kubernetes 1.6.2. Previously using 1.5.1 and everything worked with the credentials in /root/.docker/config.json. I was able to get it to work with Kubernetes 1.6.2 only by setting the deprecated flag when starting Kubelet: --enable-cri=false. So, somehow, the new CRI method of reading images is different - but I can find no documentation saying how to load static pods from a private registry with CRI enabled.

@djmckinley

djmckinley commented May 11, 2017

To be clear, in my case, the private registry server is specified in the image name, but the problem is the same as described by jeroenjacobs1205 - which is that the credentials in /root/.docker/config.json do not seem to be used properly when CRI is enabled.

@yujuhong
Contributor

So it seems kubelet connects to the correct registry, but just ignores the credentials when no server is specified in the image-name.

@jeroenjacobs1205 kubelet looks up the credentials before calling docker to pull images. Most likely (i.e., I did not verify) things went down this way:

  • kubelet looks up credentials for ubuntu:latest
  • Since no registry is specified, the lookup returns empty.
  • kubelet calls docker to pull images without credentials.
  • docker tries to pull the image from nexus-docker.mydomain.be due to the configuration.
  • The pull request fails because no credentials are given.

If you don't specify the registry in the image string and configure docker to behave differently, kubelet will not be able to get the right credential to use.
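
To make the lookup concrete, here is a rough shell sketch (an illustration only, not the actual pkg/credentialprovider code; it assumes jq is installed and uses the paths and registry names from this thread) of the decision kubelet effectively makes before calling docker:

#!/usr/bin/env bash
# Illustration only: emulate the registry-keyed credential lookup kubelet does
# before asking docker to pull. Not real kubelet code; requires jq.
image="ubuntu:latest"   # unqualified image name from the pod spec

# kubelet parses the image string itself; with no registry prefix it assumes
# Docker Hub and ignores any --add-registry configured in the docker daemon.
case "$image" in
  */*) registry="${image%%/*}" ;;     # e.g. nexus-docker.mydomain.be/ubuntu:latest
  *)   registry="index.docker.io" ;;  # unqualified -> Docker Hub
esac

# config.json only has an entry for nexus-docker.mydomain.be, so this prints
# "no credentials found", and kubelet asks docker to pull without auth.
jq -r --arg r "$registry" '.auths[$r].auth // "no credentials found"' /root/.docker/config.json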

@yujuhong
Contributor

To be clear, in my case, the private registry server is specified in the image name, but the problem is the same as described by jeroenjacobs1205 - which is that the credentials in /root/.docker/config.json do not seem to be used properly when CRI is enabled.

Both the CRI and non-CRI implementations use the exact same code (pkg/credentialprovider) to get the credentials, and that code was not modified in 1.6. Could you try again and see whether disabling CRI could solve the problem? Just wanted to make sure that this can be consistently reproduced.

@djmckinley

djmckinley commented May 12, 2017

Yes it is very reproducible. Here are more details:

I removed the "--enable-cri=false" on the kubelet command line, and it once again failed - with these log messages:

May 12 02:00:06 ost-djm1-master4qre kubelet[3949]: E0512 02:00:06.178501    3949 remote_runtime.go:86] RunPodSandbox from runtime service failed: rpc error: code = 2 desc = unable to pull sandbox image "cgbudockerdev2.us.oracle.com:7344/ext/pause-amd64:3.0": unauthorized: authentication required
May 12 02:00:06 ost-djm1-master4qre kubelet[3949]: E0512 02:00:06.178578    3949 kuberuntime_sandbox.go:54] CreatePodSandbox for pod "kube-monitor-apiserver-ost-djm1-master4qre_kube-system(5cbf6e8c80d637d4e5f4f484f51a9761)" failed: rpc error: code = 2 desc = unable to pull sandbox image "cgbudockerdev2.us.oracle.com:7344/ext/pause-amd64:3.0": unauthorized: authentication required
May 12 02:00:06 ost-djm1-master4qre kubelet[3949]: E0512 02:00:06.178594    3949 kuberuntime_manager.go:619] createPodSandbox for pod "kube-monitor-apiserver-ost-djm1-master4qre_kube-system(5cbf6e8c80d637d4e5f4f484f51a9761)" failed: rpc error: code = 2 desc = unable to pull sandbox image "cgbudockerdev2.us.oracle.com:7344/ext/pause-amd64:3.0": unauthorized: authentication required
May 12 02:00:06 ost-djm1-master4qre kubelet[3949]: E0512 02:00:06.178701    3949 pod_workers.go:182] Error syncing pod 5cbf6e8c80d637d4e5f4f484f51a9761 ("kube-monitor-apiserver-ost-djm1-master4qre_kube-system(5cbf6e8c80d637d4e5f4f484f51a9761)"), skipping: failed to "CreatePodSandbox" for "kube-monitor-apiserver-ost-djm1-master4qre_kube-system(5cbf6e8c80d637d4e5f4f484f51a9761)" with CreatePodSandboxError: "CreatePodSandbox for pod \"kube-monitor-apiserver-ost-djm1-master4qre_kube-system(5cbf6e8c80d637d4e5f4f484f51a9761)\" failed: rpc error: code = 2 desc = unable to pull sandbox image \"cgbudockerdev2.us.oracle.com:7344/ext/pause-amd64:3.0\": unauthorized: authentication required"

The kubelet was itself running via a systemd unit:

[Unit]
Description=Kubelet service
Requires=docker.service
After=docker.service
[Service]
Environment=HOME=/root
ExecStartPre=/usr/local/bin/create-certs
ExecStartPre=/usr/local/bin/install-kube-binaries
ExecStart=/usr/local/bin/kubelet --pod-manifest-path=/etc/kubernetes/manifests --kubeconfig=/var/lib/kubernetes/kubeconfig --require-kubeconfig --allow-privileged=true --pod-infra-container-image=cgbudockerdev2.us.oracle.com:7344/ext/pause-amd64:3.0 --cloud-provider= --cluster-dns=172.31.53.53 --cluster-domain=occloud --node-status-update-frequency=10s --network-plugin=cni --cni-conf-dir=/etc/cni/net.d --v=0
Restart=always
[Install]
WantedBy=multi-user.target

I edited this to change the ExecStart statement to (no other changes):

ExecStart=/usr/local/bin/kubelet --enable-cri=false --pod-manifest-path=/etc/kubernetes/manifests --kubeconfig=/var/lib/kubernetes/kubeconfig --require-kubeconfig --allow-privileged=true --pod-infra-container-image=cgbudockerdev2.us.oracle.com:7344/ext/pause-amd64:3.0 --cloud-provider= --cluster-dns=172.31.53.53 --cluster-domain=occloud --node-status-update-frequency=10s --network-plugin=cni --cni-conf-dir=/etc/cni/net.d --v=0

and did "systemctl daemon-reload; systemctl restart kubelet".

After that, it then loaded the static pods in /etc/kubernetes/manifests, and brought everything up.

I've switched back and forth on the --enable-cri=false option several times now. It always fails without it (i.e., CRI enabled by default), and always succeeds when it is included, to disable CRI.

The specific static pod it was trying to load first was defined like this in /etc/kubernetes/manifests/kube-monitor-apiserver.manifest:

apiVersion: v1
kind: Pod
metadata:
  name: kube-monitor-apiserver
  namespace: kube-system
spec:
  hostNetwork: true
  containers:
  - name: kube-support
    image: cgbudockerdev2.us.oracle.com:7344/dmckinle/kube-support:1.3
    command:
    - /bin/bash
    - -c
    - /kube-support/kube-monitor-apiserver AD1
    securityContext:
      privileged: true
    volumeMounts:
    - name: kubeconfig
      mountPath: /var/lib/kubernetes/kubeconfig
      readOnly: true
    - name: kubecerts
      mountPath: /var/run/kubernetes
      readOnly: true
    - name: bin
      mountPath: /usr/local/bin
      readOnly: true
  volumes:
  - name: kubeconfig
    hostPath:
      path: /var/lib/kubernetes/kubeconfig
  - name: kubecerts
    hostPath:
      path: /var/run/kubernetes
  - name: bin
    hostPath:
      path: /usr/local/bin

Note the HOME=/root environment variable set in the systemd unit shown above. The /root/.docker/config.json file is:

{
        "auths": {
                "dockerdev2.us.oracle.com:7344": {
                        "auth": "bXNwYXJjaDpjbG91ZHkz"
                }
        }
}

These are obviously the correct credentials, because it works as soon as I disable CRI.

@jeroenjacobs79
Author

@yujuhong

I think my Docker is configured correctly. When I pull the image from the command line (e.g. "docker pull ubuntu:latest") this works fine, and the credentials are being used.

So it only happens when Kubelet pulls the image.

@yujuhong
Contributor

I think my Docker is configured correctly. When I pull the image from the command line (e.g. "docker pull ubuntu:latest") this works fine, and the credentials are being used.

@jeroenjacobs1205, I understand that. My #45487 (comment) explained why that's the case.
When you use kubelet to pull the image, kubelet looks up the credential by itself first, without the knowledge that you configured docker to use a different registry. It naively gets the credential for the wrong registry and passes it to docker.

@yujuhong
Contributor

@djmckinley, your case is different. The image you tried to pull is a pod sandbox image (previously known as the pod infra container image). With CRI, this image is considered an implementation detail of the runtime, and we do not reuse the credential package kubelet uses. I'll file an issue to support reading the docker config in the CRI implementation.

Pulling images for user containers should work though.

@djmckinley

@yujuhong - it works if I pre-pull just the "pause" pod sandbox image from the registry prior to starting kubelet, confirming your diagnosis. Thanks for opening #45738 to resolve this issue the right way.
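
For anyone hitting the same sandbox-image failure before #45738 lands, the pre-pull workaround mentioned above amounts to running something like the following on each node, as a user whose docker config holds the registry credentials (image name taken from the logs earlier in this thread):

# Pre-pull the pod sandbox ("pause") image so kubelet never has to fetch it itself.
docker pull cgbudockerdev2.us.oracle.com:7344/ext/pause-amd64:3.0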

@yujuhong yujuhong added this to the v1.7 milestone May 23, 2017
@yujuhong yujuhong added the kind/bug Categorizes issue or PR as related to a bug. label May 23, 2017
@runcom
Contributor

runcom commented May 25, 2017

I know how to fix this as it's specific to the Docker package we build for RHEL/Fedora/CentOS. It just requires a Docker API call to gather additional registries and resolve the correct credentials for the unqualified image. Please feel free to assign this to me.

@yujuhong
Contributor

I know how to fix this as it's specific to the Docker package we build for RHEL/Fedora/CentOS. It just requires a Docker API call to gather additional registries and resolve the correct credentials for the unqualified image. Please feel free to assign this to me.

@runcom this is a general regression that affects all platforms; why do you think it's specific to the docker package you built for RHEL/Fedora/CentOS?

What API call would you use to gather the credentials? AFAIK, the docker configs are read by the docker CLI, and not maintained by the API.

@runcom
Contributor

runcom commented May 25, 2017

@yujuhong how has this worked before? Let me elaborate.

On upstream Docker, if you pull an unqualified image name you always pull from Docker Hub, like:

# this always hits Docker Hub!
$ docker pull ubuntu

On RHEL/Fedora/CentOS you have the ability to add additional registries for docker to use when pulling images. Let me show you an example:

# docker has been started with
# --add-registry=mydomain.com:5000 --add-registry=another.net:8080
docker pull ubuntu
# the action above, will:
#   1. try docker pull mydomain.com:5000/ubuntu
#   2. if the above fails, try with another.net:8080/ubuntu
#   3. if that also fails, fall back to docker.io/ubuntu (same behavior as upstream)

In the issue in question, if your pod defines an unqualified image (e.g. ubuntu), you have --add-registry=mydomain.com:5000, and mydomain.com:5000 requires authentication (thus, you're meant to pull mydomain.com:5000/ubuntu!!!), the current keyring can't resolve authentication for mydomain.com:5000 because it was asked to look up just ubuntu, which resolves to auth for the docker.io repo (failing, of course, and falling back to an unauthenticated pull).

What the keyring should do instead is gather the --add-registry list from the docker daemon (this functionality exists only in the RHEL/Fedora/CentOS package) and qualify the unqualified ubuntu image with the first additional registry returned by the docker daemon, thus resolving the correct auth.

Therefore, @yujuhong I can't see how this has ever worked 😕
I just opened this #46466 TOTALLY WIP BUT WORKS FINE to better explain, with code, what I mean :)

@yujuhong yujuhong removed this from the v1.7 milestone May 25, 2017
@yujuhong
Contributor

The regression reported by @djmckinley is handled in #45738.

Removing the milestone and leaving this for the original issue (reported by @jeroenjacobs1205). I believe we've never supported that, so marking this a feature request instead.

@yujuhong yujuhong added kind/feature Categorizes issue or PR as related to a new feature. and removed kind/bug Categorizes issue or PR as related to a bug. labels May 25, 2017
@alindeman
Contributor

alindeman commented Jun 29, 2017

I dug into this a bit, and the issue for me was that the HOME environment variable was empty when kubelet was launched through a systemd unit. While it's not documented this way, loading the configuration from /root/.docker/config.json or /root/.dockercfg requires that HOME=/root.

Setting User=root in the [Service] declaration fixed it up for me.
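
One way to apply this without editing the main unit file is a systemd drop-in; the directory and file name below are assumptions based on a typical kubeadm install, so adjust them to your setup:

# Add User=root for the kubelet service via a drop-in, then reload and restart.
mkdir -p /etc/systemd/system/kubelet.service.d
cat <<'EOF' > /etc/systemd/system/kubelet.service.d/20-kubelet-userroot.conf
[Service]
User=root
EOF
systemctl daemon-reload
systemctl restart kubelet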

@jolestar
Contributor

jolestar commented Aug 10, 2017

@alindeman's suggestion is right; alternatively, create a link:

ln -s /root/.docker /var/lib/kubelet/.docker

@himikof

himikof commented Aug 14, 2017

Actually, reading the credential provider code, it seems that kubelet searches for the following files:

  • .dockercfg: ${rootDirectory}/.dockercfg, $PWD/.dockercfg, $HOME/.dockercfg, /.dockercfg
  • .docker/config.json: ${rootDirectory}/config.json, $PWD/config.json, $HOME/.docker/config.json, /.docker/config.json (note the missing .docker directory)

rootDirectory is usually /var/lib/kubelet. Additionally, when using hyperkube, only the /var/lib/kubelet location among these is mounted from the host. The current situation seems confusing and is likely unintended.
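
Based on that search order, a minimal workaround (assuming the default kubelet root directory of /var/lib/kubelet) is to put the file directly into the kubelet root directory; note that it is config.json there, not .docker/config.json:

# Copy the docker credentials to the first location the credential provider checks.
cp /root/.docker/config.json /var/lib/kubelet/config.json
systemctl restart kubelet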

k8s-github-robot pushed a commit to kubernetes/kops that referenced this issue Aug 28, 2017
…stemd_unit

Automatic merge from submit-queue

Fixes reading /root/.docker/config.json on debian

Debian and probably others apparently don't automatically default to using the root account if it's not specified.

ref: kubernetes/kubernetes#45487 (comment)
@ksanghavi

ksanghavi commented Oct 18, 2017

From the logs, it searches the paths below for config.json/.dockercfg. I'm running K8s v1.7.

Oct 18 18:51:08  kubelet: I1018 18:51:08.423296   17239 config.go:131] looking for config.json at /var/lib/kubelet/config.json
Oct 18 18:51:08  kubelet: I1018 18:51:08.423324   17239 config.go:131] looking for config.json at /config.json
Oct 18 18:51:08  kubelet: I1018 18:51:08.423332   17239 config.go:131] looking for config.json at /.docker/config.json
Oct 18 18:51:08  kubelet: I1018 18:51:08.423337   17239 config.go:131] looking for config.json at /.docker/config.json
Oct 18 18:51:08  kubelet: I1018 18:51:08.423345   17239 config.go:101] looking for .dockercfg at /var/lib/kubelet/.dockercfg
Oct 18 18:51:08  kubelet: I1018 18:51:08.423353   17239 config.go:101] looking for .dockercfg at /.dockercfg
Oct 18 18:51:08 kubelet: I1018 18:51:08.423359   17239 config.go:101] looking for .dockercfg at /.dockercfg
Oct 18 18:51:08  kubelet: I1018 18:51:08.423364   17239 config.go:101] looking for .dockercfg at /.dockercfg

@ieugen

ieugen commented Nov 9, 2017

I'm adding my feedback here as it might help.
In my /root/.docker/config.json file I had two auth entries, each for a different private registry (hosted by Nexus: one plain registry for pushes and one group that proxied docker.io).

When I created regsecret as per instructions [1], deployment did not work.

I finally managed to fix it by removing the registry credentials with docker logout registry.name for the registry I was not using.

I believe kubelet does not know how to work with multiple credentials in docker.

[1] https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/
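
For reference, the per-pod approach from [1] is an image pull secret; a minimal sketch (the secret name, server, and credentials here are placeholders) looks like this, with the secret then referenced from the pod spec via imagePullSecrets:

# Create a docker-registry secret; pods reference it via imagePullSecrets.
kubectl create secret docker-registry regsecret \
  --docker-server=nexus-docker.mydomain.be \
  --docker-username=myuser \
  --docker-password=mypassword \
  --docker-email=me@mydomain.be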

@zcalusic

zcalusic commented Jan 2, 2018

Same issue here on Kubernetes v1.9.0 built with Kubeadm. Spent a lot of time trying to figure out why it's not working. I tried all examples from the official documentation, setting up /root/.docker/config.json properly, using Kubernetes secrets... all to no avail.

Thanks to @alindeman's comment above, I was able to fix it by adding a kubelet systemd snippet, reloading the systemd configuration, and restarting kubelet on all nodes. Now it finally works as intended: if docker pull can fetch the image, Kubernetes can too.

This is a serious issue which will bite lots of people; it's a shame it is taking so long to fix.

@dims
Member

dims commented Jan 2, 2018

@zcalusic fyi this was also reported in #57427 and #57273

It was fixed in #57463 and cherry picked to 1.9 branch in #57472

1.9.1 is expected today 5 PM Pacific time per:
https://groups.google.com/forum/?utm_medium=email&utm_source=footer#!msg/kubernetes-dev/GAB8dzapP2c/J95j51MuCAAJ

Thanks,
Dims

@zcalusic

zcalusic commented Jan 2, 2018

Ah, so it's actually a combination of 2 bugs that bit me.

Thanks for the info @dims, looking forward to new release. 👍

@liggitt
Member

liggitt commented Jan 4, 2018

fixed in 1.9.1, now released - https://github.com/kubernetes/kubernetes/releases/tag/v1.9.1

@liggitt liggitt closed this as completed Jan 4, 2018
@relaxdiego
Contributor

Came across this with standalone kubelet (hyperkube) v1.8.7, and the fix was to mount /root into the container using the docker run option --volume=/root:/root:ro. After restarting kubelet, it was able to download from the private registries that are in /root/.docker/config.json.

@milanfarkas

milanfarkas commented Mar 13, 2018

I'm on v1.9.4, using CentOS 7.4. Still, it didn't work for me until I applied @alindeman's suggestion to add User=root in /etc/systemd/system/kubelet.service.

@vincentwu2011

I'm on v1.9.2, Docker 1.13.1. I faced this issue too. Fixed it by adding User=root or Environment=HOME=/root in kubelet.service.

@john-tipper

Does Kubernetes support the Docker credential store? I can't get kubernetes to respect a credential store defined in config.json and I don't know if my issue is related to this one.

https://stackoverflow.com/questions/50048861/does-kubernetes-kubelet-support-docker-credential-stores-for-private-registries

@clouless

For all stumbling upon this: I have set up k8s via kubeadm 1.13.3 on an Ubuntu 18.04 host (upgraded from 1.11.x) and had problems pulling from my Nexus on my slave nodes.

What fixed it for me was as mentioned above:

(1) as root run: docker login nexus.k8s.home.mydomain:32554

  • enter username and password
  • check that file /root/.docker/config.json exists and contains entries like
{
	"auths": {
		"nexus.k8s.home.mydomain:32554": {
			"auth": "xxxxxx"
		}
	},
	"HttpHeaders": {
		"User-Agent": "Docker-Client/18.09.2 (linux)"
	}
}

(2) Check that docker can now pull images from nexus

docker pull nexus.k8s.home.mydomain:32554/repo/app:v1

(3) vim /etc/systemd/system/kubelet.service.d/10-kubeadm.conf

  • Add User=root
  • File looks kind of like this
[Service]
User=root
Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf"
Environment="KUBELET_CONFIG_ARGS=--config=/var/lib/kubelet/config.yaml"
# KUBELET_KUBEADM_ARGS variable dynamically
EnvironmentFile=-/var/lib/kubelet/kubeadm-flags.env
# KUBELET_EXTRA_ARGS should be sourced from this file.
EnvironmentFile=-/etc/default/kubelet
ExecStart=
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS

(4) Reload and restart kubelet

systemctl daemon-reload
systemctl restart kubelet


@owenliang

> (quoting @clouless's steps above in full)

great!

@KarstenSiemer

KarstenSiemer commented Feb 11, 2021

Any update on this to make this more admin friendly, especially when running in fully automated cloud infrastructures?
Currently running v1.18.9-eks-d1db3c and made it work manually on a node but need to fully automate this, since nodes roll all the time and we have dozens of them.
I did a docker login as the root user on the node and then linked it using
ln -s /root/.docker /var/lib/kubelet/.docker
Why do we actually have to add User=root to the systemd service, since the kubelet process is already running as root?
I am wondering because the file exists under /var/lib/kubelet/.docker, where kubelet should always look and should not need the env variable.
Has anyone automated this already? I'll try with cloud-init but am already afraid of timing problems...
And my final question: will this process change starting with Kubernetes 1.20, when the Docker runtime gets swapped out?

edit:
Got it going using cloud-init. We start bootstrap.sh on EKS, also via cloud-init, from a Terraform submodule which is called by another module.
This is what I was left with:

locals {
  kubelet_user = ! var.kubelet_user_enabled ? null : <<-EOD
    [Service]
    User=root
  EOD
  dockerhub_credential  = ! var.dockerhub_credentials_enabled ? null : "${var.dockerhub_user}:${var.dockerhub_token}"
  dockerhub_credentials = ! var.dockerhub_credentials_enabled ? null : <<-EOD
    {
      "auths": {
        "https://index.docker.io/v1/": {
          "auth": "${base64encode(local.dockerhub_credential)}"
        }
      }
    }
  EOD
  node_userdata = ! var.enabled ? null : <<-EOD
    #!/bin/bash
    %{ if var.kubelet_user_enabled }
    echo "${local.kubelet_user}" > "/etc/systemd/system/kubelet.service.d/20-kubelet-userroot.conf"
    %{ endif }
    %{ if var.dockerhub_credentials_enabled }
    mkdir -p "/root/.docker"
    echo '${local.dockerhub_credentials}' > "/root/.docker/config.json"
    ln -s "/root/.docker" "/var/lib/kubelet/.docker"
    %{ endif }
    /etc/eks/bootstrap.sh \
      --apiserver-endpoint "${var.cluster.endpoint}" "${var.cluster.name}" \
      --b64-cluster-ca "${var.cluster.certificate_authority.0.data}" \
      --kubelet-extra-args '${join(" ", var.kubelet_extra_args)}'
  EOD
}

Then I pass the node_userdata base64 encoded to the aws_launch_configuration like so:

resource "aws_launch_configuration" "this" {
  ...
  user_data_base64 = base64encode(local.node_userdata)
  ...
}

And the EKS nodes come up cleanly, and Kubernetes is able to pull from Docker Hub using those credentials.
I did this because of the new pull limit on Docker Hub and just wanted the "normal" Docker Hub behavior back; otherwise you either constantly run into the limit or have to specify a pullSecret everywhere, which is amazingly annoying.
If you want to copy this, be mindful to mark any variables that contain credentials as sensitive.

dermorz added a commit to dermorz/machine-controller that referenced this issue May 16, 2022
Without setting this (at least on debian) $HOME is not set when running
kubelet. Without $HOME being set the search paths for docker credentials
are not as they would be expected.

See: kubernetes/kubernetes#45487 (comment)
dermorz added a commit to dermorz/machine-controller that referenced this issue May 19, 2022
Without setting this (at least on debian) $HOME is not set when running
kubelet. Without $HOME being set the search paths for docker credentials
are not as they would be expected.

See: kubernetes/kubernetes#45487 (comment)
dermorz added a commit to dermorz/machine-controller that referenced this issue May 24, 2022
Without setting this (at least on debian) $HOME is not set when running
kubelet. Without $HOME being set the search paths for docker credentials
are not as they would be expected.

See: kubernetes/kubernetes#45487 (comment)
kubermatic-bot pushed a commit to kubermatic/machine-controller that referenced this issue May 24, 2022
* Explicitly set root user in kubelet systemd unit

Without setting this (at least on debian) $HOME is not set when running
kubelet. Without $HOME being set the search paths for docker credentials
are not as they would be expected.

See: kubernetes/kubernetes#45487 (comment)

* Adjust testdata

Adjust userdata kubelet test data

More test data adjustments

Adjust flatcar iginition testdata

* Add docker credentials support for ubuntu

Go template trimming

Only write docker auth config if CR is docker and credentials are given

* Add docker credential support to remaining distributions

* Fix typo

* Add support for SecretTypeDockerConfigJson

* Add documentation for additional supported secret type

* AuthConfig to return emptystring if registryCredentials is not set