ERROR: failed to create cluster: failed to init node with kubeadm #1437

Closed
latermonk opened this issue Mar 23, 2020 · 26 comments
Labels
kind/support Categorizes issue or PR as a support question. triage/needs-information Indicates an issue needs more information in order to work on it.

Comments

@latermonk

What happened:
Running kind create cluster fails with:
ERROR: failed to create cluster: failed to init node with kubeadm: command "docker exec --privileged kind-control-plane kubeadm init --ignore-preflight-errors=all --config=/kind/kubeadm.conf --skip-token-print --v=6" failed with error: exit status 1

What you expected to happen:
The cluster is created successfully.

How to reproduce it (as minimally and precisely as possible):
Mac vagrant centos/7
Client: Docker Engine - Community
Version: 19.03.8
API version: 1.40
Go version: go1.12.17
Git commit: afacb8b
Built: Wed Mar 11 01:27:04 2020
OS/Arch: linux/amd64
Experimental: false

Server: Docker Engine - Community
Engine:
Version: 19.03.8
API version: 1.40 (minimum version 1.12)
Go version: go1.12.17
Git commit: afacb8b
Built: Wed Mar 11 01:25:42 2020
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: 1.2.13
GitCommit: 7ad184331fa3e55e52b890ea95e65ba581ae3429
runc:
Version: 1.0.0-rc10
GitCommit: dc9208a3303feef5b3839f4323d9beb36df0a9dd
docker-init:
Version: 0.18.0
GitCommit: fec3683
Anything else we need to know?:

Environment:

  • kind version: (use kind version): kind v0.7.0 go1.13.6 linux/amd64
  • Kubernetes version: (use kubectl version): Client Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.4", GitCommit:"8d8aa39598534325ad77120c120a22b3a990b5ea", GitTreeState:"clean", BuildDate:"2020-03-12T21:03:42Z", GoVersion:"go1.13.8", Compiler:"gc", Platform:"linux/amd64"}
    The connection to the server localhost:8080 was refused - did you specify the right host or port?
  • Docker version: (use docker info): 19.03.8
  • OS (e.g. from /etc/os-release): CentOS Linux release 7.6.1810 (Core)
@latermonk latermonk added the kind/bug Categorizes issue or PR as related to a bug. label Mar 23, 2020
@BenTheElder
Member

can you run kind with -v 1 when creating the cluster?

Mac vagrant centos/7

is that docker inside vagrant on mac? any reason not to use docker desktop?

do you have enough disk space in the VM? this error usually occurs when most docker things would have failed because the host is not capable.

@BenTheElder BenTheElder added kind/support Categorizes issue or PR as a support question. and removed kind/bug Categorizes issue or PR as related to a bug. labels Mar 24, 2020
@SmadusankaB

Same error

ERROR: failed to create cluster: failed to init node with kubeadm: command "docker exec --privileged kind-control-plane kubeadm init --ignore-preflight-errors=all --config=/kind/kubeadm.conf --skip-token-print --v=6" failed with error: exit status 1

@BenTheElder
Member

I'm going to need more information about your failure and your host environment to know why it isn't working for you but common reasons are:

  • using an unsupported version of docker (I mean supported by docker upstream... some of the older versions etc. have issues)
  • not having enough disk / memory / CPU available to docker to run a kubernetes node (generally disk, occasionally memory, you need something like 600MB memory for one node and at least 2gb disk space to pull the node image and have some space to do things after)
  • using a filesystem that doesn't work out of the box with docker in docker (see: kind doesn't work on btrfs #1416 (comment))
  • using a host environment that cannot nest containers like this, e.g. crostini (see: Can't run KIND on ChromeOS Linux VM #763)

please check the known issues
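
The rough figures in the list above (about 600MB of memory per node, at least 2GB of free disk) can be turned into a quick pre-flight check. This is only a sketch: check_kind_resources is a hypothetical helper, not part of kind, and its thresholds are taken from the comment above rather than from any official requirement.

```shell
#!/bin/sh
# Hypothetical helper (not part of kind): compares the resources available to
# Docker with the rough minimums quoted above.
# Usage: check_kind_resources <memory_bytes> <free_disk_kb> [node_count]
check_kind_resources() {
  mem_bytes=$1
  disk_kb=$2
  nodes=${3:-1}
  need_mem=$((600 * 1024 * 1024 * nodes))   # ~600MB per node, in bytes
  need_disk_kb=$((2 * 1024 * 1024))         # 2GB, expressed in KB
  status=0
  if [ "$mem_bytes" -lt "$need_mem" ]; then
    echo "memory: too low for $nodes node(s)"
    status=1
  fi
  if [ "$disk_kb" -lt "$need_disk_kb" ]; then
    echo "disk: less than 2GB free"
    status=1
  fi
  [ "$status" -eq 0 ] && echo "ok"
  return "$status"
}
```

On a Linux host where Docker's root directory is /var/lib/docker (an assumption; Docker Desktop keeps it inside its VM), the inputs could be gathered with something like: check_kind_resources "$(docker info --format '{{.MemTotal}}')" "$(df -Pk /var/lib/docker | awk 'NR==2 {print $4}')" 3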

@BenTheElder
Member

This error just means that kubeadm was unable to bring up the control plane successfully, which generally means the host environment is unhealthy.

@BenTheElder BenTheElder added the triage/needs-information Indicates an issue needs more information in order to work on it. label Mar 24, 2020
@SmadusankaB

SmadusankaB commented Mar 24, 2020

@BenTheElder I appreciate your help.

I'm getting that error when I try to create a cluster (one master node and two worker nodes) using a config file.

  • Logs on master-node
    INFO: ensuring we can execute /bin/mount even with userns-remap
    INFO: remounting /sys read-only
    INFO: making mounts shared
    INFO: fix cgroup mounts for all subsystems
    INFO: clearing and regenerating /etc/machine-id
    Initializing machine ID from random generator.
    INFO: faking /sys/class/dmi/id/product_name to be "kind"
    INFO: faking /sys/class/dmi/id/product_uuid to be random
    INFO: faking /sys/devices/virtual/dmi/id/product_uuid as well
    Failed to find module 'autofs4'
    systemd 242 running in system mode. (+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD +IDN2 -IDN +PCRE2 default-hierarchy=hybrid)
    Detected virtualization docker.
    Detected architecture x86-64.
    Failed to create symlink /sys/fs/cgroup/cpu: File exists
    Failed to create symlink /sys/fs/cgroup/cpuacct: File exists
    Failed to create symlink /sys/fs/cgroup/net_cls: File exists
    Failed to create symlink /sys/fs/cgroup/net_prio: File exists
    Welcome to Ubuntu 19.10!
    Set hostname to .
    Failed to bump fs.file-max, ignoring: Invalid argument
    Configuration file /kind/systemd/kubelet.service is marked world-inaccessible. This has no effect as configuration data is accessible via APIs without restrictions. Proceeding anyway.
    Configuration file /etc/systemd/system/kubelet.service.d/10-kubeadm.conf is marked world-inaccessible. This has no effect as configuration data is accessible via APIs without restrictions. Proceeding anyway.
    [ OK ] Listening on Journal Socket (/dev/log).
    [ OK ] Reached target Swap.
    [ OK ] Reached target Slices.
    [ OK ] Listening on Journal Socket.
    Mounting FUSE Control File System...
    Mounting Kernel Debug File System...
    Starting Apply Kernel Variables...
    Mounting Huge Pages File System...
    [ OK ] Listening on Journal Audit Socket.
    [ OK ] Reached target Sockets.
    Starting Journal Service...
    Starting Remount Root and Kernel File Systems...
    [UNSUPP] Starting of Arbitrary Exec…Automount Point not supported.
    [ OK ] Started Dispatch Password …ts to Console Directory Watch.
    [ OK ] Reached target Local Encrypted Volumes.
    [ OK ] Reached target Paths.
    Starting Create list of re…odes for the current kernel...
    [ OK ] Started Create list of req… nodes for the current kernel.
    [ OK ] Started Remount Root and Kernel File Systems.
    Starting Update UTMP about System Boot/Shutdown...
    Starting Create System Users...
    [ OK ] Started Update UTMP about System Boot/Shutdown.
    [ OK ] Mounted Kernel Debug File System.
    [ OK ] Mounted FUSE Control File System.
    [ OK ] Started Apply Kernel Variables.
    [ OK ] Mounted Huge Pages File System.
    [ OK ] Started Create System Users.
    Starting Create Static Device Nodes in /dev...
    [ OK ] Started Create Static Device Nodes in /dev.
    [ OK ] Reached target Local File Systems (Pre).
    [ OK ] Reached target Local File Systems.
    [ OK ] Started Journal Service.
    Starting Flush Journal to Persistent Storage...
    [ OK ] Reached target System Initialization.
    [ OK ] Started Daily Cleanup of Temporary Directories.
    [ OK ] Reached target Timers.
    [ OK ] Reached target Basic System.
    [ OK ] Started kubelet: The Kubernetes Node Agent.
    Starting containerd container runtime...
    [ OK ] Started containerd container runtime.
    [ OK ] Reached target Multi-User System.
    [ OK ] Reached target Graphical Interface.
    Starting Update UTMP about System Runlevel Changes...
    [ OK ] Started Flush Journal to Persistent Storage.
    [ OK ] Started Update UTMP about System Runlevel Changes.

  • docker info results
    Client:
    Debug Mode: false
    Server:
    Containers: 25
    Running: 21
    Paused: 0
    Stopped: 4
    Images: 29
    Server Version: 19.03.5
    Storage Driver: overlay2
    Backing Filesystem: extfs
    Supports d_type: true
    Native Overlay Diff: true
    Logging Driver: json-file
    Cgroup Driver: cgroupfs
    Plugins:
    Volume: local
    Network: bridge host ipvlan macvlan null overlay
    Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
    Swarm: inactive
    Runtimes: runc
    Default Runtime: runc
    Init Binary: docker-init
    init version: fec3683
    Security Options:
    seccomp
    Profile: default
    Kernel Version: 4.19.76-linuxkit
    Operating System: Docker Desktop
    OSType: linux
    Architecture: x86_64
    CPUs: 6
    Total Memory: 7.777GiB
    Name: docker-desktop
    Docker Root Dir: /var/lib/docker
    Debug Mode: true
    File Descriptors: 157
    Goroutines: 146
    System Time: 2020-03-24T08:10:00.574751724Z
    EventsListeners: 3
    HTTP Proxy: gateway.docker.internal:3128
    HTTPS Proxy: gateway.docker.internal:3129
    Registry: https://index.docker.io/v1/
    Labels:
    Experimental: false
    Insecure Registries:
    127.0.0.0/8
    Live Restore Enabled: false
    Product License: Community Engine

  • kind version results
    kind v0.7.0 go1.13.6 darwin/amd64

  • system_profiler SPSoftwareDataType results
    System Version: macOS 10.15.3 (19D76)
    Kernel Version: Darwin 19.3.0

@SmadusankaB

SmadusankaB commented Mar 24, 2020

After changing the containerPath from /etc/containerd/config.toml to /dev/mapper/config.toml, it worked.

@BenTheElder
Member

BenTheElder commented Mar 24, 2020

@Seshirantha I would recommend using something like this in the kind config instead of trying to mount the config file:

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
containerdConfigPatches: 
- |-
  [plugins."io.containerd.grpc.v1.cri".registry.mirrors."localhost:5000"]
    endpoint = ["http://blah:5000"]

(replace the patch contents with whatever you're setting in this config file)
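
To try the suggestion above, the config can be saved to a file and passed to kind with --config. A minimal sketch: the function name write_kind_config and the default file name kind-config.yaml are illustrative, and the mirror endpoint is the same placeholder used in the comment above.

```shell
#!/bin/sh
# Writes the kind config from the comment above to a file and prints its path.
# The default file name is illustrative only.
write_kind_config() {
  config_file=${1:-kind-config.yaml}
  cat > "$config_file" <<'EOF'
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
containerdConfigPatches:
- |-
  [plugins."io.containerd.grpc.v1.cri".registry.mirrors."localhost:5000"]
    endpoint = ["http://blah:5000"]
EOF
  echo "$config_file"
}
```

The cluster would then be created with: kind create cluster --config "$(write_kind_config)"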

@BenTheElder
Member

closing due to lack of follow-up details.

@miry

miry commented Apr 16, 2020

How can I debug this?
The log output stops at:

[  OK  ] Started Update UTMP about System Runlevel Changes.

How can I check whether the kubeadm command runs after the init process, and where are the init logs?

@BenTheElder
Member

kind create cluster --retain -v 1
kind export logs
lots of log files there.

check against https://kind.sigs.k8s.io/docs/user/known-issues/
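
The two commands above can be combined so that logs are exported even when creation fails. This is a sketch under assumptions: debug_kind_create and the KIND_BIN variable are hypothetical names introduced here, and the log directory is arbitrary. Setting KIND_BIN=echo previews the commands without running kind.

```shell
#!/bin/sh
# Hypothetical wrapper: create a cluster with --retain so the node containers
# survive a failed create, then export their logs either way.
# Usage: debug_kind_create [cluster_name] [log_dir]
debug_kind_create() {
  name=${1:-kind}
  log_dir=${2:-./kind-logs}
  kind_bin=${KIND_BIN:-kind}   # override (e.g. KIND_BIN=echo) for a dry run
  create_status=0
  # --retain keeps the nodes around for inspection if creation fails
  "$kind_bin" create cluster --name "$name" --retain -v 1 || create_status=$?
  # export logs from the nodes, whether or not creation succeeded
  "$kind_bin" export logs "$log_dir" --name "$name"
  return "$create_status"
}
```

Once the logs have been inspected, the retained nodes can be removed with kind delete cluster --name followed by the same cluster name.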

@miry

miry commented Apr 16, 2020

@BenTheElder thanks for the command.

I see the logs. It seems the kubelet could not be started.

Because the cluster was not created, I could not get the kubelet logs:

 $ kind export logs           
ERROR: unknown cluster "kind"

Is there any other way to extract logs during the boot?

@miry

miry commented Apr 16, 2020

Another warning from the logs:

I0416 20:58:34.895732      93 checks.go:649] validating whether swap is enabled or not
	[WARNING Swap]: running with swap on is not supported. Please disable swap

@BenTheElder
Member

BenTheElder commented Apr 16, 2020 via email

@BenTheElder
Member

warnings are not failures.

swap is a normal warning: kubernetes doesn't support running with swap on, but we configure it to allow this and it works, except for memory limits

you will also see failures early on due to cni not being configured.

kubeadm has many "normal" kubelet crashes early on, it's part of the design of kubeadm + kubelet that kubelet just restarts many times.

@BenTheElder
Member

please open a new support issue or join the slack for more interactive support help, I do not monitor closed issues as reliably.

@TomLan42

please check that the name of your cluster does not contain any underscores...
I bumped into the same error when I included an '_' in my cluster name
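
A name can be checked before calling kind. The rule below (lowercase letters, digits, '.' and '-') is an assumption chosen for illustration; this thread only establishes that an underscore in the name triggered the error. The function name valid_cluster_name is hypothetical.

```shell
#!/bin/sh
# Hypothetical pre-check: accept only lowercase letters, digits, '.' and '-'.
# The exact rule is an assumption; the thread only reports that '_' fails.
valid_cluster_name() {
  case "$1" in
    ''|*[!a-z0-9.-]*) return 1 ;;   # empty, or contains a disallowed character
    *) return 0 ;;
  esac
}
```

For example: valid_cluster_name "$name" && kind create cluster --name "$name"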

@BenTheElder
Member

BenTheElder commented Sep 25, 2020 via email

@dbousquet

dbousquet commented Oct 23, 2020

Hello,
I faced similar issue (I'm using Fedora 32), I solved it with:
sysctl net.bridge.bridge-nf-call-iptables=0
sysctl net.bridge.bridge-nf-call-arptables=0
sysctl net.bridge.bridge-nf-call-ip6tables=0
systemctl restart docker

Note: to make these kernel settings persistent, add them to sysctl.conf.

@dnrkcss

dnrkcss commented Dec 28, 2020

please check that the name of your cluster does not contain any underscores...
I bumped into the same error when I included an '_' in my cluster name

this one worked for me, thank you!

@viktor-slavchev-vmware

Had the same issue. It was resolved after increasing the RAM available to Docker.

@nfelicio1976

I used this command and it worked :)

kind create cluster --retain -v 1

@lkravi

lkravi commented Jun 9, 2021

I experienced the same issue on macOS with Docker Desktop. Once I increased the resources (basically, added more memory), the issue went away.

@piyushvj

Same issue:

ERROR: failed to create cluster: failed to init node with kubeadm: command "docker exec --privileged kind-control-plane kubeadm init --skip-phases=preflight --config=/kind/kubeadm.conf --skip-token-print --v=6" failed with error: exit status 1

If the issue is caused by memory, can someone tell me how to increase it? I'm running Ubuntu via WSL2.

@BenTheElder
Member

BenTheElder commented Jun 21, 2021

Hi: please see: #1437 (comment)

I am going to lock this issue now, please file new issues with your specific environment details. The current issue title is a symptom not a specific bug, and it is difficult to help with N different bugs threaded within the same issue.

Regarding the specific issues so far:

For everyone else:

If you find this issue and you are having issues creating a cluster:

  1. Please reference https://kind.sigs.k8s.io/docs/user/known-issues/ first
  2. If you can't find anything there, please do file a new issue, fill out the template and include all of the relevant info including:
  • your operating system and version
  • your docker or podman version and docker info / podman info so we can see if you are e.g. trying to use rootless etc.
  • your kind version, so we know which version you're using
  • your exact command / configuration used, so we can see what image you're using etc.

We can help diagnose your specific circumstances on your issue, and if it turns out to be a unique bug in kind it will help us track fixing it. Otherwise we can also track your specific problem for updating the known issues guide, or otherwise make it easier for users to find specific problems instead of broad "cluster won't start, kubeadm failed" issues that are non-specific.

@kubernetes-sigs kubernetes-sigs locked as off-topic and limited conversation to collaborators Jun 21, 2021
@BenTheElder
Member

BenTheElder commented Jun 21, 2021

For WSL2 (#1437 (comment)) please start with https://kind.sigs.k8s.io/docs/user/using-wsl2/, which references memory settings.
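
As an illustration of the kind of memory setting involved: WSL2 reads VM-wide limits from a .wslconfig file in the Windows user profile. The sketch below only prints such a file. The function name print_wslconfig and the 8GB default are assumptions for illustration, not values from the linked page, which should be consulted for the recommended settings.

```shell
#!/bin/sh
# Prints a .wslconfig that sets the memory given to the WSL2 VM.
# The 8GB default is an arbitrary example value, not a recommendation.
print_wslconfig() {
  memory=${1:-8GB}
  printf '[wsl2]\nmemory=%s\n' "$memory"
}
```

From inside WSL2 the output could be redirected to the .wslconfig file under the Windows user profile (this overwrites any existing file), followed by wsl --shutdown from Windows so that the new limit is picked up on the next start.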

@aojea
Contributor

aojea commented Jun 21, 2021

  • cc @aojea re: #1437 (comment), we should probably update docs somewhere for fedora regarding this / a known issues-entry?

I don't think that is a good idea, because it completely disables iptables on the host bridges. It will certainly work for that user if their environment adds more iptables rules to the bridges, but we shouldn't promote it on the website because it can also cause other undesired behaviour, e.g. legitimate iptables rules that drop traffic on a bridge would stop working.

I think the current known issues and documentation are OK: install docker and configure firewalld correctly, see https://kind.sigs.k8s.io/docs/user/known-issues/#fedora32-firewalld
