Cannot start container: subnet sandbox join failed for "10.0.0.0/24": error creating vxlan interface: file exists #945
Comments
This also happens to us at random; we have no known way to reproduce it.
Docker info from the third node of our swarm cluster, the only one that has the problem at the moment.
Cluster advertise: :4243 EDIT :
|
Saw this again today, after destroying and re-creating my Oh, it now says Restarting docker didn't work for me like it did for @arteal, so I ended up rebooting all the hosts again. |
Happened to us again. Restarting docker didn't help, and neither did stopping docker, flushing iptables, and starting docker again. We had to reboot that machine. |
Same as @arteal for me. I'll try to get more info next time...
Client:
Version: 1.10.3
API version: 1.22
Go version: go1.5.3
Git commit: 20f81dd
Built: Thu Mar 10 15:54:52 2016
OS/Arch: linux/amd64
Server:
Version: 1.10.3
API version: 1.22
Go version: go1.5.3
Git commit: 20f81dd
Built: Thu Mar 10 15:54:52 2016
OS/Arch: linux/amd64
Containers: 1
Running: 1
Paused: 0
Stopped: 0
Images: 9
Server Version: 1.10.3
Storage Driver: aufs
Root Dir: /data/.graph/var/lib/releases/20160316_1458121754/aufs
Backing Filesystem: extfs
Dirs: 210
Dirperm1 Supported: true
Execution Driver: native-0.2
Logging Driver: json-file
Plugins:
Volume: local
Network: overlay null host bridge
Kernel Version: 3.16.0-67-generic
Operating System: Ubuntu 14.04.1 LTS
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 1.421 GiB
Name: xxxxxxx
ID: YPJJ:UE7N:FYIZ:AOXA:EXVL:7B7I:67U7:YUPK:NOZB:WMLJ:J4YH:GWOV
WARNING: No swap limit support
Labels:
project=xxxxxx
env=test
type=bastion
Cluster store: consul://x.x.x.x:8502/network
Cluster advertise: z.z.z.z:2375
Linux xxxxxxxx 3.16.0-67-generic #87~14.04.1-Ubuntu SMP Fri Mar 11 00:26:02 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
Environment details (AWS, VirtualBox, physical, etc.): Physical |
I just experienced this issue. First I got: Then after a few tries it switched to saying: Doing a "service docker stop; service docker start" helped. docker info: Containers: 34 docker version: Client: Server: |
I keep having this issue. Now even restarting the docker daemon doesn't help. |
@larsla |
For those hitting the |
Closed via #1065, which is vendored into docker/docker. Can someone try docker/docker master and confirm the fix? |
- Fixes moby/libnetwork#1051
- Fixes moby/libnetwork#985
- Fixes moby/libnetwork#945
- Log time taken to set sandbox key
- Limit number of concurrent DNS queries

Signed-off-by: Madhu Venugopal <madhu@docker.com>
(cherry picked from commit 90bb530)
@mavenugo Unfortunately, once the problem has manifested, it's too late. I saw this today after upgrading my swarm hosts to docker 1.10.3, and if I'm reading the PR correctly, it prevents the problem from occurring, but it won't help for me to upgrade docker and try again. So I ended up rebooting the hosts again. I greatly look forward to the release of 1.11.0, though, so this can hopefully be in the past. :-) |
@brettdh I noticed the same. Only a reboot of the host fixes it, but that is not an option in production. The process can take hours in an enterprise IT environment where the vendor (us) has no direct control over the hypervisor. |
Same here.
|
This is still occurring on 1.11.1. (The |
And once again today. ping @mavenugo |
@thaJeztah @mavenugo This should be reopened as it can be reproduced using the below script.
My env is :
|
@sebi-hgdata Can you try this same script with 1.12-rc2? We have added changes in it that should fix this. If the problem is still present, then at least there will be an opportunity to fix it completely before the 1.12 release is out. |
@mrjana It seems the issue does not reproduce with 1.12-rc2; all containers restart properly during testing. However, I see a lot of errors like
after letting the script kill the daemon a couple of times, killing the script, stopping the docker daemon and starting it up again... after this no container is restarted because they failed with the above error. |
@sebi-hgdata Thanks for testing. Yes, please create another issue for the new bug you are seeing. We will make sure it is fixed before the 1.12 release. |
I've just experienced it in a Swarm cluster running 1.12.3.
It occurred when I did:
|
Still seeing this on swarm 1.2.5 + Docker 12.2-rc1 |
@mavenugo we're also seeing this quite often in PWD (play-with-docker) using 1.13rc1 |
Had the file exists error with
with docker engine in swarm mode. After restarting docker service everything worked. |
@Michael-Hamburger / all: this will be fixed by #1574. I've applied this patch manually and haven't had any issues so far. |
I'm running into this problem on my 1.12.3 swarm. Is there a workaround for when it happens? I've tried removing and adding the problematic network(s) without success. Is there something short of a full restart of the swarm hosts that can alleviate this? |
Yes, if you restart the daemon the problem should be gone. @mavenugo. But if you keep removing and creating networks, you'll come across it again. |
I saw this just now on Docker version 17.05.0-ce, build 89658be. Restarting the docker daemon does not fix it. "starting container failed: subnet sandbox join failed for "10.0.2.0/24": error creating vxlan interface: file exists" |
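For anyone collecting these failures from daemon logs, the message quoted above has a regular shape. Here is a minimal Python sketch for pulling out the subnet and the underlying cause. The function name `parse_sandbox_join_error` and the regular expression are illustrative assumptions based only on the message text in this thread, not anything libnetwork exposes or guarantees:

```python
import re
from typing import Optional, Tuple

# Pattern inferred from the message text quoted in this thread (an assumption,
# not a format guaranteed by libnetwork).
_JOIN_FAILED = re.compile(
    r'subnet sandbox join failed for "(?P<subnet>[^"]+)": (?P<cause>.+)'
)


def parse_sandbox_join_error(message: str) -> Optional[Tuple[str, str]]:
    """Return (subnet, cause) if the message matches the pattern, else None."""
    match = _JOIN_FAILED.search(message)
    if match is None:
        return None
    # Strip whitespace and any closing quote left over from a quoted log line.
    cause = match.group("cause").strip().rstrip('"')
    return match.group("subnet"), cause
```

Grouping reports by the returned subnet might help tell whether one overlay network or several are affected on a given host.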
@alexanderkjeldaas Can you open a new issue with details? It is possibly different from the one resolved by #1574. |
I took the liberty of opening a new thread as I, too, am encountering this issue on 17.05-ce (and have on earlier releases). See #1765 for info. |
@arteal |
Same issue: |
A similar problem is reproduced now, June 2019. docker info: Containers: 0
|
Still happening on Docker swarm on Ubuntu 18.04
|
This error still occurs occasionally in Docker 20.x; a VM restart is typically required. |
(Similar, but closed, issues: #562, #751)
Very occasionally, I see this error message when starting a container in my swarm:
This error persists until I reboot the docker hosts. A comment on #751 suggested that restarting iptables would suffice; I have not tried this yet. I also have tried the solution mentioned in #562 previously, and I believe that worked as well, but I cannot remember for sure.
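Since the message says the vxlan interface already exists, one thing worth capturing before a reboot is which vxlan links are present on the host. Here is a minimal sketch, assuming a Linux host with iproute2's `ip` on the PATH. The helper names are hypothetical, and the idea that a leftover link is what triggers the error is an assumption, not something confirmed in this report:

```python
import subprocess
from typing import List


def parse_link_names(ip_oneline_output: str) -> List[str]:
    """Extract interface names from `ip -o link show` style output.

    Each line is expected to look like "<index>: <name>[@<peer>]: <flags> ...".
    """
    names = []
    for line in ip_oneline_output.splitlines():
        parts = line.split(": ", 2)
        if len(parts) < 2:
            continue  # skip blank or unexpected lines
        # Drop the "@peer" suffix that ip prints for links with a parent.
        names.append(parts[1].split("@", 1)[0].strip())
    return names


def list_vxlan_links() -> List[str]:
    """List vxlan links visible in the current network namespace only."""
    result = subprocess.run(
        ["ip", "-o", "link", "show", "type", "vxlan"],
        capture_output=True, text=True, check=True,
    )
    return parse_link_names(result.stdout)
```

Calling `list_vxlan_links()` on an affected host while the error is occurring, and attaching the result to a comment, may give maintainers something concrete to compare. It only sees the namespace it runs in, so links living in other network namespaces would not show up.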
docker version:
docker info:
Note: this was not captured when the error was occurring. If it happens again, I will comment with the info.
uname -a:
Darwin <hostname> 15.3.0 Darwin Kernel Version 15.3.0: Thu Dec 10 18:40:58 PST 2015; root:xnu-3248.30.4~1/RELEASE_X86_64 x86_64
Linux <hostname> 3.16.0-30-generic #40~14.04.1-Ubuntu SMP Thu Jan 15 17:43:14 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
Environment details (AWS, VirtualBox, physical, etc.):
How reproducible: