Do not allow overlay destroySandbox() to be interrupted #1065

Merged: 1 commit, Mar 31, 2016

@aboch (Contributor) commented Mar 31, 2016

  • A concurrent leave and join on an overlay network with a single member can fail with the error:
    "subnet sandbox join failed for "A.B.C.D/MM": error creating vxlan interface: file exists"
    This happens when the join is processed after the leave has already started.
    Because the network has only one member, the leave brings n.joinCnt to 0, which resets the once
    variable for each of the network's subnets and triggers the sandbox destroy for each subnet's
    vxlan interface. Since destroySandbox() is not atomic, the join thread can trigger the creation
    of the vxlan interface in between (subnet.once has already been re-initialized), before the
    leave thread has removed the existing vxlan interface for this subnet.
  • The fix is to not allow interruptions between the re-initialization of the subnet.once variable
    and the subsequent removal of the vxlan interface.
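
The idea behind the fix can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions, not the libnetwork code: the `network` and `subnet` types, the `links` map standing in for the host's vxlan interfaces, the interface name `vx-demo`, and the helper `raceRounds` are all hypothetical. Only `joinCnt` and `once` mirror names used in the description above. The point it shows is that the reset of `subnet.once` and the interface removal happen inside one critical section that the join path also has to enter.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// subnet is a simplified, hypothetical model of one overlay subnet.
type subnet struct {
	vxlanName string
	once      *sync.Once
}

// network is a simplified, hypothetical model of a one-subnet overlay network.
// links stands in for the host's vxlan interfaces and is guarded by mu.
type network struct {
	mu      sync.Mutex
	joinCnt int
	subnets []*subnet
	links   map[string]bool
}

func newNetwork() *network {
	return &network{
		subnets: []*subnet{{vxlanName: "vx-demo", once: &sync.Once{}}},
		links:   map[string]bool{},
	}
}

// createLink fails like the kernel would if the interface already exists.
// Callers must hold n.mu.
func (n *network) createLink(name string) error {
	if n.links[name] {
		return errors.New("error creating vxlan interface: file exists")
	}
	n.links[name] = true
	return nil
}

// join creates each subnet's interface at most once per value of subnet.once.
func (n *network) join() error {
	n.mu.Lock()
	defer n.mu.Unlock()
	n.joinCnt++
	for _, s := range n.subnets {
		var err error
		s.once.Do(func() { err = n.createLink(s.vxlanName) })
		if err != nil {
			return fmt.Errorf("subnet sandbox join failed: %v", err)
		}
	}
	return nil
}

// leave re-initializes subnet.once and removes the interface without
// releasing the lock in between, so a join cannot slip in between the two.
func (n *network) leave() {
	n.mu.Lock()
	defer n.mu.Unlock()
	n.joinCnt--
	if n.joinCnt != 0 {
		return
	}
	for _, s := range n.subnets {
		s.once = &sync.Once{}
		delete(n.links, s.vxlanName)
	}
}

// raceRounds runs a leave and a join concurrently on a one-member network,
// rounds times, and returns the first join error it sees (if any).
func raceRounds(rounds int) error {
	n := newNetwork()
	if err := n.join(); err != nil {
		return err
	}
	for i := 0; i < rounds; i++ {
		var wg sync.WaitGroup
		var joinErr error
		wg.Add(2)
		go func() { defer wg.Done(); n.leave() }()
		go func() { defer wg.Done(); joinErr = n.join() }()
		wg.Wait()
		if joinErr != nil {
			return joinErr
		}
	}
	return nil
}

func main() {
	if err := raceRounds(1000); err != nil {
		fmt.Println("join failed:", err)
		return
	}
	fmt.Println("no join failures")
}
```

In this model, if `leave` released the lock after resetting `once` and re-acquired it before deleting the link, a concurrent `join` could run `createLink` while the old entry is still present, which corresponds to the "file exists" failure described above.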

Steps to reproduce (one node is enough):

  • create an overlay network
  • run two containers in daemon mode on the overlay network
  • disconnect one container from the network
  • in parallel, disconnect the remaining container and reconnect the container disconnected in the previous step

Signed-off-by: Alessandro Boch <aboch@docker.com>

@mrjana (Contributor) commented Mar 31, 2016

LGTM pending CI green

@mavenugo (Contributor) commented

LGTM

@mavenugo mavenugo merged commit f7e3338 into moby:master Mar 31, 2016
sanimej pushed a commit to sanimej/libnetwork that referenced this pull request Apr 4, 2016
Do not allow overlay destroySandbox() to be interrupted

Signed-off-by: Santhosh Manohar <santhosh@docker.com>
@aboch aboch deleted the ov branch August 5, 2016 06:06