Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure volumes reacquire locks on state refresh #4624

Merged

Conversation

mheon
Copy link
Member

@mheon mheon commented Dec 3, 2019

After a restart, pods and containers both run a refresh() function to prepare to run after a reboot. Until now, volumes have not had a similar function, because they had no per-boot setup to perform.

Unfortunately, this was not noticed when in-memory locking was introduced to volumes. The refresh() routine is, among other things, responsible for ensuring that locks are reserved after a reboot, ensuring they cannot be taken by a freshly-created container, pod, or volume. If this reservation is not done, we can end up with two objects using the same lock, potentially needing to lock each other for some operations - classic recipe for deadlocks.

Add a refresh() function to volumes to perform lock reservation and ensure it is called as part of overall refresh().

Fixes #4605
Fixes #4621

After a restart, pods and containers both run a refresh()
function to prepare to run after a reboot. Until now, volumes
have not had a similar function, because they had no per-boot
setup to perform.

Unfortunately, this was not noticed when in-memory locking was
introduced to volumes. The refresh() routine is, among other
things, responsible for ensuring that locks are reserved after a
reboot, ensuring they cannot be taken by a freshly-created
container, pod, or volume. If this reservation is not done, we
can end up with two objects using the same lock, potentially
needing to lock each other for some operations - classic recipe
for deadlocks.

Add a refresh() function to volumes to perform lock reservation
and ensure it is called as part of overall refresh().

Fixes containers#4605
Fixes containers#4621

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
@openshift-ci-robot
Copy link
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mheon

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/M labels Dec 3, 2019
@rhatdan
Copy link
Member

rhatdan commented Dec 3, 2019

@giuseppe
Copy link
Member

giuseppe commented Dec 3, 2019

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Dec 3, 2019
@openshift-merge-robot openshift-merge-robot merged commit 309452d into containers:master Dec 3, 2019
@github-actions github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Sep 26, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 26, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Fedora 31: "podman start" hangs podman fails to start container after reboot if using volume
5 participants