-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
libct/cg/sd/v1: fix freezeBeforeSet (alt 2) #3166
libct/cg/sd/v1: fix freezeBeforeSet (alt 2) #3166
Conversation
Can we also add the knob on top of this? :) |
dec9ced
to
a5c822a
Compare
Done. I can't say I am the biggest fan of this complex monster we have created, but this seems to cover all the use cases in the best known way. |
a5c822a
to
f2d67bc
Compare
I am a 👍 on this approach. @odinuge @AkihiroSuda ptal |
Yeah, I think this approach is ok overall. 👍 |
libcontainer/cgroups/systemd/v1.go
Outdated
// hasDeviceRules checks cgroup device rules. Returns true if any rules | ||
// other than "allow all" exist. Used by freezeBeforeSet. | ||
func (m *legacyManager) hasDeviceRules() bool { | ||
const allowAll = "a *:* rwm\n" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you have a deny-list rule set, this won't be able to detect it. Unfortunately there isn't a nice solution for this really (there's no real way to figure out what rules are denied if a cgroup is in default-allow mode). If this shortcut really is required, then okay (though this behaviour does at least deserve some kind of comment) but I'd prefer if we didn't have to do this if at all possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you have a deny-list rule set, this won't be able to detect it.
Are you talking about rules from the spec? Note that this code is only used when we have SkipDevices set.
A long explanation of this commit is below.
If all three conditions below is true
- we are setting properties of an existing systemd unit,
- we have
SkipDevices
set (e.g. no device rules to be applied) devices.list
file of unit's cgroup contains nothing buta *:* rwm
plus a newline,
then systemd will not be writing any device rules in a manner resulting in spurious "access denied" errors.
This is so because (2) means we are not setting any device rules, and (1) and (3) means no rules were ever set before for this systemd unit. This also relies on assumption that systemd properties are in sync with the cgroup state.
Now, if the above (1) && (2) && (3) is not true, we fall back to the slow path of checking systemd unit properties. This happens e.g. if a parent cgroup has some device access restrictions.
Now, I am more than ready to drop this shortcut, especially since we have SkipFreezeOnSet flag which is a "shorter shortcut" (if I can say so), but I am just trying to understand what am I missing here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Commit removed; PTAL @cyphar
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is so because (2) means we are not setting any device rules, and (1) and (3) means no rules were ever set before for this systemd unit. This also relies on assumption that systemd properties are in sync with the cgroup state.
The issue is that (3) doesn't necessarily mean that no rules were set.
If you put a cgroup into "deny-list mode" (in other words, you write a *:* rwm
to devices.allow
and then rules to devices.deny
) then devices.list
always shows a *:* rwm
even if you add deny rules to the cgroup. This is a pretty big anti-feature in cgroupv1 and is one of the reasons there's quite a few hacks in the devicesemulator
code to deal with the fact that you can't tell whether a cgroup actually has no rules set if devices.list
says the rule set is a *:* rwm
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
meant to make the last review a "request changes" one
This fixes the behavior intended to avoid freezing containers/control groups without it being necessary. This is important for end users of libcontainer who rely on the behavior of no freeze. The previous implementation would always get error trying to get DevicePolicy from the Unit via dbus, since the Unit interface doesn't contain DevicePolicy. Signed-off-by: Odin Ugedal <odin@uged.al>
Add a test for freezeBeforeSet, checking various scenarios including those that were failing before the fix in the previous commit. [v2: add more cases, add a check before creating a unit.] Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This is helpful to kubernetes in cases it knows for sure that the freeze is not required (since it created the systemd unit with no device restrictions). As the code is trivial, no tests are required. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
f2d67bc
to
9a095e4
Compare
(Delayed LGTM.) |
This is a combo of #3143,
#3151,and #3161, which aims to be both fast and correct for all use cases.From #3143 -- fixes to getting and checking device properties;
From #3151 -- fast(er) path in freezeBeforeSet (in case devices.list is empty we can skip getting unit properties);From #3161 -- SkipFreezeOnSet knob for kubernetes.
From #3148 -- test cases (those are also cherry-picked in #3143)
Aimed to fix kubernetes/kubernetes#104280 (once we backport, release 1.0.2, and vendor it to kubernetes)
1.0 backport: #3167