-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] libct/cg/sd/v1: Set: don't leave cgroup frozen #3072
Conversation
I. In case a parent cgroup is frozen, the current code reads the cgroup state (into targetFreezerState) as FROZEN, and sets it explicitly after setting unit properties. This is obviously wrong, as we should not explicitly freeze the cgroup because its parent was frozen. The issue happens because: 1. The m.GetFreezerState method does not distinguish between "self frozen" and "parent frozen" states (i.e. it does not consult freezer.self_freezing). 2. The m.Freeze method changes m.cgroups.Resources.Frozen field. II. The current code does freeze/thaw unconditionally, meaning it freezes the already frozen cgroup (which is a no-op except it still requires some reads and writes to freezer.state), and thaws the cgroup which is about to be frozen. Solve all this by figuring out whether we need to freeze and thaw, and introducing and using a method which does not change value of m.cgroups.Resources.Frozen. NOTE the current code assume that the container that is currently frozen will remain frozen for the duration of SetUnitProperties. This might not be true but there's no way to guarantee it (even in case we freeze the cgroup ourselves). Reported-by: Odin Ugedal <odin@uged.al> Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Initial test work was done by Kir Kolyshkin (@kolyshkin). Co-Authored-By: Kir Kolyshkin <kolyshkin@gmail.com> Signed-off-by: Odin Ugedal <odin@uged.al>
if targetFreezerState == configs.Undefined { | ||
targetFreezerState = configs.Thawed | ||
needsFreeze := true | ||
if freezerState == configs.Frozen { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could result in a race between the freezing of the current cgroup and one of its ancestors.
If one ancestor is FROZEN, this will not freeze the control group. If the parent is thowed before systemd does its device update sequence, things will not work as expected.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This does also not fix the difference between cgroup v1 and v2.
If the parent control group is frozen, and the control group itself is not, GetFreezerState
will return FROZEN
on v1 and THAWED
on v2. I don't think the naming or definition is important, other than the fact that it works the same way on both.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If that make sense to you @kolyshkin and @cyphar?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But overall, I agree that always thawing the cgroup is a valid way to do it. If people want to freeze it.
However, doing:
runc pause container
runc update container
# Contianer will now still be paused
runc resume container
now works as expected, but with your pr it will just resume the container on update (or)?
edit: that should work as expected
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could result in a race between the freezing of the current cgroup and one of its ancestors.
You are right, and I have anticipated that -- this is noted in the commit and PR description.
t.Fatalf("expected container cgroup path %q to be under pod cgroup path %q", | ||
cm.Path("freezer"), pm.Path("freezer")) | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Depending on the definition of GetFreezerState
, we should do a check of its output here as well, to make sure it works the same on cgroup v1 and v2.
cf, err := cm.GetFreezerState()
if err != nil {
t.Fatal(err)
}
if cf != configs.Thawed {
t.Fatalf("expected container to be thawed, got %v", cf)
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch; will add.
Closing in favor of #3080 which carries this one. |
This is the alternative to #3065 with a different approach and the same test case.
I. In case a parent cgroup is frozen, the current code reads the cgroup
state (into
targetFreezerState
) asFROZEN
, and sets it explicitly aftersetting unit properties. This is obviously wrong, as we should not
explicitly freeze the cgroup because its parent was frozen.
The issue happens because:
The
m.GetFreezerState
method does not distinguish betweenthe "self frozen" and "parent frozen" states (i.e. it does not
consult
freezer.self_freezing
).The
m.Freeze
method changesm.cgroups.Resources.Frozen
field.II. The current code does freeze/thaw unconditionally, meaning it
freezes the already frozen cgroup (which is a no-op except it
still requires some reads and writes to
freezer.state
), andthaws the cgroup which is about to be frozen.
Solve all this by figuring out whether we need to freeze and thaw,
and introducing and using a method which does not change value of
m.cgroups.Resources.Frozen
.Add a test case (initially developed in #3065 (comment), and tested to fail without the fix in #3066).
NOTE the current code assume that the container that is currently frozen
will remain frozen for the duration of
SetUnitProperties
. This might notbe true but there's no way to guarantee it (even in case we freeze the
cgroup ourselves).