Support Linux cgroup v2 #3117
Comments
Hi, thank you for opening the issue. We are looking into the CloudWatch metrics error described in your linked containers roadmap issue [1].

We suspect the root cause is a change in the Docker Engine API response. The Agent depends on Docker for creating container cgroups, and the ECS Agent uses the Docker ContainerStats API for monitoring and collecting container statistics [2]. The API response model is different if Docker uses cgroup v2 instead of v1 [3]. Specifically, to fix this particular issue, the Agent needs to be able to handle the Docker Engine ContainerStats API response model for both cgroup v1 and v2.

The Agent itself does not technically need to support cgroup v2 in order to publish container metrics. However, if the Agent does not support cgroup v2, it won't be able to enforce task-level cgroup resource limits or pass them down to the container cgroups. If we deliver the fix for the Docker stats API change without Agent cgroup v2 support, customers will have visibility into task resource consumption but no full control over it, which is a rather incomplete experience. Therefore, the stats API fix will likely be part of the greater effort of supporting cgroup v2 in the Agent. We will update this thread once we have a concrete plan for providing the support.

Another thing we would like to call out regarding possible workarounds: it should still be possible to use the latest Flatcar release, but with cgroups v2 disabled [5].

[1] aws/containers-roadmap#1535
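To illustrate the point about the two response models, here is a minimal Go sketch of decoding a stats payload without requiring the per-CPU breakdown. The JSON field names (`cpu_stats`, `cpu_usage`, `total_usage`, `percpu_usage`, `system_cpu_usage`, `online_cpus`) follow the Docker Engine API stats response; the struct names, `parseStats`, the validation rule, and the sample payloads are hypothetical and are not taken from the ECS Agent code.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// cpuUsage mirrors only the part of the Docker stats JSON this sketch needs.
type cpuUsage struct {
	TotalUsage uint64 `json:"total_usage"`
	// PercpuUsage is optional: Docker does not set it on cgroup v2 hosts.
	PercpuUsage []uint64 `json:"percpu_usage,omitempty"`
}

type cpuStats struct {
	CPUUsage       cpuUsage `json:"cpu_usage"`
	SystemCPUUsage uint64   `json:"system_cpu_usage"`
	OnlineCPUs     uint32   `json:"online_cpus"`
}

type containerStats struct {
	CPUStats cpuStats `json:"cpu_stats"`
}

// parseStats (hypothetical helper) decodes a stats payload and validates only
// fields that exist under both cgroup v1 and v2. It deliberately does not
// require percpu_usage.
func parseStats(raw []byte) (*containerStats, error) {
	var s containerStats
	if err := json.Unmarshal(raw, &s); err != nil {
		return nil, fmt.Errorf("decode stats: %w", err)
	}
	// Assumption for this sketch: a Linux payload without system_cpu_usage
	// is treated as unusable.
	if s.CPUStats.SystemCPUUsage == 0 {
		return nil, errors.New("stats missing system_cpu_usage")
	}
	return &s, nil
}

func main() {
	// Hand-written sample payloads, for illustration only.
	samples := map[string]string{
		"cgroup v1 style": `{"cpu_stats":{"cpu_usage":{"total_usage":100,"percpu_usage":[60,40]},"system_cpu_usage":1000,"online_cpus":2}}`,
		"cgroup v2 style": `{"cpu_stats":{"cpu_usage":{"total_usage":100},"system_cpu_usage":1000,"online_cpus":2}}`,
	}
	for name, raw := range samples {
		s, err := parseStats([]byte(raw))
		if err != nil {
			fmt.Println(name, "rejected:", err)
			continue
		}
		fmt.Println(name, "accepted, per-CPU entries:", len(s.CPUStats.CPUUsage.PercpuUsage))
	}
}
```

Consumers that need a per-CPU figure could fall back to `total_usage` together with `online_cpus` when the slice is empty, though whether that is appropriate depends on how the metric is used.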
closes aws/containers-roadmap#1535
closes aws#3117

This adds support for task-level resource limits when running on unified cgroups (aka cgroups v2) with the systemd cgroup driver. Cgroups v2 introduced a cgroups format that is not backward compatible with cgroups v1. In order to support both v1 and v2, we have added a config variable to detect which cgroup version the ECS Agent is running with. The containerd/cgroups library is used to determine the mode on agent startup. Cgroups v2 can no longer provide per-CPU usage stats, so this validation was removed, since we never used those stats anyway.
* Support Unified Cgroups (cgroups v2)

  closes aws/containers-roadmap#1535
  closes #3117

  This adds support for task-level resource limits when running on unified cgroups (aka cgroups v2) with the systemd cgroup driver. Cgroups v2 introduced a cgroups format that is not backward compatible with cgroups v1. In order to support both v1 and v2, we have added a config variable to detect which cgroup version the ECS Agent is running with. The containerd/cgroups library is used to determine the mode on agent startup. Cgroups v2 can no longer provide per-CPU usage stats, so this validation was removed, since we never used those stats anyway.

* wip
* update cgroups library with nil panic bugfix
* Initialize and toggle cgroup controllers
Could somebody let us know when this issue will be fixed?
Summary
Support Linux host systems that use cgroup v2.
Description
cgroup v2 was added to the Linux kernel in 2014, as described on the Linux kernel mailing list [1], and has been adopted by Linux distributions since 2019 [2]. The ECS Agent does not work on Linux distributions that use cgroup v2; it only works with cgroup v1.
This has been mentioned on the AWS Containers Roadmap [3] and causes issues when using Flatcar Linux as the host system in ECS clusters [4][5]. The only workaround in this case is to use an older version of Flatcar, which has known security vulnerabilities as described in [6].
Expected Behavior
The ECS Agent works on Linux distributions with cgroup v2.
Observed Behavior
The ECS Agent only works on Linux distributions with cgroup v1.
Environment Details
This issue occurs, for example, when using Flatcar Linux >= v2983.2.0 [7].
Additional Links
[1] https://www.kernel.org/doc/Documentation/cgroup-v2.txt
[2] https://medium.com/nttlabs/cgroup-v2-596d035be4d7
[3] aws/containers-roadmap#1535
[4] flatcar/Flatcar#585
[5] https://www.flatcar.org/docs/latest/installing/cloud/aws-ec2/#known-issues
[6] aws/containers-roadmap#1535 (comment)
[7] https://www.flatcar.org/releases/#release-2983.2.0