Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

libct/cg/fs2: fix GetStats for unsupported hugetlb #3233

Merged
merged 1 commit into from
Oct 12, 2021

Conversation

kolyshkin
Copy link
Contributor

@kolyshkin kolyshkin commented Oct 6, 2021

In case hugetlb is not supported, GetStats() should not error out,
and yet it does.

Assume that if GetHugePageSize return an error, hugetlb is
not supported (this is what cgroup v1 manager do).

NOTE this code is not used by the runc itself, only by external users
like kubernetes and cadvisor that import libcontainer/cgroup/...

Fixes: 89a87ad
Fixes: #3232

1.0 backport: #3295

In case hugetlb is not supported, GetStats() should not error out,
and yet it does.

Assume that if GetHugePageSize return an error, hugetlb is
not supported (this is what cgroup v1 manager do).

Fixes: 89a87ad
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
if err != nil {
return fmt.Errorf("failed to fetch hugetlb info: %w", err)
}
hugePageSizes, _ := cgroups.GetHugePageSize()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a comment to clarify that we are ignoring err on purpose?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is kind of an intermediate commit, merely fixing the issue. Please see #3234 for a complete solution -- in where I remove the returned error altogether (as no one is using it), so there's no need to ignore it (and explain why).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@AkihiroSuda AkihiroSuda merged commit 2357eab into opencontainers:master Oct 12, 2021
@ctrahey
Copy link

ctrahey commented Nov 4, 2021

Thanks! What is the release timeline?

@luispabon
Copy link

Sorry to be a nag, but each of my boxes is generating daily ~4GB in logs. Any chance for a release soon?

@Letme
Copy link

Letme commented Nov 23, 2021

I am also very much interested in release as my RPI SD cards die a lot faster with this amount of errors

@nodesocket
Copy link

nodesocket commented Nov 28, 2021

👍🏻 also would love this released as I seeing my kubernetes logs in /var/log/syslog spammed with these errors.

@AkihiroSuda
Copy link
Member

Cherry-picking to v1.0.3
#3295

@AkihiroSuda AkihiroSuda added this to the 1.1.0 milestone Nov 29, 2021
@kolyshkin kolyshkin added backport/1.0-done A PR in main branch which has been backported to release-1.0 and removed backport/1.0-todo A PR in main branch which needs to be backported to release-1.0 labels Nov 30, 2021
@nodesocket
Copy link

nodesocket commented Nov 30, 2021

Once v1.0.3 is released, what's the upgrade process? I am running Debian 11 (Bullseye) on aarch64. Can I simply run sudo apt-get -y install runc?

@kolyshkin
Copy link
Contributor Author

Once v1.0.3 is released, what's the upgrade process? I am running Debian 11 (Bullseye) on aarch64. Can I simply run sudo apt-get -y install runc?

This depends on your distro and its maintainers. I am using Fedora, the runc package is being updated regularly, following the upstream (i.e. this repo).

In this case, though, it does not matter which version of runc binary is installed on the system. What matters is which version of runc's libcontainer k8s (or k3s, not sure which one are you using) is built against. It seems that both k8s and k3s are tracking upstream (this repo) closely, so chances are high that after we'll release 1.0.3 they will rebase.

@kolyshkin kolyshkin deleted the hugepage-fix branch November 30, 2021 01:54
@nodesocket
Copy link

nodesocket commented Nov 30, 2021

@kolyshkin thanks for the detailed reply.

I am using Kubernetes version 1.22.4. Does this mean I'll have to upgrade the Kubernetes version once they rebase runc v1.0.3 in?

pi@kube-master:~ $ kubectl version
Client Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.4", GitCommit:"b695d79d4f967c403a96986f1750a35eb75e75f1", GitTreeState:"clean", BuildDate:"2021-11-17T15:48:33Z", GoVersion:"go1.16.10", Compiler:"gc", Platform:"linux/arm64"}
Server Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.4", GitCommit:"b695d79d4f967c403a96986f1750a35eb75e75f1", GitTreeState:"clean", BuildDate:"2021-11-17T15:42:41Z", GoVersion:"go1.16.10", Compiler:"gc", Platform:"linux/arm64"}

@nodesocket
Copy link

@kolyshkin any response to my previous question? To get this fix, will I have to upgrade my Kubernetes version from 1.22.4 and wait for Kubernetes to rollout a new release?

@luispabon
Copy link

100% dependent on how you're deploying your kubernetes.

k3s for instance bundles the runc binary within itself, so you do need them to release on their end. Who knows how GKE or EKS nodes are built internally. Etc.

@nodesocket
Copy link

nodesocket commented Dec 8, 2021

@luispabon deployed Kubernetes using kubeadm. Not using GKE or EKS for this deployment. On-prem deploy.

cat << EOF > /tmp/kubernetes-master-config.yaml >/dev/null
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
kubernetesVersion: 1.23.0
networking:
  podSubnet: "10.244.0.0/16"
EOF
sudo kubeadm init --config /tmp/kubernetes-master-config.yaml

@thaJeztah
Copy link
Member

100% dependent on how you're deploying your kubernetes.

k3s for instance bundles the runc binary within itself, so you do need them to release on their end. Who knows how GKE or EKS nodes are built internally. Etc.

Per earlier comment of @kolyshkin, it's likely not the runc binary that makes a difference here, but the go module (build-time dependency) used to build kubernetes; looks like the master branch of kubernetes is still on v1.0.2 currently; https://github.com/kubernetes/kubernetes/blob/a8e06cf2bfd2ecdde0f2f49af4df29f70e07f20e/go.mod#L63

github.com/opencontainers/runc v1.0.2

pacoxu pushed a commit to pacoxu/kubernetes that referenced this pull request Dec 10, 2021
fix GetStats for unsupported hugetlb needed to run on RaspberryPi4 with non-hugetlb compiled kernel (standard). This includes the opencontainers/runc#3233

Signed-off-by: Paco Xu <paco.xu@daocloud.io>
Letme added a commit to Letme/kubernetes that referenced this pull request Dec 13, 2021
fix GetStats for unsupported hugetlb needed to run on RaspberryPi4 with non-hugetlb compiled kernel (standard). This includes the opencontainers/runc#3233

Used commands from hack folder to generate the new dependency:

```
hack/pin-dependency.sh github.com/opencontainers/runc v1.0.3
hack/update-vendor.sh

hack/lint-dependencies.sh
```
Letme added a commit to Letme/kubernetes that referenced this pull request Feb 11, 2022
fix GetStats for unsupported hugetlb needed to run on RaspberryPi4 with non-hugetlb compiled kernel (standard). This includes the opencontainers/runc#3233

Used commands from hack folder to generate the new dependency:

```
hack/pin-dependency.sh github.com/opencontainers/runc v1.0.3
hack/update-vendor.sh

hack/lint-dependencies.sh
```
Letme added a commit to Letme/kubernetes that referenced this pull request Feb 11, 2022
fix GetStats for unsupported hugetlb needed to run on RaspberryPi4 with non-hugetlb compiled kernel (standard). This includes the opencontainers/runc#3233

Used commands from hack folder to generate the new dependency:

```
hack/pin-dependency.sh github.com/opencontainers/runc v1.0.3
hack/update-vendor.sh

hack/lint-dependencies.sh
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/cgroupv2 backport/1.0-done A PR in main branch which has been backported to release-1.0 easy-to-review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Missing hugetlb support leads to many log messages
7 participants