Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PodSpec.hostAliases has a potentially misleading comment about hostNetwork #122420

Closed
neolit123 opened this issue Dec 20, 2023 · 10 comments · Fixed by #122422
Closed

PodSpec.hostAliases has a potentially misleading comment about hostNetwork #122420

neolit123 opened this issue Dec 20, 2023 · 10 comments · Fixed by #122422
Labels
kind/bug Categorizes issue or PR as related to a bug. kind/documentation Categorizes issue or PR as related to documentation. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. sig/network Categorizes an issue or PR as relevant to SIG Network. sig/node Categorizes an issue or PR as relevant to SIG Node. triage/accepted Indicates an issue or PR is ready to be actively worked on.

Comments

@neolit123
Copy link
Member

neolit123 commented Dec 20, 2023

What happened?

there is a note above hostAliases that this feature does not work with hostNework=true, yet it seems it does:
https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/api/core/v1/types.go#L3672-L3679

This is only valid for non-hostNetwork pods.

when testing with a hostNetwork pod + hostAliases, there are no errors in kubelet logs at --v=10 related to this, in fact i don't see any hostAlias entries at all in there.

i could not find anything blocking this in the kubelet, in fact there is normal handling of hostNetwork:
https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/kubelet_pods.go#L330

i could not find any validation performed on the hostAliases field of PodSpec if hostNetwork is true on the API level.

func ValidateHostAliases(hostAliases []core.HostAlias, fldPath *field.Path) field.ErrorList {

i could not find any unit tests or e2e tests related to this.

What did you expect to happen?

  • the godoc should contain valid information
  • if the feature is not supposed to work, it must be patched to not have the entries in the hosts file (perhaps?)
  • we should have tests that ensure this type of requirement is guaranteed

How can we reproduce it (as minimally and precisely as possible)?

these steps confirm that a static pod with hostNework (deployed by kubeadm) does make use of hostAliases.

  • create a kubeadm single node cluster directly with kubeadm init or one with kind create cluster
  • modify the /etc/kubernetes/manifests/kube-apiserver.yaml to include:
  hostAliases:
  - hostnames:
    - kubernetes
    ip: 127.0.0.1
  • wait some time for the new static pod spec to resync
  • find the PID of the kube-apiserver
ps -ax | grep kube-apiserver
  • get the hosts file
cat /proc/<pid>/root/etc/hosts
# Kubernetes-managed hosts file (host network).
127.0.0.1	localhost
127.0.1.1	lubo-it

# The following lines are desirable for IPv6 capable hosts
::1     ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters

# Entries added by HostAliases.
127.0.0.1	kubernetes
$ sudo cat /etc/kubernetes/manifests/kube-apiserver.yaml
...
  hostNetwork: true
  hostAliases:
  - hostnames:
    - kubernetes
    ip: 127.0.0.1
...

Anything else we need to know?

n/a

Kubernetes version

latest master 1.30.0-alpha.0

Cloud provider

not applicable

OS version

$ uname -a
Linux lubo-it 5.8.0-38-generic #43~20.04.1-Ubuntu SMP Tue Jan 12 16:39:47 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

$ cat /etc/os-release
PRETTY_NAME="Ubuntu 22.04.2 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.2 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy

Install tools

kubeadm

Container runtime (CRI) and version (if applicable)

containerd github.com/containerd/containerd v1.6.14 9ba4b250366a5ddde94bb7c9d1def331423aa323

Related plugins (CNI, CSI, ...) and versions (if applicable)

kindnet


Changes

@neolit123 neolit123 added the kind/bug Categorizes issue or PR as related to a bug. label Dec 20, 2023
@k8s-ci-robot k8s-ci-robot added needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Dec 20, 2023
@neolit123
Copy link
Member Author

/sig node network cluster-lifecycle
/kind documentation

@aojea @chrischdi

@k8s-ci-robot k8s-ci-robot added sig/node Categorizes an issue or PR as relevant to SIG Node. sig/network Categorizes an issue or PR as relevant to SIG Network. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. kind/documentation Categorizes issue or PR as related to documentation. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Dec 20, 2023
@neolit123 neolit123 changed the title PodSpec.hostAliases contain a potentially misleading comment about hostNetwork PodSpec.hostAliases has a potentially misleading comment about hostNetwork Dec 20, 2023
@neolit123
Copy link
Member Author

as reported by @chrischdi the hostAliases sometimes are not mounted for hostNetwork static pods, which is concerning:
kube-vip/kube-vip#692
related slack thread:
https://kubernetes.slack.com/archives/C09QYUH5W/p1703068817039879

@aojea
Copy link
Member

aojea commented Dec 20, 2023

At minimum the comment is wrong I need to be updated and we need e2e tests :/

@neolit123
Copy link
Member Author

found this, support was added a while back:
#50646

i will PR the API docs.

@chrischdi
Copy link
Member

chrischdi commented Dec 20, 2023

We identified an edge case here with Kubernetes v1.29.

If a cloud provider did not yet initialise the node's .status.addresses, the code for creating the /etc/hosts file including the hostAliases does not get run.

To be concrete, in:

mountEtcHostsFile := shouldMountHostsFile(pod, podIPs)

podIPs is empty in this case which leads to mountEtcHostsFile to be false.

I was able to trace it back to this change which leads to this behaviour for static pods:

#121028

For confirmation: I reverted the above commit, rebuilt kubelet and it got back to the old behaviour.

@aojea
Copy link
Member

aojea commented Dec 20, 2023

For confirmation: I reverted the above commit, rebuilt kubelet and it got back to the old behaviour.

This behavior has undesired consequences if the cloud-provider addresses are different than the original ones, specially for hostNetwork pods, that inherit these addresses from the Node.

Since some cloud-providers depend on this behavior, in order to keep backward compatibility, assume that the specifying addresses via the node-ip flags means that the intent is to keep the existing behavior to temporary initialize the addresses.

there is no perfect solution, but reverting is wrong because opens another bug, we need to study this more carefully but in the meantime installers that know the nodes addresses beforehand should set the --node-ip flag or not use cloud-provider external

@neolit123
Copy link
Member Author

yeah, as i mentioned downstream, revering #121028 is out of the question. it fixes some long standing problems.

we might have to document somewhere this limitation in a cloud provider env.

@neolit123
Copy link
Member Author

podIPs is empty in this case which leads to mountEtcHostsFile to be false.

i think it might be possible for us to special case this so that the hostAlias management does not wait for the node IP in a CP env, under certain conditions. just a speculation.

@neolit123
Copy link
Member Author

PRs

add test:

update godoc:

@SergeyKanzhelev
Copy link
Member

/triage accepted

looks like it is being worked on already

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jan 3, 2024
@kannon92 kannon92 moved this to Done in SIG Node Bugs Jul 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Categorizes issue or PR as related to a bug. kind/documentation Categorizes issue or PR as related to documentation. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. sig/network Categorizes an issue or PR as relevant to SIG Network. sig/node Categorizes an issue or PR as relevant to SIG Node. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

5 participants