Skip to content
This repository has been archived by the owner on Jul 30, 2021. It is now read-only.

hack/quickstart: enable setting kubelet cloud provider separately from controller manager #804

Closed
wants to merge 1 commit into from

Conversation

klausenbusk
Copy link
Contributor

Running a cloud-controller-manager require kubelet to be started
with --cloud-provider=external and kube-apiserver and
kube-controller-manager without the --cloud-provider flag or a
empty string [1].
This commit make it possible to specific --cloud-provider for
kubelet without specifying a cloud-provider for kube-apiserver
and kube-controller-manager with the new env option
CLOUD_PROVIDER_KUBELET, the option use CLOUD_PROVIDER as fallback
if not set.

[1] https://kubernetes.io/docs/tasks/administer-cluster/running-cloud-controller/#running-cloud-controller-manager


Not tested yet.

As this is the future, I think it make sense to add a extra option.

@coreosbot
Copy link

Can one of the admins verify this patch?

3 similar comments
@coreosbot
Copy link

Can one of the admins verify this patch?

@coreosbot
Copy link

Can one of the admins verify this patch?

@coreosbot
Copy link

Can one of the admins verify this patch?

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Dec 13, 2017
@colemickens
Copy link
Contributor

Doesn't the flag need to be updated on KCM (and apiserver, though it's not used for anything) as well?

@klausenbusk
Copy link
Contributor Author

Doesn't the flag need to be updated on KCM (and apiserver, though it's not used for anything) as well?

Not per the doc: https://kubernetes.io/docs/tasks/administer-cluster/running-cloud-controller/#running-cloud-controller-manager

kube-apiserver and kube-controller-manager MUST NOT specify the --cloud-provider flag. This ensures that it does not run any cloud specific loops that would be run by cloud controller manager. In the future, this flag will be deprecated and removed.
kubelet must run with --cloud-provider=external. This is to ensure that the kubelet is aware that it must be initialized by the cloud controller manager before it is scheduled any work.
kube-apiserver SHOULD NOT run the PersistentVolumeLabel admission controller since the cloud controller manager takes over labeling persistent volumes. To prevent the PersistentVolumeLabel admission plugin from running, make sure the kube-apiserver has a --admission-control flag with a value that does not include PersistentVolumeLabel.
For the cloud-controller-manager to label persistent volumes, initializers will need to be enabled and an InitializerConifguration needs to be added to the system. Follow these instructions to enable initializers. Use the following YAML to create the InitializerConfiguration:

@colemickens
Copy link
Contributor

Very interesting. I've seen it working with apiserver and kcm specifying external but the docs are clear and the other usage of CCM strips off the --cloud-provider flags from KCM and apiserver, which also supports what you're saying.

Let me touch base with the sig to make sure this text is still accurate. For some reason I thought there was discussion of using --cloud-provider=external on KCM to change the behavior of volumes, but that might require another flag to get KCM to do external mode, but still provide azure|gce|aws disk attach support.

I guess you're not stripping the flag, so it's expected that the user then set export CLOUD_PROVIDER= and export CLOUD_PROVIDER_KUBELET=external ?

@klausenbusk
Copy link
Contributor Author

I guess you're not stripping the flag, so it's expected that the user then set export CLOUD_PROVIDER= and export CLOUD_PROVIDER_KUBELET=external?

CLOUD_PROVIDER default to "", so setting CLOUD_PROVIDER_KUBELET should be enough.

@klausenbusk
Copy link
Contributor Author

I've seen it working with apiserver and kcm specifying external

The text was changed in 1.8 I think, please keep me updated I'm also a bit confused.

@colemickens
Copy link
Contributor

Looks like the PR I'm thinking of is here: kubernetes/kubernetes#52371

For now, let's go ahead and leave this as-is. When this PR merges and 1.10 is out, we can revisit and activate the extra flag to keep volumes working until CCM+CSI is ready and the flag is deprecated again.

Source for the info, @dims, from Slack. Thanks @dims! For historical sake:


colemickens [1:55 PM]
is assuming the docs are still correct that kcm/apiserver are to be run with no --cloud-provider flag.

colemickens
[1:55 PM]
I think there was discussion of changing KCM so it could run controller loops, but IIRC that would be via a new flag and isn't there yet.

dims [2:18 PM]
@colemickens this one is merging. was that the one you were thinking of? kubernetes/kubernetes#52371

colemickens [2:23 PM]
Yes, that's exactly what I'm looking for. We have a PR in bootkube that doesn't set the --cloud-provider flag on KCM and I was thinking we wanted to set it to external so that we could eventually utilize this.

[2:23]
However there is text that says to explicitly NOT set --cloud-provider on apiserver/kcm today.

[2:23]
So that will probably need to get updated once that PR merges. (edited)

dims [2:24 PM]
@colemickens right. you should only set --cloud-provider=external if you really know you are using / need it. else don’t set it

@colemickens
Copy link
Contributor

coreosbot test this please

Copy link
Contributor

@ericchiang ericchiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@ericchiang
Copy link
Contributor

coreosbot test this please

@coreosbot
Copy link

Can one of the admins verify this patch?

@dghubble
Copy link
Contributor

Has anyone set this up? Where are the materials on it? It's still listed as untested. Is the understanding that this flag is to enable experimenting with CCM?

@ericchiang
Copy link
Contributor

Has anyone set this up? Where are the materials on it? It's still listed as untested. Is the understanding that this flag is to enable experimenting with CCM?

CCM is still experimental, I don't think we would accept a PR to add it to bootkube yet.

This PR just lets you set the kubelet cloud controller separately from the controller manager. It seems fine in that respect.

@dghubble
Copy link
Contributor

Let's not have a title like "Support CCM (Cloud Controller Manager)" then.

@dghubble
Copy link
Contributor

Otherwise, no complaints with allowing it in the hack scripts to facilitate experiments / adoption.

@colemickens colemickens changed the title hack/quickstart: Support CCM (Cloud Controller Manager) hack/quickstart: kubelet can specify a different cloud-provider (allow CCM testing) Jan 22, 2018
@ericchiang ericchiang changed the title hack/quickstart: kubelet can specify a different cloud-provider (allow CCM testing) hack/quickstart: enable setting kubelet cloud provider separately from controller manager Jan 22, 2018
@colemickens
Copy link
Contributor

colemickens commented Jan 22, 2018

Updated the PR title. Will amend the commit and push as well. (edit: Got confused, thought this was my CCM PR, but it's not. I had similar changes for tectonic-installer...)

@klausenbusk
Copy link
Contributor Author

So I just did a quick testing (after rebasing) of this and it isn't working. When kubelet is started with --cloud-provider=external, status.podIP is set to `` and bootstrap-apiserver then can't start.

…m controller manager

Running a cloud-controller-manager require kubelet to be started
with `--cloud-provider=external` and `kube-apiserver` and
`kube-controller-manager` without the `--cloud-provider` flag or a
empty string [1].
This commit make it possible to specific `--cloud-provider` for
`kubelet` without specifying a cloud-provider for `kube-apiserver`
and `kube-controller-manager` with the new env option
CLOUD_PROVIDER_KUBELET, the option use CLOUD_PROVIDER as fallback
if not set.

[1] https://kubernetes.io/docs/tasks/administer-cluster/running-cloud-controller/#running-cloud-controller-manager
@klausenbusk
Copy link
Contributor Author

klausenbusk commented Jan 23, 2018

So I just did a quick testing (after rebasing) of this and it isn't working. When kubelet is started with --cloud-provider=external, status.podIP is set to `` and bootstrap-apiserver then can't start.

Someone has already reported this upstream: kubernetes/kubernetes#54965 (but the "bug" was later closed), so at the moment it seems we end up in a chicken-egg situation no matter what we do.

@klausenbusk
Copy link
Contributor Author

so at the moment it seems we end up in a chicken-egg situation no matter what we do.

The help page seems to be incorrect at the moment (a --advertise-address blank should work), I have opened a issue upstream: kubernetes/kubernetes#58686 .

Let set this on hold until then.
/hold

@klausenbusk
Copy link
Contributor Author

Let set this on hold until then.
/hold

Upstream is gonna fix the comment not the behavior.

I'm not sure how we can fix this chicken-egg situation then, --advertise-address was added for a reason.

@colemickens
Copy link
Contributor

@klausenbusk Can you please elaborate? There's confusion about a couple of things:

  • From my reading and understanding, it sounds like --advertise-address= (as in, specified, but blank) does "work", but defaults to all interfaces. (Which is undesirable in bootkube for the reason you linked)

  • Why does CCM/external cloud provider affect this? You're saying that disabling the cloudprovider on kubelet causes the podIP to be empty? Do you know why?

@colemickens
Copy link
Contributor

Also, have you filed an upstream bug for the CCM/podIP behavior, or this general issue? I presume others would be affected by this, if it's a fundamental problem, especially as kubeadm matures and embraces self-hosting.

@klausenbusk
Copy link
Contributor Author

From my reading and understanding, it sounds like --advertise-address= (as in, specified, but blank) does "work", but defaults to all interfaces. (Which is undesirable in bootkube for the reason you linked)

The help text is wrong, a blank --advertise-address= doesn't work (Error: invalid argument "" for --bind-address=: failed to parse IP: ""), upstream is gonna update the help text: kubernetes/kubernetes#58686 (comment)

Why does CCM/external cloud provider affect this? You're saying that disabling the cloudprovider on kubelet causes the podIP to be empty? Do you know why?

It is the CCM job to run the Node controller, which I assume (?) set the hostIP (which is then used by podIP), so until the CCM has initialized the node hostIP is "blank".

Also, have you filed an upstream bug for the CCM/podIP behavior, or this general issue? I presume others would be affected by this, if it's a fundamental problem, especially as kubeadm matures and embraces self-hosting.

See: kubernetes/kubernetes#55633 and kubernetes/kubernetes#50545

@colemickens
Copy link
Contributor

I guess I'm still a bit confused because I had CCM working with bootkube in a tectonic-installer WIP branch, and I know the Azure folks are having successes running CCM as well. Maybe you or I could try to look at it and see where the diff is.

I don't really understand why hostIP needs to be known to know the podIP. The pod and host network are often different anyway (overlay/underlay/physical/etc).

@klausenbusk
Copy link
Contributor Author

I guess I'm still a bit confused because I had CCM working with bootkube in a tectonic-installer WIP branch, and I know the Azure folks are having successes running CCM as well. Maybe you or I could try to look at it and see where the diff is.

Firstly, they don't seems to be using TLS bootstrapping, which isn't supported at the moment with a CCM (kubernetes/kubernetes#55633).

I don't really understand why hostIP needs to be known to know the podIP.

When the pod is started with hostNetwork: true it use hostIP as podIP.

@aaronlevy
Copy link
Contributor

Any updates on this? I'm also fine with the option to specify kubelet cloud provider separately in the hack scripts - but not sure if there were still some outstanding issues.

@coreosbot
Copy link

Can one of the admins verify this patch?

1 similar comment
@coreosbot
Copy link

Can one of the admins verify this patch?

@klausenbusk
Copy link
Contributor Author

  • but not sure if there were still some outstanding issues.

I think it should work now with kubernetes/kubernetes#65594 merged, but I'm not sure and nevertheless we need to wait for 1.12.

@fejta-bot
Copy link

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 25, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants