Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Apiserver namespace labels #2162

Merged
merged 4 commits into from
Jan 14, 2021

Conversation

jayunit100
Copy link
Member

thx to @rikatz @thockin , ok ! here we go... lets see if we can workaround the issues so far in https://github.com/kubernetes/enhancements/pull/2113/files by adding default labels.

Fixes #2161

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Nov 22, 2020
@k8s-ci-robot k8s-ci-robot added kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Nov 22, 2020
@jayunit100
Copy link
Member Author

/assign @liggitt

@jayunit100
Copy link
Member Author

/assign @thockin

@rikatz rikatz force-pushed the apiserver-namespace-labels branch from e62ee5b to b4dce50 Compare November 22, 2020 20:21
@rikatz
Copy link
Contributor

rikatz commented Nov 22, 2020

Made some suggested corrections.

The forcepush / squash was only because I've made some mess with kep.yaml...

@jayunit100
Copy link
Member Author

jayunit100 commented Nov 22, 2020

cool thanks ricardo. added the table that was deleted back.updated some other stuff as well.

@jayunit100
Copy link
Member Author

@andrewsykim FYI ! here it is :)

Copy link
Member

@thockin thockin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am overall happy - still a number of sections to fill out and boilerplate to remove :)

@jayunit100
Copy link
Member Author

will get boilerplate cleaned up later today, thx again everyone.

@thockin
Copy link
Member

thockin commented Nov 23, 2020

How many years of "why can't we" will be solved by 12 LOC.


If this label is missing, it is added by the apiserver

If the label is incorrect or changed, the apiserver fails to validate the object
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is probably a remote and maybe even unlikely possibility, what happens to namespaces, prior to upgrade, who had same label key(whatever we choose) but a different value.. these namespaces will now be invalidated which previously worked fine.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question -- we own kubernetes.io/ prefixed labels so I don't think we are violating any compatibility rules if a user was using it for some other use-case outside of what Kubernetes is managing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks Andrew. I am aware of this. However we don't validate against it so it remains a possibility. Only reason i raised this point is to clarify the approach that will be taken for such a case.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point -- we should at least write down what the expected behavior for that user would be. It sounds like worse case scenarios is we disallow any future writes to that namespace object.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is why we should have one alpha release where we warn if we see people using it

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ha, I had two thoughts here:

  1. After this feature is enabled, will label selectors work for Namespaces, or will there need to be a read-write cycle on each Namespace object (like with CRDs for version upgrades)?

  2. Would it make sense to use/reserve metadata.kubernetes.io/ as a prefix for this kind of usage (reflecting field values to labels) without any promises of future reflections?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1. After this feature is enabled, will label selectors work for Namespaces, or will there need to be a read-write cycle on each Namespace object (like with CRDs for version upgrades)?

this would work on read, no write cycle required

2. Would it make sense to use/reserve metadata.kubernetes.io/ as a prefix for this kind of usage (reflecting field values to labels) without any promises of future reflections?

I'd tend to keep the metadata.name construct together rather than splitting it

Copy link
Member Author

@jayunit100 jayunit100 Nov 30, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I LOVE idea # 2 because @deads2k gently suggested that, why not reflect other scalar api fields. if we reserve metadata.kubernetes.io, then we actually pave the way for other fields later (without forcing that to be part of this KEP).

I THNK @liggitt is suggesting (1) (add instantly so that its extremely reliable)

Copy link
Member Author

@jayunit100 jayunit100 Nov 30, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

THREAD REBOOT :)

@thockin: why we should have an alpha release, to avoid collisions
@liggit: ok to risk collide but force invariance .... make use of kube.io labels w/o an intermediate release
         strongly force the label value
          - add if absent
          - prepareForCreate/PrepareForUpdate ~ override wrong values
          - Decorate: replace differeing values
          - labels on READ are fine, no need for a stateful versiony one-off update
@thockin is ok with ^, but "Can I select on labels that only are set in Decorator?"
@evan reserve `metadata label` + also wants "add if absent"

ONLY REMAINING QUESTION == WHERE TO DO THE STOMPING (1) defaults.go or (2) decorator

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re name: <something.>kubernetes.io/metadata.name seems best - extends well and maps to field-selector syntax. The sentiment on "<something.>" seems to be "", which I am fine with.

Re alpha: I am not convinced - I still think a cycle of warning is "the polite thing to do". After that, I am OK with unilateral stomping.

Re where: I tested defaults and it seems to work. I tested Decorator and it does NOT seem to work. This actually seems like a problem to me, but not one for today.

@thockin
Copy link
Member

thockin commented Nov 23, 2020 via email

@thockin
Copy link
Member

thockin commented Nov 24, 2020 via email

@jayunit100 jayunit100 force-pushed the apiserver-namespace-labels branch from 0c81b71 to e98756e Compare November 24, 2020 17:23
@robscott
Copy link
Member

Thanks for this KEP @jayunit100! This would be a huge win for service-apis. We've been heavily reliant on selectors and had been thinking we'd eventually have to add support for a list of namespace names in some places to get around the limitations of selectors.

@robscott
Copy link
Member

/cc

@jayunit100
Copy link
Member Author

Thanks mainly to @thockin for really getting this concept figured out :) wouldn't have had any idea to start otherwise.

@jayunit100
Copy link
Member Author

Pushed some boilerplate cleanup

Liggit said he'll check on Wednesday. So we should have some daylight by then to see what happens next @robscott . If anything else to discuss feel free to ping on slack will be around today and tomorrow .

Hope to have this "stable" by end of "week".


* Every namespace gets bigger in size. We think this is inconsequential
* Some user may already be using that label already. This is mitigated by the
fact that the whole "kubernetes.io" label prefix is reserved. However, since there
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Find the link to where we reserved them all. These are pretty clearly "ours" and I remember writing a doc claiming them and a large chuck of namespaces at the same time.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will update the doc with this link https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/ which explicitly calls this out.

@deads2k
Copy link
Contributor

deads2k commented Nov 25, 2020

We should figure out who feels they "own" namespaces logic and who feels like they would want input on this. I'll ping sig-arch

Given that it's the grouping primitive for the kube-apiserver I would expect sig-apimachinery. We've also talked about namespace related topics like initialization.

/cc @lavalamp

namespace/pod combination, the policy implementer does something **much** more
promiscuous.

### Improve NetworkPolicy API and deprecate old fields
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems worth exploring, with a discriminated union, you could extend this selection criteria in the future.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree. For this specific case we thought the more general solution described here would be best since it covers other users of NS selectors.

@deads2k
Copy link
Contributor

deads2k commented Jan 13, 2021

Looking over the proposal, you're trying to add a label selector that could only be used for =, ==, in operations?

I think I could see not in being a differentiator in favor of creating a label like this.

Could you try to reduce the size of motivation to something more end user API focused and less focused on why you think a default label is the right solution? The information there could be in a different section about "why label selector"

Past me (#2162 (comment)) re-convinced current me that this adds useful power. I think clients may have some challenge trying to figure out when it is safe to use this, but obviously that eases over time.

I'm also ok with using defaulters. I may have some interest in forcing the label to be empty or match a release ahead of beta.

@lavalamp I'm ok with this proposal, but it's novel enough to deserve an ack from you as well.

@andrewsykim
Copy link
Member

Past me (#2162 (comment)) re-convinced current me that this adds useful power. I think clients may have some challenge trying to figure out when it is safe to use this, but obviously that eases over time.

I think we've also discussed building a light-weight backfill controller or admission webhook for older clusters that might want this functionality sooner. Not sure it's worth the trouble but definitely an option and there are folks willing to put the effort in.

@lavalamp
Copy link
Member

I'm ok with this proposal, but it's novel enough to deserve an ack from you as well.

Yeah this isn't very pretty but the effort / reward ratio is there, so I'm ok with this.

@jayunit100
Copy link
Member Author

hiya @deads2k , finished addressing your comments. With regard to exploring newer API structures, I didn't really do anything fo that comment because that is being explored in the sig-net-policy subproject already... hope that's ok :) ? otherwise let me know if anything else is needed here

thanks

@deads2k
Copy link
Contributor

deads2k commented Jan 14, 2021

I think we've also discussed building a light-weight backfill controller or admission webhook for older clusters that might want this functionality sooner. Not sure it's worth the trouble but definitely an option and there are folks willing to put the effort in.

A backfill controller won't have the same guarantees that this mechanism provides. In previous releases, it will still be possible to lie.

namespace and thus send traffic to a pod belonging to this namespace. Matching a namespace by its name, on the other
hand, is a more reliable way to whitelist namespaces as it's much easier to specifically allow only namespaces user
control since user knows their own namespace names.
to namespace resource. In case a namespace's labels are known by others, any user with wrote access to a namespace can add labels at any time to said namespace, and thus send traffic to a pod belonging to this namespace.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the kube default policy, is there any user besides a cluster-admin who can do this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We talked to a bunch of folks about this and, whether it's allowed by default or not, a lot of installations DO allow users to mutate their own namespaces. Absent some ACL'ed labels, this represents an unrecoverable gap in things like cross-namespace network policy.

@@ -136,6 +135,7 @@ able to select a single namespace to enforce those controllers. Similarly, it is
users to remember the names of the namespaces rather than labels associated with those namespaces while defining
NetworkPolicies or writing admission controller configurations.

Thus, regardless of wether such a user story is valid, it is clearly 'simpler' to implement (as are other similar stories) when a default mechanism for namespace selection is available.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think validity matters. Things that aren't recommended don't need to be easy.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unclear if this is a comment on the text or a comment on the design? I.e. do you seek changes to address this comment or is it just a remark?

@@ -381,7 +381,7 @@ need for new API semantics in the networkpolicy specification, which is already

For example, you may need a way to declare namespaceAsName selectors, which is distinct from namespace label selectors.
Introducing a new field has its own drawbacks: The obvious API, when combined with a podSelector, decreases security
of a pod - an old client would be more open than users would expect them to be.
of a pod - an old client would be more open than users would expect them to be (in the most obvious implementations). We note that there are workarounds involving special "always off" selectors, which can avoid this scenario, but they come with their own obvious inherent technical debt and costs.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

own obvious inherent technical debt and costs

this is fairly opinionated. I don't know that I see significant technical debt here.

@deads2k
Copy link
Contributor

deads2k commented Jan 14, 2021

I don't agree with the clearly's and obviously's, but I can see value here in specifying not-in values.

/lgtm
/approve

/hold

holding for a squash and to give @thockin another shot at review if he'd like it. if I forget, remind me to release the hold in a week or so.

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 14, 2021
@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 14, 2021
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: deads2k, jayunit100, liggitt, thockin

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 14, 2021

### Feature Enablement and Rollback

1.21:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

alphas are off by default.


* **How can an operator determine if the feature is in use by workloads?**

By running `kubectl get ns --show-labels` and inspecting the `kubernetes.io` values.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this isn't quite right. You'd need to track list/watch requests keying against this name.

@jayunit100
Copy link
Member Author

thanks @deads2k ! is it Ok to keep 3 commits separate since there were different authors (me, Ricardo, abhishek) (and I squash the 4th) ?


* **What are the SLIs (Service Level Indicators) an operator can use to determine**

Since this just an immutable field, we don't expect any such indicators to be of relevance.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a metric adding the number of rejections for invalid use of the label is fairly reasonable. That would be unexpected client behavior even though it ought not happen.

Copy link
Member

@thockin thockin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am good with this as-is. Some small followups and it can go to implementable.

/remove hold

owning-sig: sig-api-machinery
participating-sigs:
- sig-network
status: provisional
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are we missing to move to implementable?

namespace and thus send traffic to a pod belonging to this namespace. Matching a namespace by its name, on the other
hand, is a more reliable way to whitelist namespaces as it's much easier to specifically allow only namespaces user
control since user knows their own namespace names.
to namespace resource. In case a namespace's labels are known by others, any user with wrote access to a namespace can add labels at any time to said namespace, and thus send traffic to a pod belonging to this namespace.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We talked to a bunch of folks about this and, whether it's allowed by default or not, a lot of installations DO allow users to mutate their own namespaces. Absent some ACL'ed labels, this represents an unrecoverable gap in things like cross-namespace network policy.

@@ -136,6 +135,7 @@ able to select a single namespace to enforce those controllers. Similarly, it is
users to remember the names of the namespaces rather than labels associated with those namespaces while defining
NetworkPolicies or writing admission controller configurations.

Thus, regardless of wether such a user story is valid, it is clearly 'simpler' to implement (as are other similar stories) when a default mechanism for namespace selection is available.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unclear if this is a comment on the text or a comment on the design? I.e. do you seek changes to address this comment or is it just a remark?

* Some user may already be using that label already. This is mitigated by the
fact that the whole "kubernetes.io" label prefix is [reserved](https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/#syntax-and-character-set). However, since there
is no validation of reserved label prefixes, an upgrade would mutate such a label (this is not a major concern).
* any clients/applications expecting this label (i.e. for a NetworkPolicy), wouldnt see it, and thus they might not be able to find namespaces using this label technique. In the case of network policies, this would result in traffic being blocked due to unavailable namespace labels.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line doesn't make sens to me @jayunit100 - can you clarify, perhaps as followup?

is no validation of reserved label prefixes, an upgrade would mutate such a label (this is not a major concern).
* any clients/applications expecting this label (i.e. for a NetworkPolicy), wouldnt see it, and thus they might not be able to find namespaces using this label technique. In the case of network policies, this would result in traffic being blocked due to unavailable namespace labels.

For this reason, we can do one alpha release where we log/event if we observe usage of this key that is not compatible.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we going to do an alpha or not? I thought not.


This can be implemented by modifying the defaults.go and validation.go components for the namespace API, i.e., in the defaults.go file for api/core:

<<[UNRESOLVED See https://github.com/kubernetes/kubernetes/pull/96968 where this evolves, but this example is functionally valid, modulo warnings ]>>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding of the discussion is that this is resolved. Setting the value should be unconditional. It's our label. No need to preserve the value.

The only open question is whether to do it silently in defaults or with a warning in strategy. The warning seems appropriate to me, but we'll need to cover all create/update/get paths, which is slightly more complicated than defaults.go

If we don't do the warning, we can do it in defaults and just stomp the value.

I'm advocating for the warning, but only weakly.


1.22:
- gate enabled by default
- remove gate
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

strictly speaking this would lock the gate, and some releases later we would remove it.

namespace/pod combination, the policy implementer does something **much** more
promiscuous.

### Improve NetworkPolicy API and deprecate old fields
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree. For this specific case we thought the more general solution described here would be best since it covers other users of NS selectors.

@thockin thockin removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 14, 2021
replaces:
stage: alpha
latest-milestone: "v1.21"
milestone:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this needs to be updated

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory lgtm "Looks good to me", indicates that a PR is ready to be merged. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Automatic "metadata.name" labels for all Namespaces