Investigate the use of CRD sub-resources. #643

Closed
mattmoor opened this issue Apr 13, 2018 · 19 comments

@mattmoor
Member

IIRC K8s 1.10 added support for "status" sub-resources in CRDs.

There are a variety of places this is useful to us, including updateStatus in each of the controllers, and (likely) the validation logic in the webhook.

Let's scout this functionality in M4 and adopt it if useful.

See also this issue, which tracks a 1.10 update.
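
For context on where this helps: once the status subresource is enabled, status updates go through a dedicated /status endpoint, so updateStatus in a controller can no longer 'accidentally' rewrite spec. A minimal sketch, assuming a code-generated clientset with an UpdateStatus method (type names and import paths are illustrative, not the exact serving code):

```go
package example

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

	"github.com/knative/serving/pkg/apis/serving/v1alpha1"
	clientset "github.com/knative/serving/pkg/client/clientset/versioned"
)

// updateRevisionStatus writes only the status block of a Revision. With the
// status subresource enabled, UpdateStatus goes to the /status endpoint, and
// the API server ignores any spec changes carried along in the same object.
func updateRevisionStatus(client clientset.Interface, desired *v1alpha1.Revision) (*v1alpha1.Revision, error) {
	// Re-fetch the stored object so we don't stomp on concurrent writers.
	existing, err := client.ServingV1alpha1().Revisions(desired.Namespace).Get(desired.Name, metav1.GetOptions{})
	if err != nil {
		return nil, err
	}
	// Copy over only the status we computed; spec stays whatever is stored.
	existing.Status = desired.Status
	return client.ServingV1alpha1().Revisions(existing.Namespace).UpdateStatus(existing)
}
```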

@mattmoor mattmoor added kind/feature Well-understood/specified features, ready for coding. area/API API objects and controllers labels Apr 13, 2018
@mattmoor mattmoor added this to the M4 milestone Apr 13, 2018
@grantr
Contributor

grantr commented Apr 13, 2018

I also think the Scale sub-resource would be a useful addition to revisions as a standard way for autoscalers to effect scaling. This gives the revision controller the option of modifying or ignoring the request, unlike the alternative of autoscaling the deployment directly.

@mattmoor
Member Author

Yes, perhaps, although I'm not sure the form it takes is appropriate for us.

I think the payload is a single integer (a foggy recollection of things people have told me), which is the kind of thing we have been avoiding in the spec: it feels like "how many servers" when we're after "serverless".
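
For reference, the object exchanged on a /scale subresource is autoscaling/v1 Scale, and its spec really is just a replica count. A minimal sketch of what an autoscaler would send (the resource name is made up):

```go
package main

import (
	"fmt"

	autoscalingv1 "k8s.io/api/autoscaling/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func main() {
	// Spec carries the single desired replica count; Status echoes the
	// observed count plus a label selector for the scaled pods.
	scale := autoscalingv1.Scale{
		ObjectMeta: metav1.ObjectMeta{Namespace: "default", Name: "my-revision"},
		Spec:       autoscalingv1.ScaleSpec{Replicas: 3},
	}
	fmt.Println("desired replicas:", scale.Spec.Replicas)
}
```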

That said, I could see this being useful to force Reserve -> Active with some minimum "concurrent request" capacity in anticipation of load spikes beyond our capacity to hyperscale up (e.g. if more cluster capacity is needed).

cc @josephburnett for this discussion, but let's track that with a separate issue since it involves a bit of design and specification.

@grantr
Contributor

grantr commented Apr 13, 2018

It's less about users and more about implementors of autoscaling strategies. But you're right, off-topic, plus @josephburnett is out until Tuesday and I don't want to start any official discussions without him. 😃 This document in the team drive has more details if anyone is interested.

@mattmoor
Member Author

/assign @grantr

@mattmoor mattmoor modified the milestones: M4, M5 Apr 18, 2018
@mattmoor
Member Author

Moving to M5, since the GKE 1.10 alpha clusters have been showing some issues. I don't think the importance of this has diminished, but I'm less optimistic that we'll be able to make the switch to 1.10 smoothly in M4.

@mattmoor
Member Author

/assign @dprotaso @bsnchan

Since they have a PR. I believe this is blocked on 1.10, which is coming in hot for M5 (it is GA in GKE, but some of our monitoring stuff has issues with it). I'm not particularly optimistic that this will land.

@mattmoor mattmoor modified the milestones: M5, M6 May 24, 2018
@sslavic

sslavic commented Jun 28, 2018

If I understood correctly, CRD subresources graduated to beta in k8s 1.11: https://kubernetes.io/blog/2018/06/27/kubernetes-1.11-release-announcement/

@mattmoor
Member Author

mattmoor commented Nov 6, 2018

@dprotaso @bsnchan This should be unblocked. GKE has 1.11, so we can actually test with sub-resources 🎉

@dprotaso
Member

dprotaso commented Jan 2, 2019

Here's a backwards-compatible strategy for adopting the CRD status subresource, with the aim of eventually dropping the generation bumping that currently happens in the webhook.

Phase 1 - Serving 0.3

Changes

Enable status subresource on all CRDs

  • This causes the API server to start bumping a resource's metadata.generation (this is actually the default in K8s 1.13, I believe).

  • Our controllers can and will now use the /status endpoint. This prevents them from 'accidentally' updating the reconciled object's spec. A minimal sketch of the CRD change follows below.
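
Roughly what enabling the subresource looks like per CRD, expressed with the apiextensions v1beta1 Go types (the names here are illustrative, not our exact manifests):

```go
package example

import (
	apiextensionsv1beta1 "k8s.io/apiextensions-apiserver/pkg/apis/apiextensions/v1beta1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// revisionCRD sketches a CRD with the status subresource enabled. Once
// Subresources.Status is set, the API server serves .../revisions/<name>/status,
// bumps metadata.generation on spec changes, and ignores status on plain updates.
var revisionCRD = apiextensionsv1beta1.CustomResourceDefinition{
	ObjectMeta: metav1.ObjectMeta{Name: "revisions.serving.knative.dev"},
	Spec: apiextensionsv1beta1.CustomResourceDefinitionSpec{
		Group:   "serving.knative.dev",
		Version: "v1alpha1",
		Scope:   apiextensionsv1beta1.NamespaceScoped,
		Names: apiextensionsv1beta1.CustomResourceDefinitionNames{
			Plural: "revisions",
			Kind:   "Revision",
		},
		Subresources: &apiextensionsv1beta1.CustomResourceSubresources{
			Status: &apiextensionsv1beta1.CustomResourceSubresourceStatus{},
		},
	},
}
```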

Drop our functional dependency on spec.generation

  • Don't depend on it for garbage collection (Don't use spec.generation when garbage collecting revisions #2765)
  • Mark functions that use spec.generation as deprecated, e.g. the Configuration controller's resourcenames.Revision
  • The configuration controller now adds a revision label based on the Configuration's metadata.generation
  • The configuration controller uses this new label to determine the 'LatestCreatedRevision' (see the lookup sketch after this list)
    • Fall back to querying by the revision name (based on spec.generation)
    • When falling back, add the new label based on the Configuration's metadata.generation (this is essentially a migration of the latest created revision)
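
A rough sketch of that lookup order (the label key, helper names, and import paths are assumptions for illustration, not the exact serving code):

```go
package example

import (
	"strconv"

	"k8s.io/apimachinery/pkg/labels"

	"github.com/knative/serving/pkg/apis/serving/v1alpha1"
	listers "github.com/knative/serving/pkg/client/listers/serving/v1alpha1"
)

// configGenerationLabel stands in for the real label key in the serving API package.
const configGenerationLabel = "serving.knative.dev/configurationMetadataGeneration"

// latestCreatedRevision prefers the new label keyed off the Configuration's
// metadata.generation and falls back to the old spec.generation-derived name.
func latestCreatedRevision(lister listers.RevisionLister, config *v1alpha1.Configuration, legacyName string) (*v1alpha1.Revision, error) {
	// 1. Look the revision up via the new label.
	selector := labels.SelectorFromSet(labels.Set{
		configGenerationLabel: strconv.FormatInt(config.ObjectMeta.Generation, 10),
	})
	revs, err := lister.Revisions(config.Namespace).List(selector)
	if err != nil {
		return nil, err
	}
	if len(revs) > 0 {
		return revs[0], nil
	}
	// 2. Fall back to the old name (derived from spec.generation); the caller
	// then applies the new label, effectively migrating the latest created revision.
	return lister.Revisions(config.Namespace).Get(legacyName)
}
```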

Implications

  • Adding a new label means we are now modifying a subset of revisions. This requires a change in the revision reconciler, since all labels were previously propagated to the revision's deployment selector.
  • Operators will need to reconcile all Configurations prior to upgrading to the next release (i.e. 0.4).

Phase 2 - Serving 0.4

Changes

Fully drop our dependency on spec.generation

  • Remove Webhook logic for incrementing Spec.Generation
  • Change the naming strategy of revisions & builds
    • Delete the old naming-scheme code

Remove Spec.Generation from the API (under consideration)

  • This was never really a user-facing detail that could be controlled
  • We should ensure that if it appears in a spec it's a no-op

Change the label applied to revisions from configurationMetadataGeneration back to configurationGeneration

  • configurationGeneration became 'unavailable' during 0.3 but should now be reclaimable
  • This would require falling back to configurationMetadataGeneration and performing another migration

Implications

  • Changing the revision naming scheme may be confusing for users

Phase 3 - Serving 0.5

Changes

Drop functional dependency on configurationMetadataGeneration

Phase 4 - Serving 0.6

Changes

No longer apply the configurationMetadataGeneration label to revisions

Phase X - when conversion webhooks are beta

Changes

  • Should be able to drop Spec.Generation in the API officially by creating a new CRD version
  • Webhooks drop Spec.Generation when we upgrade the CRD to the new version

Implications

  • Technically, 'round-trippability' isn't possible if we drop the property, unless in Phase 2 we have the webhook force spec.generation to always be 0 (sketched below).
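
A sketch of that last option, using placeholder types since the real field may end up renamed or dropped:

```go
package example

// Placeholder shapes standing in for the real serving types, just to
// illustrate the idea; they are not the actual API structs.
type configurationSpec struct {
	Generation int64 // deprecated: metadata.generation is the source of truth
}

type configuration struct {
	Spec configurationSpec
}

// zeroDeprecatedGeneration is what a defaulting webhook could do in Phase 2:
// pin the deprecated field to a constant so any value a client sends is a
// no-op, while still keeping the property around for round-tripping.
func zeroDeprecatedGeneration(cfg *configuration) {
	cfg.Spec.Generation = 0
}
```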

dprotaso added a commit to dprotaso/pkg that referenced this issue Jan 21, 2019
With Kubernetes 1.11+ metadata.generation now increments properly
when the status subresource is enabled on CRDs

For more details see: knative/serving#643
dprotaso added a commit to dprotaso/pkg that referenced this issue Feb 14, 2019
With Kubernetes 1.11+ metadata.generation now increments properly
when the status subresource is enabled on CRDs

For more details see: knative/serving#643
knative-prow-robot pushed a commit to knative/pkg that referenced this issue Feb 14, 2019
* Drop webhook logic to increment spec.generation

With Kubernetes 1.11+ metadata.generation now increments properly
when the status subresource is enabled on CRDs

For more details see: knative/serving#643

* Drop the generational duck type
@mattmoor
Member Author

0.4 pieces have landed, moving out.

@mattmoor mattmoor modified the milestones: Serving 0.4, Serving 0.5 Feb 14, 2019
joshrider added a commit to joshrider/serving that referenced this issue Feb 26, 2019
Addresses knative#643

In 0.4, Revisions have both `/configurationMetadataGeneration` and
`/configurationGeneration` labels with a value equal to the
Configuration's metadata.generation.

This commit has the Configuration's reconciler use
`/configurationGeneration` when looking up the latest created
revision

We should be able to drop the use of `/configurationMetadataGeneration`
in 0.6
joshrider added a commit to joshrider/serving that referenced this issue Feb 27, 2019
Addresses knative#643

In 0.4, Revisions' generation labels were migrated toward using the
`/configurationGeneration` label to determine a Configuration's
latest created Revision.

This commit leaves the deprecated `/configurationMetadataGeneration`
label intact to allow rollback to 0.4, but removes the migration
of the deprecated label, along with the deletion of an annotation
previously used for the same purpose.

Functional dependency on `/configurationMetadataGeneration` is dropped
in knative#3325.
@mattmoor
Member Author

/milestone Serving 0.6
PRs for 0.5 scope are in 0.5 now, so moving this to 0.6 to track remaining work.

knative-prow-robot pushed a commit that referenced this issue Mar 20, 2019
…#3325)

Addresses #643

In 0.4, Revisions have both `/configurationMetadataGeneration` and
`/configurationGeneration` labels with a value equal to the
Configuration's metadata.generation.

This commit has the Configuration's reconciler use
`/configurationGeneration` when looking up the latest created
revision

We should be able to drop the use of `/configurationMetadataGeneration`
in 0.6
knative-prow-robot pushed a commit that referenced this issue Mar 25, 2019
Addresses #643

In 0.4, Revisions' generation labels were migrated toward using the
`/configurationGeneration` label to determine a Configuration's
latest created Revision.

This commit leaves the deprecated `/configurationMetadataGeneration`
label intact to allow rollback to 0.4, but removes the migration
of the deprecated label, along with the deletion of an annotation
previously used for the same purpose.

Functional dependency on `/configurationMetadataGeneration` is dropped
in #3325.
@mattmoor
Member Author

@dprotaso Do you plan to do the 0.6 portion of this?

@dprotaso
Member

@dprotaso Do you plan to do the 0.6 portion of this?

Oh yeah

@dprotaso
Member

dprotaso commented May 6, 2019

/close

@knative-prow-robot
Contributor

@dprotaso: Closing this issue.

In response to this:

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@dprotaso
Member

dprotaso commented May 6, 2019

it is done.

nak3 added a commit to nak3/serving that referenced this issue Jan 25, 2021
nak3 added a commit to nak3/serving that referenced this issue Jan 25, 2021
This is not mandatory, but Go 1.15 has been available in CI for a
while, so this bumps golang to 1.15.
nak3 added a commit to nak3/serving that referenced this issue Jan 25, 2021