-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add and use stream metadata for RHCOS, add openshift-install coreos print-stream-json #4582
Add and use stream metadata for RHCOS, add openshift-install coreos print-stream-json #4582
Conversation
Skipping CI for Draft Pull Request. |
If we still have the OKD fork in 4.8, we'll want this there as well. |
a36299c
to
7c3c28e
Compare
OK rebased 🏄 though only compile tested so far. Let's run this for some platforms: |
Picking this back up, but it will need coreos/coreos-assembler#2052 |
7c3c28e
to
f0b0d1d
Compare
f0b0d1d
to
9007dc1
Compare
/test e2e-aws |
9007dc1
to
65e1b4d
Compare
/test e2e-aws |
OK cool! e2e-libvirt passed, and the other two just failed on some sort of Prometheus flake/issue that is very unlikely related to this. Going to work on cleaning this up and then lift the draft soon. |
65e1b4d
to
73ef6db
Compare
OK lifting draft! |
cdaa105
to
0889bd1
Compare
pkg/asset/cluster/tfvars.go
Outdated
} | ||
data, err := gcptfvars.TFVars( | ||
gcptfvars.TFVarsSources{ | ||
Auth: auth, | ||
MasterConfigs: masterConfigs, | ||
WorkerConfigs: workerConfigs, | ||
ImageURI: imageRaw, | ||
ImageURI: gcpimage, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any difference between this image URI and the one from the rhcos image asset any more? This code used to get the "raw image" but now gets the same image as the rhcos image asset does. Is that intentional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK right, I should have looked at this more closely. Some quick digging with git annotate points to
1f2d071
which is related to the special licenses dance. But in fact, we enable the nested virt license on the public RHCOS image by default since 4.6, which was the main motivation to support license overrides introduced in #3430
So...we could do a prep commit which cleans this up and drops the support from the installer for overriding licenses?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Random aside, bits of CoreOS CI testing actually relies on a nested KVM setup in our own Prow instance)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A prep commit would be nice so as not to lose the nuance of this change.
I am not very familiar with how the licenses work. Are you claiming that we do not need to support user-supplied licenses any more? If we switch away from using the raw image, we will lose support for user-supplied licenses?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is what I was afraid of. We cannot remove support for licenses. It is a breaking change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you claiming that we do not need to support user-supplied licenses any more?
I am 93.2% sure there is no use case for other licenses beyond enable-vmx
(which again is the default now), I looked and couldn't even find references to this. I tried out the "license list" API linked there:
https://cloud.google.com/compute/docs/reference/rest/v1/licenses/list?apix_params=%7B%22project%22%3A%22openshift-gce-devel%22%2C%22returnPartialSuccess%22%3Atrue%7D
and I just got the empty set.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK I added a hack for this. I'd still argue we can drop the license stuff, but if we decide it all needs to stay as is I can try plumbing through the image URL into the stream metadata.
Came up in review of openshift/installer#4582 (comment) Since error prefixing will use `:`, let's avoid ambiguity. It's going to be generally obvious from context what a stream name and architecture are (particularly the architecture, everyone knows `x86_64` and `aarch64` if they're provisioning a machine for them).
Now merges the changes from #4778 |
/retest |
I assume that this is going to need a rebase now that #4778 has merged. |
This is a new Go library for parsing stream metadata that will be shared by FCOS and RHCOS.
This implements part of the plan from: openshift/os#477 When we originally added the pinned RHCOS metadata `rhcos.json` to the installer, we also changed the coreos-assembler `meta.json` format into an arbitrary new format in the name of some cleanups. In retrospect, this was a big mistake because we now have two formats. Then Fedora CoreOS appeared and added streams JSON as a public API. We decided to unify on streams metadata; there's now a published Go library for it: https://github.com/coreos/stream-metadata-go Among other benefits, it is a single file that supports multiple architectures. UPI installs should now use stream metadata, particularly to find public cloud images. This is exposed via a new `openshift-install coreos print-stream-json` command. This is an important preparatory step for exposing this via `oc` as well as having something in the cluster update to it. HOWEVER as a (really hopefully temporary) hack, we *duplicate* the metadata so that IPI installs use the new stream format, and UPI CI jobs can still use the old format (with different RHCOS versions). We will port the UPI docs and CI jobs after this merges. Co-authored-by: Matthew Staebler <staebler@redhat.com>
Now that we have a standardized JSON format for the bootimages, add a CLI command to print it. This is necessary for UPI installs; previously what we've done for UPI is completely ad-hoc, such as having the docs team copy the AMI list into HTML format. Our UPI CI automation copies `rhcos.json` from the installer git into a container, but customers can't do that. In the past I've even heard of people e.g. searching in AWS for "RHEL CoreOS" and picking AMIs based on that, which is extremely bad because they may not even get something from us. This closes a major gap in allowing customers to automate UPI installs in the same way IPI works. One aside: because the JSON data is embedded in the installer binary, which is in turn embedded in the release image, which is GPG signed, one can transitively verify the integrity via that signature.
Briefly describe the history and future of the pinned {RHEL, Fedora} CoreOS metadata in the installer. Co-authored-by: Matthew Staebler <staebler@redhat.com>
c6c902c
to
7a7b055
Compare
Rebased 🏄 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
/override ci/prow/e2e-aws-upgrade |
@staebler: Overrode contexts on behalf of staebler: ci/prow/e2e-aws-upgrade In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
🤞 |
@cgwalters: The following test failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
/retest |
Split out from openshift#4582 This copies the bits from https://github.com/cgwalters/rhel-coreos-bootimages which builds a ConfigMap out of the stream metadata and injects it into the cluster. We have an `installer` image in the release image today; this adds the "is an operator" label, even though it's not really an operator. We just want the CVO to inject the manifest. Among other important semantics, this will ensure that in-place cluster upgrades that have new pinned CoreOS stream data will have this configmap updated.
Currently, the installer relies on a generated go file for determining the AWS region in which the RHCOS image is published. The `go generate` directive was inadvertently removed in openshift#4582. Rather than resurrecting the directive, this commit removes the generated code in favor of gathering the regions directly from the rhcos stream data when needed. https://issues.redhat.com/browse/CORS-1838
This implements part of the plan from:
openshift/os#477
When we originally added the pinned RHCOS metadata
rhcos.json
to the installer, we also changed the coreos-assembler
meta.json
format into an arbitrary new format in the name of some cleanups.
In retrospect, this was a big mistake because we now have two
formats.
Then Fedora CoreOS appeared and added streams JSON as a public API.
We decided to unify on streams metadata; there's now a published
Go library for it: https://github.com/coreos/stream-metadata-go
Among other benefits, it is a single file that supports multiple
architectures.
UPI installs should now use stream metadata, particularly
to find public cloud images.
This is an important preparatory step for exposing this via
oc
as well as having something in the cluster update toit.