Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rhcos: ami regions from rhcos stream at runtime #5466

Merged
merged 2 commits into from
Dec 10, 2021

Conversation

staebler
Copy link
Contributor

@staebler staebler commented Dec 8, 2021

Currently, the installer relies on a generated go file for determining the AWS region in which the RHCOS image is published. The go generate directive was inadertently removed in #4582. Rather than resurrecting the directive, this commit removes the generated code in favor of gathering the regions directly from the rhcos stream data when needed.

https://issues.redhat.com/browse/CORS-1838

@staebler staebler force-pushed the generate_ami_regions branch from 2838edf to e3e5cc0 Compare December 8, 2021 21:27
@staebler staebler force-pushed the generate_ami_regions branch 2 times, most recently from 0e03394 to 337560d Compare December 9, 2021 00:57
Copy link
Contributor

@patrickdillon patrickdillon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This generally looks good to me but left a few suggestions. I don't have much background in this area so I'm looking at ce9dd4d for context. According to that commit message, this approach would limit unit testing. I don't see any changes to unit testing, & I don't see any tests failing so this doesn't really seem like a concern. Also I'm not sure I understand/buy that this would block testing.


// AMIRegions returns the AWS regions in which an RHCOS AMI for the specified architecture is published.
func AMIRegions(architecture types.Architecture) sets.String {
stream, err := FetchCoreOSBuild(context.Background())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was going to ask whether this context should have a timeout but after reading the underlying code I think:
a) This is just accessing data on disk and not over the network (so it's not going to timeout)
b) The underlying code does not use the context: https://github.com/openshift/installer/blob/master/pkg/rhcos/builds.go#L21

This is not a suggestion... just thought it was interesting.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good observation. I will open another PR to remove the context from the Fetch functions.

Comment on lines +16 to +18
if err != nil {
logrus.Error("could not fetch the rhcos stream data: %w", err)
return nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are some issues with this error handling. At a minimum, A nil return value needs to be handled by the calling function. This doesn't seem like a case where we should use logrus, but rather the error should be handled by the calling function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does nil need to be handled by the caller? A nil return means that no regions were found. Ultimately, the error is a result of previous code failing to validate the architecture appropriately. The only reason I am printing an error at all is for the benefit of a future developer knowing that they forgot to do something when adding a new architecture.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only reason I am printing an error at all is for the benefit of a future developer knowing that they forgot to do something when adding a new architecture.

Also, this explanation makes sense, so if we expect this to never be nil except when there's a programmers error I'm fine with letting it cause an exception (not handle the nil)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you return nil rather than an empty set, won't this cause an exception here: https://github.com/openshift/installer/pull/5466/files#diff-bab1b8072f369df53e27d65607145fb93caabd3a86fdefbcafb176e823726069R83?

Apparently it doesn't cause an exception. Not sure I understand why not... to me it appears to be calling Has() on nil but testing here seems ok: https://go.dev/play/p/aikW4rf-4TJ

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, note that in go it is valid to call functions on a nil object. The receiver of the function is ultimately just a parameter passed to the function

}
default:
return field.ErrorList{field.NotSupported(field.NewPath("controlPlane", "architecture"), config.ControlPlane.Architecture, awsvalidation.ValidArchitectureValues)}
if rhcos.AMIRegions(config.ControlPlane.Architecture).Has(config.Platform.AWS.Region) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned below, it seems like AMIRegions can return nil, so we need a check here.

Suggested change
if rhcos.AMIRegions(config.ControlPlane.Architecture).Has(config.Platform.AWS.Region) {
if regions := rhcos.AMIRegions(config.ControlPlane.Architecture); regions != nil && regions.Has(config.Platform.AWS.Region) {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@staebler
Copy link
Contributor Author

staebler commented Dec 9, 2021

This generally looks good to me but left a few suggestions. I don't have much background in this area so I'm looking at ce9dd4d for context. According to that commit message, this approach would limit unit testing. I don't see any changes to unit testing, & I don't see any tests failing so this doesn't really seem like a concern. Also I'm not sure I understand/buy that this would block testing.

We are giving up code coverage of the AMI validation in the case where the AMI is published in a region in a partition other than the default partition. We don't currently publish in any region other than regions in the default partition. I would argue that the unit tests should not be relying on the actual regions where the images are published but should instead be mocking out the call to AMIRegions so that we can have more robust code coverage.

@openshift-ci openshift-ci bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 9, 2021
Currently, the installer relies on a generated go file for determining
the AWS region in which the RHCOS image is published. The `go generate`
directive was inadvertently removed in openshift#4582.
Rather than resurrecting the directive, this commit removes the generated
code in favor of gathering the regions directly from the rhcos stream
data when needed.

https://issues.redhat.com/browse/CORS-1838
Since the ValidArchitectureValues variable is no longer used outside of
the pkg/types/aws/validation package, it can be made private. The use
of the variable outside of the package was removed when validation of
the architecture was removed from the validation of the AMI.
@staebler staebler force-pushed the generate_ami_regions branch from 337560d to 547ec65 Compare December 9, 2021 20:20
@openshift-ci openshift-ci bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 9, 2021
@staebler
Copy link
Contributor Author

staebler commented Dec 9, 2021

337560d4c...547ec65a0

  • Fix conflict where the generate was commented out of ./hack/verify-codegen.sh.

@patrickdillon
Copy link
Contributor

/approve
/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Dec 9, 2021
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Dec 9, 2021

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: patrickdillon

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Dec 9, 2021
@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

10 similar comments
@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

7 similar comments
@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Dec 10, 2021

@staebler: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-libvirt 547ec65 link false /test e2e-libvirt
ci/prow/e2e-crc 547ec65 link false /test e2e-crc
ci/prow/e2e-ovirt 547ec65 link false /test e2e-ovirt
ci/prow/e2e-aws-single-node 547ec65 link false /test e2e-aws-single-node
ci/prow/e2e-metal-ipi-ovn-ipv6 547ec65 link false /test e2e-metal-ipi-ovn-ipv6
ci/prow/e2e-aws-workers-rhel7 547ec65 link false /test e2e-aws-workers-rhel7

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

16 similar comments
@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-merge-robot openshift-merge-robot merged commit 305ab7f into openshift:master Dec 10, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants