Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

agnhost: Try both in-cluster and external discovery #101589

Merged
merged 1 commit into from
Apr 30, 2021

Conversation

mtaufen
Copy link
Contributor

@mtaufen mtaufen commented Apr 28, 2021

The conformance test for ServiceAccountIssuerDiscovery is currently
configured with --in-cluster-discovery, which only supports token
validation against in-cluster endpoints. Many cloud providers provide
their own, external endpoints for OIDC discovery, and because the iss
claim in tokens will point to these endpoints, but the client in this
test only trusts the Cluster CA, it will fail to connect to the external
discovery endpoints when validating the token.

To ensure that the conformance test at least supports scenario where
both the discovery doc endpoint and JWKS endpoint are cluster-local and
the scenario where both endpoints are cluster-external, this PR has the
test try both and requires at least one to pass.

Caveat: The test still won't support a configuration where one
endpoint is cluster-local and the other is external. We don't yet have
evidence that this is a configuration that is used in practice, so this
initial hotfix will at least fix the conformance test for the "both
external" configuration we know providers already use. Note that if one
endpoint is cluster-local, and the other is cluster-external, tokens can
still only be validated in-cluster, because both endpoints must be
accessible to Relying Parties that validate tokens.

What type of PR is this?

/kind bug
/kind failing-test

Does this PR introduce a user-facing change?

Resolves an issue with the "ServiceAccountIssuerDiscovery should support OIDC discovery" conformance test failing on clusters which are configured with issuers outside the cluster

Hold since I still need to test this for GKE to ensure it can succeed in that environment:
/hold

/assign @liggitt

@k8s-ci-robot k8s-ci-robot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. release-note-none Denotes a PR that doesn't merit a release note. kind/bug Categorizes issue or PR as related to a bug. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. kind/failing-test Categorizes issue or PR as related to a consistently or frequently failing test. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Apr 28, 2021
@k8s-ci-robot k8s-ci-robot added area/test sig/testing Categorizes an issue or PR as relevant to SIG Testing. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Apr 28, 2021
@liggitt
Copy link
Member

liggitt commented Apr 28, 2021

need to bump test/images/agnhost/VERSION as well

@mtaufen
Copy link
Contributor Author

mtaufen commented Apr 28, 2021

Bumped.

@liggitt
Copy link
Member

liggitt commented Apr 29, 2021

looks like build/dependencies.yaml is unhappy, and will want you to update test/images/agnhost/agnhost.go as well

@liggitt
Copy link
Member

liggitt commented Apr 29, 2021

looks reasonable otherwise

@liggitt liggitt added the priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. label Apr 29, 2021
@k8s-ci-robot k8s-ci-robot removed the needs-priority Indicates a PR lacks a `priority/foo` label and requires one. label Apr 29, 2021
@liggitt liggitt added needs-priority Indicates a PR lacks a `priority/foo` label and requires one. triage/accepted Indicates an issue or PR is ready to be actively worked on. labels Apr 29, 2021
@k8s-ci-robot k8s-ci-robot removed the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Apr 29, 2021
@liggitt liggitt added the sig/auth Categorizes an issue or PR as relevant to SIG Auth. label Apr 29, 2021
@liggitt liggitt added this to the v1.21 milestone Apr 29, 2021
@mtaufen
Copy link
Contributor Author

mtaufen commented Apr 29, 2021

/hold cancel

I've tested this on GKE and it works with our cluster-external endpoints.

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 29, 2021
The conformance test for ServiceAccountIssuerDiscovery is currently
configured with --in-cluster-discovery, which only supports token
validation against in-cluster endpoints. Many cloud providers provide
their own, external endpoints for OIDC discovery, and because the iss
claim in tokens will point to these endpoints, but the client in this
test only trusts the Cluster CA, it will fail to connect to the external
discovery endpoints when validating the token.

To ensure that the conformance test at least supports scenario where
both the discovery doc endpoint and JWKS endpoint are cluster-local and
the scenario where both endpoints are cluster-external, this PR has the
test try both and requires at least one to pass.

Caveat: The test still won't support a configuration where one
endpoint is cluster-local and the other is external. We don't yet have
evidence that this is a configuration that is used in practice, so this
initial hotfix will at least fix the conformance test for the "both
external" configuration we know providers already use. Note that if one
endpoint is cluster-local, and the other is cluster-external, tokens can
still only be validated in-cluster, because both endpoints must be
accessible to Relying Parties that validate tokens.
@liggitt
Copy link
Member

liggitt commented Apr 29, 2021

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 29, 2021
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: liggitt, mtaufen

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 29, 2021
@mtaufen
Copy link
Contributor Author

mtaufen commented Apr 29, 2021

sg, thanks!

@fejta-bot
Copy link

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel or /hold comment for consistent failures.

@mtaufen
Copy link
Contributor Author

mtaufen commented Apr 29, 2021

/retest

@fejta-bot
Copy link

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel or /hold comment for consistent failures.

4 similar comments
@fejta-bot
Copy link

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel or /hold comment for consistent failures.

@fejta-bot
Copy link

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel or /hold comment for consistent failures.

@fejta-bot
Copy link

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel or /hold comment for consistent failures.

@fejta-bot
Copy link

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel or /hold comment for consistent failures.

@k8s-ci-robot k8s-ci-robot merged commit f039f94 into kubernetes:master Apr 30, 2021
@k8s-ci-robot k8s-ci-robot modified the milestones: v1.21, v1.22 Apr 30, 2021
mtaufen added a commit to mtaufen/k8s.io that referenced this pull request May 3, 2021
This promotes the version of agnhost with the conformance test fix from
kubernetes/kubernetes#101589
mtaufen added a commit to mtaufen/kubernetes that referenced this pull request May 3, 2021
Updates e2e tests to use agnhost 2.32, which fixes an issue with the
conformance tests for ServiceAccountIssuerDiscovery.

Original fix: kubernetes#101589

Image promotion: kubernetes/k8s.io#1994
mtaufen added a commit to mtaufen/kubernetes that referenced this pull request May 4, 2021
Updates e2e tests to use agnhost 2.32, which fixes an issue with the
conformance tests for ServiceAccountIssuerDiscovery.

Original fix: kubernetes#101589

Image promotion: kubernetes/k8s.io#1994
@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed release-note-none Denotes a PR that doesn't merit a release note. labels May 4, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/test cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. kind/failing-test Categorizes issue or PR as related to a consistently or frequently failing test. lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/auth Categorizes an issue or PR as relevant to SIG Auth. sig/testing Categorizes an issue or PR as relevant to SIG Testing. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

4 participants