🐛 Fix DockerMachine panic #9673

Merged
k8s-ci-robot merged 1 commit into kubernetes-sigs:main on Nov 8, 2023

Contributor

@makhov makhov commented Nov 6, 2023

What this PR does / why we need it:

This PR fixes a panic in the DockerMachineReconciler that occurs when cluster.Spec.InfrastructureRef has not been set yet.
When using ClusterClass, a Cluster can initially contain only spec.topology; the controller manager adds the other fields later.

(The trace below is from v1.5.3, but the bug still exists in the main branch.)

panic: runtime error: invalid memory address or nil pointer dereference [recovered]
        panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x20 pc=0x11a40ec]

goroutine 229 [running]:
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile.func1()
        sigs.k8s.io/controller-runtime@v0.15.1/pkg/internal/controller/controller.go:115 +0x1a4
panic({0x13564e0, 0x25cf020})
        runtime/panic.go:884 +0x1f4
sigs.k8s.io/cluster-api/test/infrastructure/docker/internal/controllers.(*DockerMachineReconciler).Reconcile(0x400003a700, {0x183a980, 0x4000946570}, {{{0x4000aa6976?, 0x30?}, {0x4000801b80?, 0xffff71389001?}}})
        sigs.k8s.io/cluster-api/test/infrastructure/docker/internal/controllers/dockermachine_controller.go:124 +0x55c
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile(0x183a980?, {0x183a980?, 0x4000946570?}, {{{0x4000aa6976?, 0x12caea0?}, {0x4000801b80?, 0x4000b91e08?}}})
        sigs.k8s.io/controller-runtime@v0.15.1/pkg/internal/controller/controller.go:118 +0x8c
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0x4000466460, {0x183a8d8, 0x40000eb5e0}, {0x13e79a0?, 0x4000370360?})
        sigs.k8s.io/controller-runtime@v0.15.1/pkg/internal/controller/controller.go:314 +0x2cc
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0x4000466460, {0x183a8d8, 0x40000eb5e0})
        sigs.k8s.io/controller-runtime@v0.15.1/pkg/internal/controller/controller.go:265 +0x1a0
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2()
        sigs.k8s.io/controller-runtime@v0.15.1/pkg/internal/controller/controller.go:226 +0x74
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2
        sigs.k8s.io/controller-runtime@v0.15.1/pkg/internal/controller/controller.go:222 +0x43c

@k8s-ci-robot added labels on Nov 6, 2023: cncf-cla: yes (Indicates the PR's author has signed the CNCF CLA.), do-not-merge/needs-area (PR is missing an area label)
@k8s-ci-robot
Contributor

Welcome @makhov!

It looks like this is your first PR to kubernetes-sigs/cluster-api 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes-sigs/cluster-api has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot added labels on Nov 6, 2023: needs-ok-to-test (Indicates a PR that requires an org member to verify it is safe to test.), size/XS (Denotes a PR that changes 0-9 lines, ignoring generated files.)
@k8s-ci-robot
Contributor

Hi @makhov. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Contributor

@killianmuldoon killianmuldoon left a comment

/ok-to-test
/area provider/infrastructure-docker

Q: Can a dockermachine possibly be created in a topology managed cluster where the infrastructureRef isn't set? That's surprising to me.

@k8s-ci-robot added labels on Nov 6, 2023: ok-to-test (Indicates a non-member PR verified by an org member that is safe to test.), area/provider/infrastructure-docker (Issues or PRs related to the docker infrastructure provider); and removed labels: needs-ok-to-test (Indicates a PR that requires an org member to verify it is safe to test.), do-not-merge/needs-area (PR is missing an area label)
Contributor

@killianmuldoon killianmuldoon left a comment

Change looks good to me - just one Q

/lgtm

@k8s-ci-robot added the lgtm label ("Looks good to me", indicates that a PR is ready to be merged.) on Nov 6, 2023
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: 5f2b1b16974fa454851ce33dd8ffd445c2ad8990

Contributor

@killianmuldoon killianmuldoon left a comment

/cherry-pick release-1.4

Contributor

@killianmuldoon killianmuldoon left a comment

/cherry-pick release-1.5

@makhov
Contributor Author

makhov commented Nov 6, 2023

> Q: Can a dockermachine possibly be created in a topology managed cluster where the infrastructureRef isn't set? That's surprising to me.

Yes. I ran into this while adding ClusterClass support to the k0smotron Cluster API provider, when some of the resources were misconfigured. The CAPI controller manager wasn't able to add infrastructureRef to the Cluster object, which caused the panic.

@killianmuldoon
Contributor

> Yes. I bumped into this when I was working on adding ClusterClass support (k0sproject/k0smotron#325) to the k0smotron Cluster API provider, and had some of the resources misconfigured. CAPI controller manager wasn't able to add infrastructureRef to the Cluster object, which caused panic.

Hmm - I guess the topology controller does go ahead and create the MDs etc. even if the Cluster doesn't have an infrastructureRef. Surprising to me, but not necessarily a bug IMO.

@makhov
Contributor Author

makhov commented Nov 6, 2023

> Hmm - I guess the topology controller does go ahead and create the MDs etc. even if the Cluster doesn't have an infrastructureRef. Surprising to me, but not necessarily a bug IMO.

That's fine, I think; the problem here is the panic.

Member

@vincepri vincepri left a comment

/approve

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: vincepri

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot added the approved label (Indicates a PR has been approved by an approver from all required OWNERS files.) on Nov 8, 2023
@k8s-ci-robot merged commit eddb1c6 into kubernetes-sigs:main on Nov 8, 2023
17 checks passed
@k8s-ci-robot added this to the v1.6 milestone on Nov 8, 2023
@killianmuldoon
Contributor

/cherry-pick release-1.5

@killianmuldoon
Contributor

/cherry-pick release-1.4

@k8s-infra-cherrypick-robot

@killianmuldoon: new pull request created: #9689

In response to this:

/cherry-pick release-1.5


@k8s-infra-cherrypick-robot

@killianmuldoon: new pull request created: #9690

In response to this:

/cherry-pick release-1.4

Labels
approved: Indicates a PR has been approved by an approver from all required OWNERS files.
area/provider/infrastructure-docker: Issues or PRs related to the docker infrastructure provider
cncf-cla: yes: Indicates the PR's author has signed the CNCF CLA.
lgtm: "Looks good to me", indicates that a PR is ready to be merged.
ok-to-test: Indicates a non-member PR verified by an org member that is safe to test.
size/XS: Denotes a PR that changes 0-9 lines, ignoring generated files.

5 participants