Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix control plane node join logic #745

Merged
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
7 changes: 7 additions & 0 deletions pkg/cloud/aws/actuators/machine/BUILD.bazel
Original file line number Diff line number Diff line change
Expand Up @@ -40,7 +40,14 @@ go_test(
embed = [":go_default_library"],
deps = [
"//pkg/apis/awsprovider/v1alpha1:go_default_library",
"//pkg/cloud/aws/actuators:go_default_library",
"//pkg/cloud/aws/filter:go_default_library",
"//pkg/cloud/aws/services/awserrors:go_default_library",
"//pkg/cloud/aws/services/ec2/mock_ec2iface:go_default_library",
"//vendor/github.com/aws/aws-sdk-go/aws:go_default_library",
"//vendor/github.com/aws/aws-sdk-go/service/ec2:go_default_library",
"//vendor/github.com/aws/aws-sdk-go/service/ec2/ec2iface:go_default_library",
"//vendor/github.com/golang/mock/gomock:go_default_library",
"//vendor/k8s.io/apimachinery/pkg/apis/meta/v1:go_default_library",
"//vendor/sigs.k8s.io/cluster-api/pkg/apis/cluster/v1alpha1:go_default_library",
"//vendor/sigs.k8s.io/cluster-api/pkg/controller/machine:go_default_library",
Expand Down
50 changes: 33 additions & 17 deletions pkg/cloud/aws/actuators/machine/actuator.go
Original file line number Diff line number Diff line change
Expand Up @@ -96,37 +96,52 @@ func machinesEqual(m1 *clusterv1.Machine, m2 *clusterv1.Machine) bool {
return m1.Name == m2.Name && m1.Namespace == m2.Namespace
}

// isNodeJoin determines if a machine, in scope, should join of the cluster.
// TODO: Make this thread safe kubernetes-sigs/cluster-api#925
// https://github.com/kubernetes-sigs/cluster-api-provider-aws/pull/745#discussion_r280506890
func (a *Actuator) isNodeJoin(scope *actuators.MachineScope, controlPlaneMachines []*clusterv1.Machine) (bool, error) {
ashish-amarnath marked this conversation as resolved.
Show resolved Hide resolved
switch set := scope.Machine.ObjectMeta.Labels["set"]; set {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not directly related to this PR, but instead of switching on the set label, this could just check if scope.Machine.Spec.Versions.ControlPlane != ""

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea is to determine whether or not this machine in scope should join the cluster or not and also be able to handle all "kinds" of machines- controlplane, worker, and any other we might add later.

scope.Machine.Spec.Versions.ControlPlane != "" will just tell us whether this is a controlplane machine or not.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I do think that we should be able to potentially special case other solutions as well, but currently we have two different ways of determining a control-plane machine within cluster-api upstream (for the purposes of clusterctl) and per-provider. If we can standardize on a single solution for controlplane or not, then it is less cognitive overhead for the end user.

Removing the requirement of the set label for control plane instances would also reduce user confusion when trying to deploy a MachineSet or MachineDeployment, where it isn't overly clear that they need to include those in the labels for the template.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree on not having 2 ways of doing the same. As it is not directly related to this PR, I'll follow-up with a PR for this.

case "node":
// Worker machines, not part of the controlplane, will always join the cluster.
return true, nil
case "controlplane":
for _, cm := range controlPlaneMachines {
m, err := actuators.NewMachineScope(actuators.MachineScopeParams{
Machine: cm,
Cluster: scope.Cluster,
Client: a.clusterClient,
Logger: a.log,
})
// Controlplane machines will join the cluster if the cluster has an existing control plane.
controlplaneExists := false
var err error

if err != nil {
return false, errors.Wrapf(err, "failed to create machine scope for machine %q", cm.Name)
}
var sharedScope *actuators.MachineScope

ec2svc := ec2.NewService(m.Scope)
for _, cm := range controlPlaneMachines {
if sharedScope == nil {
sharedScope, err = actuators.NewMachineScope(actuators.MachineScopeParams{
Machine: cm,
Cluster: scope.Cluster,
Client: a.clusterClient,
Logger: a.log,
AWSClients: scope.AWSClients,
})
if err != nil {
return false, errors.Wrapf(err, "failed to create machine scope for %s/%s", cm.Namespace, cm.Name)
}
}
sharedScope.Machine = cm
ec2svc := ec2.NewService(sharedScope.Scope)

ok, err := ec2svc.MachineExists(m)
controlplaneExists, err = ec2svc.MachineExists(sharedScope)
if err != nil {
return false, errors.Wrapf(err, "failed to verify existence of machine %q", m.Name())
return false, errors.Wrapf(err, "failed to verify existence of machine %s/%s", sharedScope.Machine.Namespace, sharedScope.Machine.Name)
}

a.log.V(2).Info("Machine joining control plane", "machine-name", scope.Machine.Name, "machine-namespace", scope.Machine.Namespace, "should-join-control-plane", ok)
return ok, nil
if controlplaneExists {
return controlplaneExists, err
}
}

return false, nil
a.log.V(2).Info("Will machine join the controlplane", "machine-name", scope.Machine.Name, "machine-namespace", scope.Machine.Name, "should-join-control-plane", controlplaneExists)
return controlplaneExists, err

default:
return false, errors.Errorf("Unknown value %q for label `set` on machine %q, skipping machine creation", set, scope.Machine.Name)
return false, errors.Errorf("Unknown value %q for label `set` on machine %q", set, scope.Machine.Name)
}
}

Expand Down Expand Up @@ -160,6 +175,7 @@ func (a *Actuator) Create(ctx context.Context, cluster *clusterv1.Cluster, machi

var bootstrapToken string
if isNodeJoin {
a.log.V(2).Info("Machine will join the cluster", "cluster", cluster.Name, "machine-name", machine.Name, "machine-namespace", machine.Namespace)
coreClient, err := a.coreV1Client(cluster)
if err != nil {
return errors.Wrapf(err, "failed to retrieve corev1 client for cluster %q", cluster.Name)
Expand Down
Loading