Stronger link between Machine* <-> Cluster #728
Conversation
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: vincepri
(force-pushed from 2f4f2a6 to cc812ca)
/assign @roberthbailey @detiber
pkg/controller/machine/controller.go (Outdated)

```
@@ -194,9 +194,14 @@ func (r *ReconcileMachine) Reconcile(request reconcile.Request) (reconcile.Result
	}

func (r *ReconcileMachine) getCluster(ctx context.Context, machine *clusterv1.Machine) (*clusterv1.Cluster, error) {
	if machine.Labels["cluster"] == "" {
```
Is the reason this uses a label still so that people can fake a cluster ID if they don't want to use the cluster object?
Someone brought up that labels were preferred during our weekly meeting. This wouldn't make the cluster optional, but users might use a dummy cluster if they don't want to make use of the Cluster type.
The reason why labels and/or annotations were mentioned as being preferred is because they are a much weaker link between the objects.
Does the "cluster" label become a required field in MachineSpec? I am debating whether we should provide flexibility here: if the "cluster" label is not set, use the namespace to find the cluster.
I think we should either be explicit or return no cluster. Returning the first cluster in a namespace (which is the current behavior) might bring unexpected behavior for consumers.
If we were to use labels as the governing mechanism to select a cluster, I would imagine using something like a LabelSelector in the machine spec. e.g.
```go
/// [MachineSpec]
// MachineSpec defines the desired state of Machine
type MachineSpec struct {
	...
	ClusterSelector metav1.LabelSelector `json:"clusterSelector"`
}
```
Regarding selectors: I think that's a mechanism to select a set of things and not necessarily a single thing. It would make more sense having a selector in the Cluster type to select machines. But this doesn't help much when going from machine to cluster.
If you're using a single label, you could consider a non-controller owner reference in the metadata. If you wanted to take it even further, machine deployments and machine sets could have this too. This has the added benefit of `kubectl delete -n <namespace> cluster <clustername>` doing a cascading deletion across the resources.
@maisem I'm not sure how a cluster selector would work in this case. The label selector wouldn't be able to select a cluster by name given that the name is a field.
As @krousey pointed out, label selectors are more suited to querying lists of resources. The problem that I see with ownerRef though is that it's a larger change and impacts delete, which I think should be revisited during the next iteration.
For this cycle I think having a prefixed label in the machine that points to a cluster name should fix the immediate issue and won't bring in many changes to the current logic and providers.
I'll open issues to address these long-term concerns, how does that sound?
> This has the added benefit of `kubectl delete -n <namespace> cluster <clustername>` doing a cascading deletion across the resources.
@krousey In the case of the AWS provider, the cloud-provider objects managed by the Cluster object are unable to be deleted while there are existing Machines, so the cascading deletion would not happen.
@detiber Sounds like you want foreground cascading deletion then. https://kubernetes.io/docs/concepts/workloads/controllers/garbage-collection/#foreground-cascading-deletion
@vincepri This has been an issue for a while. The ownerRef change is something that would need a wider discussion too.
```
@@ -37,6 +37,9 @@ func TestReconcileRequest(t *testing.T) {
	Name:       "create",
	Namespace:  "default",
	Finalizers: []string{v1alpha1.MachineFinalizer},
	Labels: map[string]string{
```
I'm wondering if we should use an annotation here rather than a label, that said, it would require modifying the clusterctl logic related to determining if the controlplane machine is "ready".
The issue mentions a "stronger" link, and I guess a weak link is stronger than no link. Although I'm wondering if we should make this a field in the MachineSpec instead of a label. What are the pros of annotations instead?
Getting consensus on a field in the MachineSpec might be difficult, but annotations are not as user visible and less likely for users to modify.
Got it. I'm thinking though that labels might be more useful to query, for example, machines in a cluster, whereas annotations are mostly extra metadata.
@vincepri According to the k8s docs, labels are meant to be meaningful and relevant to users, but do not directly imply semantics to the core system: https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/
While one of the uses for annotations is said to be "Directives from the end-user to the implementations to modify behavior or engage non-standard features": https://kubernetes.io/docs/concepts/overview/working-with-objects/annotations/
Since the presence (or absence) of the label changes behavior in the underlying implementation, it would seem to fit more with the annotation use case rather than the label use case.
`kubeadm` uses a label to mark master nodes. Since labels are searchable, maybe they are better for identifying information:

> You can use either labels or annotations to attach metadata to Kubernetes objects. Labels can be used to select objects and to find collections of objects that satisfy certain conditions. In contrast, annotations are not used to identify and select objects.
(force-pushed from f470475 to 5dfe8c1)
```
@@ -194,9 +194,15 @@ func (r *ReconcileMachine) Reconcile(request reconcile.Request) (reconcile.Result
	}

func (r *ReconcileMachine) getCluster(ctx context.Context, machine *clusterv1.Machine) (*clusterv1.Cluster, error) {
	if machine.Labels["cluster"] == "" {
		klog.Warningf("Machine %q in namespace %q has no cluster label, returning nil", machine.Name, machine.Namespace)
		return nil, nil
```
Shouldn't this return an error, instead of `nil` for the error?
See #728 (comment), I think it makes sense to allow Machines to function without a cluster, but I'm open to different opinions.
This should probably be just info rather than a warning, see same discussion #644 (comment)
@vincepri I am not sure I understand what "allow machines to function without a cluster" means. If it means that the machines should be created/updated/deleted, then we should return and allow the controller to perform the reconciliation. Correct me if I am wrong.
This is somehow related to #644
@enxebre I think it would be overly disruptive to change the machine actuator interface at this point as we are trying to get to v1alpha1, especially since it would require non-trivial changes to all of the current provider implementations. I also don't think it makes sense to require provider implementations to recreate the same logic. I think we should leverage the proposal that @vincepri and @roberthbailey have started for post-v1alpha1 to refactor the types for this: https://docs.google.com/document/d/1pzXtwYWRsOzq5Ftu03O5FcFAlQE26nD3bjYBPenbhjg/edit#
(force-pushed from f95bc2b to f412e47)
@detiber ptal
@vincepri overall, lgtm. Are there any gitbook changes that are needed for this change?
CC @davidewatson, are the changes for gitbook going into a different branch?
@vincepri: The documentation source exists in the
Machines can be associated with a Cluster using a custom label
`cluster.k8s.io/clusterName`. Providers using the `Cluster` controller must
provide the label which references the name of a cluster living in the same
namespace.
This reads a bit awkward to me, as if the label is used by the `Cluster` controller. What about:

> Machines can optionally be associated with a Cluster using a custom label... If the label is set, then it must reference the name of a cluster residing in the same namespace.
Reworded, thanks!
(force-pushed from bf6bc41 to a30b436)
/hold lgtm, but adding hold to allow additional feedback prior to the bot merging.
/test pull-cluster-api-test
(force-pushed from a30b436 to e23e476)
@detiber ptal.
/lgtm
/hold cancel
/hold
Signed-off-by: Vince Prignano <vincepri@vmware.com>
(force-pushed from e23e476 to 83e4671)
/hold cancel
@detiber @ncdc label is updated to
/lgtm
Signed-off-by: Vince Prignano vincepri@vmware.com
What this PR does / why we need it:
This PR allows users to specify which cluster a machine belongs to via the use of labels or annotations. It also solves a major UX issue where users can't create multiple clusters per namespace.
Given that some providers are only using the machine actuator and don't require a cluster, if the cluster name is empty the machine will operate on a `nil` cluster.

Which issue(s) this PR fixes (optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged): Fixes #41
Special notes for your reviewer:
Please confirm that if this PR changes any image versions, then that's the sole change this PR makes.
Release note: