Lighthouse hub-controller enhancement proposal #82
Conversation
🤖 Created branch: z_pr82/donaldh/hub-controller
Signed-off-by: Donald Hunter <donald.hunter@redhat.com>
The current proposal doesn't mention any cons of having a central MCS controller. Chiefly, it will require the broker cluster to have API access to all other clusters. In the current model, all clusters only need credentials to access the broker cluster. This was one of the key reasons why Submariner was designed without any controller running on the broker.
All of this can also be achieved by making the current Lighthouse agent create the aggregated version of the ServiceImport on the broker:
- When a ServiceExport is created, the agent checks if a ServiceImport already exists.
- If not, it creates a new ServiceImport on the broker based on the Service present in the cluster.
- If present, it checks whether it conflicts with the exported Service.
- If it conflicts, set the ServiceExport status to conflict.
- If not, update the ServiceImport with the aggregated information.
- If a Service is exported from two different clusters, only one will be able to successfully update the copy on the broker; the other will get a conflict error and will have to retry.
So, by maintaining a single copy on the broker and syncing that, we can achieve the same benefits without requiring a central MCS controller. The only drawback of this approach would be too many conflicts when a service is exported from multiple clusters at the same time, but that shouldn't happen frequently enough to be an issue. It would only be a problem if two such clusters "restart" at the same time.
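The agent-side flow above can be sketched as follows. This is an illustrative model only, not Submariner code: the `Broker` class, field names, and in-memory `resourceVersion` handling are all hypothetical stand-ins for the Kubernetes API's optimistic-concurrency behaviour.

```python
class Conflict(Exception):
    """Raised when an update races with another writer (like an HTTP 409)."""

class Broker:
    """Hypothetical in-memory stand-in for the broker apiserver."""
    def __init__(self):
        self.imports = {}  # (namespace, name) -> ServiceImport-like dict

    def get(self, key):
        return self.imports.get(key)

    def create(self, key, obj):
        obj["resourceVersion"] = 1
        self.imports[key] = dict(obj)
        return self.imports[key]

    def update(self, key, obj, expected_version):
        current = self.imports[key]
        if current["resourceVersion"] != expected_version:
            raise Conflict(key)  # another cluster won the race; caller retries
        obj["resourceVersion"] = expected_version + 1
        self.imports[key] = dict(obj)
        return self.imports[key]

def export_service(broker, cluster, svc, max_retries=3):
    """Agent-side logic: create the aggregated import, or merge into it."""
    key = (svc["namespace"], svc["name"])
    for _ in range(max_retries):
        existing = broker.get(key)
        if existing is None:
            # First exporter: create the aggregated ServiceImport.
            return broker.create(key, {
                "namespace": svc["namespace"], "name": svc["name"],
                "ports": svc["ports"], "clusters": [cluster],
            })
        if existing["ports"] != svc["ports"]:
            # Incompatible export: would be reflected in ServiceExport status.
            return {"status": "Conflict"}
        merged = dict(existing)
        merged["clusters"] = sorted(set(existing["clusters"]) | {cluster})
        try:
            return broker.update(key, merged, existing["resourceVersion"])
        except Conflict:
            continue  # re-read the latest copy and retry
    raise RuntimeError("gave up after retries")
```

The retry loop is where the "only one will successfully update the copy on the broker" behaviour lands: the loser re-reads and merges again, so no central controller is needed to serialise writers.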
The proposed approach will not require the broker cluster to have API access to the other clusters.
I agree that a drawback of this approach is the need to run a highly available controller on the broker.
Could you clarify this? ServiceExports are not synced to the broker today. Or did you mean ServiceImport here? The design of this MCS controller is very similar to the federator that was in use during the very early days of Submariner: https://github.com/kubernetes-sigs/kubefed. You can even see remnants of that code in some APIs/functions that still use "Federator" in their naming. The approach that I mentioned will likely have some complications of its own, but I'd like you to think along those lines too. I am not against an MCS controller or in favour of what I proposed; I just want to make sure we weigh both options and choose whichever is best.
So currently, the service export syncer transforms ServiceExport to ServiceImport and then syncs the ServiceImports to the broker, right? The idea is to separate the transformation from the syncing.
This would mean that the controller responsible for creating the ServiceImport resource would have visibility of all ServiceExport objects from the same watcher and can generate the ServiceImport according to the MCS specification. All clusters would receive a consistent view of the aggregated ServiceImport.
But to do that, the ServiceExport alone isn't enough to create a ServiceImport. A ServiceExport is just a service name and namespace, with no other relevant information. The broker will also need the Service to create the ServiceImport and get information like the service IP and ports. Without the Service, how will you even detect an export conflict?
Yep, the current proof of concept serialises the associated Service and adds it as a label on the ServiceExport. There could be an internal "implementation" CRD instead. It would also be possible to reuse ServiceImport, but that is likely to be confusing and would require strict separation between "source" objects and "merged" objects that get synced by clusters.
That seems a bit kludgy to me. There's also globalnet info needed and we also update the ServiceExport status on the managed cluster. I understand the motivation for implementing a hub MCS controller, as that's how Google has implemented the MCS API and may have even tailored parts of it assuming a central controller, but I just don't see a compelling reason to re-architect Submariner in that manner, especially since a central controller implementation seems more suited where the controller connects to/watches/updates managed cluster resources (a la kubefed). It seems the main motivation here is to add the missing Lighthouse behavior to fully align with the MCS specification which, as @vthapar outlined earlier, can be implemented with the current architecture.
I agree that a label containing serialised Service info is kludgy, but it was expedient for a proof of concept. My preference would be to have an implementation CRD that contains all required info for the exported service, rather than syncing several different objects to the broker or overloading ServiceImport as a means of describing export details. I think we do need to find a solution to ServiceImport merging that is based on having one writer per resource rather than multiple contending writers. A separation of concerns between syncing to the broker and transforming exports to imports would achieve this.
Note that ServiceImport merging will only be required when a service is exported from different clusters. So unless two different clusters happen to export a service at exactly the same moment [a rare occurrence even with automated scripts], conflicts will be unlikely and not frequent enough to be an issue. Between ServiceExport and ServiceImport we already have all the info; anything else at this point would be superfluous.

Another option is to aggregate at the destination. This provides a single writer for the aggregated ServiceImport. It can mean that different clusters briefly have different views of service imports as they're being distributed, but that can happen even with an MCS controller. Eventual consistency is what really matters in a distributed environment.

As Tom pointed out, Google proposed the KEP and API tailored specifically to their implementation using a central MCS controller. In Submariner we were doing things differently, and even today some of the deviations from the spec exist because those work better for us. Another example is not using a virtual IP in the ServiceImport, but each individual import carrying the ClusterIP of the service in that cluster. So we'll never be 100% compliant by design, and if someone wants a 100% compliant implementation based on a central MCS controller, that is already available from the MCS SIG.

I'd be okay with a central controller if there were no other way to solve the problems we're trying to solve. Why not do some PoC around these other two solutions, aggregate at destination or single import on broker, and then decide what works best?
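The "aggregate at destination" alternative above can be sketched as a pure merge function run by each cluster's local agent, which is then the single writer of its own aggregated view. The record shape and the `conflict` flag are hypothetical, chosen only to illustrate the idea:

```python
def aggregate_at_destination(per_cluster_imports):
    """Merge per-cluster ServiceImport-like records (synced from the
    broker) into one local aggregated view per (namespace, name)."""
    merged = {}
    for imp in per_cluster_imports:
        key = (imp["namespace"], imp["name"])
        entry = merged.setdefault(key, {
            "namespace": imp["namespace"], "name": imp["name"],
            "ports": imp["ports"], "clusters": [],
        })
        if entry["ports"] != imp["ports"]:
            # Incompatible exports: surfaced on ServiceExport status
            # rather than fought over on a shared broker object.
            entry["conflict"] = True
        entry["clusters"].append(imp["cluster"])
    return merged
```

Because each cluster recomputes its own aggregate from the synced inputs, there is no contended write at all; the cost, as noted above, is that views may briefly diverge while syncing.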
Another aspect is that the status of each local user-created ServiceExport needs to be updated, which would seem problematic from a hub controller with no access to the managed clusters.
The proposed approach is consistent with the kcp architecture, where transformation logic is executed on the kcp cluster and workload clusters run syncers that sync config in one direction and status in the other direction.
But in order to do the latter, the central controller would need access and credentials to each managed cluster, unless I'm missing something. |
I don't think that would be required, because the syncers can run on the workload clusters and connect to the broker cluster using the same authentication model that Submariner uses today. @vthapar makes a fair point:
It is possible that in the common case no ServiceImport merging needs to happen; only the first ServiceExport causes a ServiceImport to be created. Subsequent ServiceExports would only need to sync an EndpointSlice. I think there are several edge cases that need to be considered, such as identifying when the ServiceImport can be deleted. There are other issues too, such as: should the ServiceImport VIP be set at all, since DNS resolution should be based on the EndpointSlices?
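The common case described above, including the deletion edge case, can be sketched like this. All structures and handler names here are illustrative, not the Lighthouse API:

```python
imports = {}          # (namespace, name) -> aggregated ServiceImport
endpoint_slices = {}  # (namespace, name) -> {cluster: EndpointSlice-like dict}

def on_export(ns, name, cluster, slice_):
    """Handle a ServiceExport: only the first exporter creates the import."""
    key = (ns, name)
    if key not in imports:
        imports[key] = {"namespace": ns, "name": name}  # first exporter wins
        endpoint_slices[key] = {}
    endpoint_slices[key][cluster] = slice_  # every exporter syncs a slice

def on_unexport(ns, name, cluster):
    """Handle export removal: delete the import when the last slice goes."""
    key = (ns, name)
    endpoint_slices.get(key, {}).pop(cluster, None)
    if not endpoint_slices.get(key):  # last exporter gone
        imports.pop(key, None)
        endpoint_slices.pop(key, None)
```

In this model merging is the exception, not the rule: steady-state traffic is just EndpointSlice churn, and the ServiceImport only changes on the first export and the last unexport.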
FWIU, the hub controller would process the synced |
The hub controller would write status to the ServiceExport object on the hub/broker. The syncer running on the workload cluster would sync the status back to the ServiceExport object on the workload cluster. This is the pattern that kcp syncers use; they sync config in one direction and status updates in the opposite direction.
Which seems a lot more complicated, certainly more than it needs to be in our case. |
The current Lighthouse implementation has a passive broker: it is a central apiserver that is accessed by each cluster in the clusterset. Clusters synchronise resources to and from the broker, with all sync and reconciliation logic implemented in the clusters.

There are feature gaps that would be much easier to address if we introduce an active broker that can implement reconciliation logic.
The passive broker is part of the broader Submariner architecture, which emphasizes distributed clusters where each cluster is autonomous and the broker is simply a mechanism to disseminate the data, and a very simple one (it could've been a plain etcd service rather than the K8s API).
If we're going to change that, then it should be discussed on a higher level architecturally and not just in the context of Lighthouse.
Closing this PR given there is no consensus for introducing controller functionality on the broker. I will focus instead on aggregating service imports for alignment with MCS.
🤖 Closed branches: [z_pr82/donaldh/hub-controller]