[Discuss] Long term plan for cross-cluster-replication #558
Thanks @ankitkala for the doc
Sure, let me do that. Here are the ones already created for reference:
@andrross could you help transfer this to the cross-cluster-replication repo? It looks like that would be a good place to track this.
@elfisher done
There have been multiple discussions around cross-cluster-replication which we'd like to reconcile, in order to present a long-term plan for the CCR plugin.
Full cluster replication with failover:
Currently cross-cluster-replication supports only index-level replication. We eventually want to build support for full cluster replication with failover capabilities. This would allow users to have a backup cluster ready to serve reads in case the primary cluster goes down. We don't have a clear timeline on when we'll start on this, but the intention here is to share the final desired state, which will be a major consideration during the design discussions.
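For context, an index-level failover path already exists today: per the public CCR docs, stopping replication promotes the follower index to a regular writable index. Here is a minimal sketch of that manual step using the low-level REST client (the cluster endpoint and index name are placeholders); full cluster replication would essentially automate this across all replicated indices:

```kotlin
import org.apache.http.HttpHost
import org.opensearch.client.Request
import org.opensearch.client.RestClient

// Sketch of today's manual, index-level failover: stopping replication
// severs the link to the leader and opens the follower index for writes.
fun main() {
    // Placeholder endpoint for the follower (backup) cluster.
    val client = RestClient.builder(HttpHost("follower-cluster", 9200, "https")).build()
    client.use {
        val stop = Request("POST", "/_plugins/_replication/follower-index/_stop")
        stop.setJsonEntity("{}")
        it.performRequest(stop)
    }
}
```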
CCR move to the core:
There have been discussions around moving CCR to the OS core (opensearch-project/OpenSearch#2872).
The major concern with CCR as a plugin is its dependency on internal core components (engine and translog) whose methods are not guaranteed to external plugins and can easily break if there are any major changes done on the core (example).
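To make the coupling concrete, here is a rough sketch of the kind of internal access involved; the method name below is hypothetical and stands in for the internal history/changes-snapshot APIs, whose exact signatures have shifted across core versions — which is exactly the fragility described above:

```kotlin
import org.opensearch.index.shard.IndexShard
import org.opensearch.index.translog.Translog

// Illustrative only: the plugin reaches into IndexShard/engine internals to
// snapshot operations in a seqno range. None of these calls are a supported
// extension point, so a core refactor can break the plugin silently.
fun fetchChanges(shard: IndexShard, fromSeqNo: Long, toSeqNo: Long): List<Translog.Operation> {
    val ops = mutableListOf<Translog.Operation>()
    // Hypothetical call standing in for the internal history-snapshot APIs
    // CCR uses; these carry no compatibility guarantee for plugins.
    shard.getHistoryOperations("ccr", fromSeqNo, toSeqNo).use { snapshot ->
        var op = snapshot.next()
        while (op != null) {
            ops.add(op)
            op = snapshot.next()
        }
    }
    return ops
}
```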
Remove CCR dependency on security:
CCR in its current state can't be moved completely to the core. The primary reason is that CCR has a strong dependency on the security plugin. CCR runs multiple persistent tasks on the follower cluster which either monitor the ongoing replication (at cluster/index/shard level) or fetch changes from the leader. For these tasks we rely on FGAC roles for authorization. These roles are passed by the user in the API when starting replication, and we persist this info as part of the replication metadata in a system index.
Here is the public documentation for CCR security.
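Concretely, per those docs, the roles come in on the start-replication call itself and are then persisted. A minimal example via the low-level REST client (endpoint, index, and role names are placeholders):

```kotlin
import org.apache.http.HttpHost
import org.opensearch.client.Request
import org.opensearch.client.RestClient

// The FGAC roles used to authorize CCR's background tasks are supplied by the
// user when replication is started, then persisted in the replication
// metadata system index on the follower cluster.
fun startReplication() {
    val client = RestClient.builder(HttpHost("follower-cluster", 9200, "https")).build()
    client.use {
        val start = Request("PUT", "/_plugins/_replication/follower-index/_start")
        start.setJsonEntity(
            """
            {
              "leader_alias": "leader-cluster",
              "leader_index": "leader-index",
              "use_roles": {
                "leader_cluster_role": "all_access",
                "follower_cluster_role": "all_access"
              }
            }
            """.trimIndent()
        )
        it.performRequest(start)
    }
}
```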
For CCR to move completely to the OS core, we'll have to rethink our security model for CCR. It would also be a breaking change for existing customers. This will require a dedicated effort and won't be combined with any of the other projects mentioned below.
Using Segment Replication for CCR:
We're exploring Segment Replication (opensearch-project/OpenSearch#3020) for CCR, where we'd sync segments for each replicating shard from the leader cluster to the follower cluster. To achieve this, we'd be extending/reusing the modules for local Segment Replication. We'll add more details when we put up the proposal.
We can't implement this completely in core, as we will still rely on the existing APIs and persistent tasks in the CCR plugin for managing the replication. However, any implementation that requires knowledge of core OS components would reside in the OS core.
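A rough sketch of that split, with purely hypothetical names (these are not actual OpenSearch interfaces): the plugin keeps owning the replication lifecycle via persistent tasks, while anything that needs engine-level knowledge, such as where a shard copies segments from, sits behind a core-owned abstraction:

```kotlin
// Hypothetical sketch of the plugin/core split described above.

// Lives in core: abstracts where a target shard copies segments from, so the
// same machinery can serve both local replicas and cross-cluster followers.
interface SegmentSource {
    fun latestCheckpoint(): Long
    fun copySegments(sinceCheckpoint: Long): List<String> // names of copied segment files
}

// Lives in the CCR plugin: a persistent task that drives the follower shard,
// delegating the actual segment transfer to the core-owned source.
class ShardReplicationTask(private val source: SegmentSource) {
    private var checkpoint = -1L

    fun poll() {
        val latest = source.latestCheckpoint()
        if (latest > checkpoint) {
            source.copySegments(checkpoint)
            checkpoint = latest
        }
    }
}
```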
Utilizing Remote store & segment replication:
With the integration of segment replication with remote store, replicas will be able to sync segments directly from the remote store. CCR plans to leverage a similar setup, and we'll ensure that the design extends to cross-cluster use cases as well.
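Speculatively, the cross-cluster case then becomes just another implementation of the hypothetical SegmentSource interface sketched above, one that reads the leader's segments from the remote store instead of from leader shards:

```kotlin
// Hypothetical, building on the SegmentSource sketch from the previous
// section: a follower-side source that pulls segment files directly from the
// leader's remote store, bypassing leader shards entirely. The "remoteRepo"
// function is a stand-in for a blob-store client, not the actual remote store API.
class RemoteStoreSegmentSource(
    private val remoteRepo: (path: String) -> List<String>,
    private val leaderIndexPath: String,
) : SegmentSource {
    override fun latestCheckpoint(): Long {
        // A real implementation would read the leader's latest replication
        // checkpoint metadata from the remote store; stubbed here.
        return 0L
    }

    override fun copySegments(sinceCheckpoint: Long): List<String> =
        // Download whatever the leader has uploaded beyond our checkpoint.
        remoteRepo(leaderIndexPath)
}
```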
Logical replication and fetch from translogs:
After the move to segment-based replication, we'll keep supporting logical replication until existing customers migrate to Segment Replication. Even after the migration, we might want to keep support for logical replication intact while Segment Replication becomes the default choice. Logical replication can be helpful in case we want to explore active-active replication in the future. Similarly, doc-level filtering in replication can be supported via logical replication, if required.
For logical replication, the ops from the leader shard can be fetched from either Lucene or translogs. CCR relies on translogs, as we've observed an 8-10% CPU impact on the leader while fetching changes from Lucene. This is because reading operations from Lucene requires decompression (refer here and here).
We'll rely on Pluggable Translogs for fetching the operations (#375). This should continue to work as expected with support for translogs in the remote store as well. One deviation: currently we fetch operations from the primary as well as the replicas of the leader shard, to load-balance the requests to the leader cluster. This might not work with remote store, where only the primary shard would be able to fetch the translogs. We either need to build support for fetching translogs from both primary and replicas, or, if that isn't possible with remote translogs, add handling to always route these fetches to the primary shard.
In case the user has opted out of durability, we can either fall back to Lucene or decide not to support logical replication at all in that case.
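Putting the routing and fallback constraints from the last two paragraphs together, here is a hedged sketch of the decision logic (all names are illustrative):

```kotlin
// Hypothetical decision logic combining the constraints above: with remote
// translogs only the primary can serve changes; with durability disabled
// there is no translog to read, so we either fall back to Lucene or reject
// logical replication outright.
enum class ChangesSource { TRANSLOG, LUCENE, UNSUPPORTED }

data class LeaderShardConfig(
    val remoteTranslogEnabled: Boolean,
    val durabilityEnabled: Boolean,
)

fun planLogicalFetch(config: LeaderShardConfig): Pair<ChangesSource, Boolean> {
    // Second element: whether replicas may also serve the fetch (load balancing).
    return when {
        !config.durabilityEnabled ->
            // No translog retained: Lucene fallback (with its ~8-10% CPU cost
            // on the leader), or no logical replication at all.
            Pair(ChangesSource.LUCENE, true)
        config.remoteTranslogEnabled ->
            // Remote translog: only the primary can fetch, so pin to primary.
            Pair(ChangesSource.TRANSLOG, false)
        else ->
            // Local translogs on primary and replicas: spread fetches for load balancing.
            Pair(ChangesSource.TRANSLOG, true)
    }
}
```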