Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for transactions spanning multiple shards #2

Open
bdeggleston opened this issue Apr 3, 2014 · 0 comments
Open

Add support for transactions spanning multiple shards #2

bdeggleston opened this issue Apr 3, 2014 · 0 comments

Comments

@bdeggleston
Copy link
Owner

Queries with multiple keys (touching replicas that do not have overlapping token ranges), or queries that touch every replica (system changes, etc) will need to be serialized.

Multi-replica instances aren't difficult to commit, since once the coordinator gets dependencies from the relevant nodes, it can run a commit. The difficult part is executing the instance, since it will depend on instances that it doesn't know about, and won't, without some complicated and network heavy 'get instance' procedure.

Since the replica won't be executing instances that fall outside of it's token range, it doesn't need to worry about them. Subsequent interfering instances will gain dependencies on the multi-shard instances, serializing instances on each shard. However, the multi-shard instance needs to know that it doesn't need to worry about instances that are controlled by other shards, since they could be instances it hasn't seen yet.

The best solution I've been able to come up with is to put the hashed key(s) in the instance id. So a replica could quickly determine if it needs to wait for an instance to be committed/executed before executing the multi-shard instance.

One thing to keep in mind about multi-key instances is that a basic quorum won't work, since it could easily skip over entire shard replicas. Given a replication factor of 3, an instance with 3 keys, each living on different replicas (9 total), would only require 5 responses. The 4 instances it ignores, could be the replicas for an entire key. The solution to this is to require a quorum of responses for each set of replicas for each key.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant