FederatedGraphStore double-caching causes desync issues in replicated deployment #2457
Comments
…ion with FederatedGraphStorage and further testing.
Examine commit 1324d92
Thanks @GCHQDev404. For clarity, instead of "Shared Caches" I should probably have said something like "CacheLoaders with a common persistent store". This is required whenever you have multiple instances of Gaffer running over the same data for resiliency or scalability. For example, imagine at least two instances of Gaffer behind a load balancer, running over the same data. A user creates a graph on instance A, and then another user queries for the list of all graphs on instance B. We want the user querying on B to be aware of the graph created on A, perhaps with some acceptable level of lag.
I'm still not 100% sure this is something Gaffer actually claims to support right now, although it seems like a reasonable requirement. That might make this a feature request rather than a bugfix, and hence affect priority.
Re: the included commit. Thanks very much for this, it's the kind of thing I was thinking of. One thought: if a graph has been changed, rather than added or deleted (or a graph has been deleted and then another with the same name added), will this cause OverWritingExceptions from the checkExisting and validateExisting methods? Sorry for not giving you a unit test for this; I'm writing this between meetings but will try to drop one in later today if it's still relevant.
(ETA clarification on "Shared Caches")
…hronisation with FederatedGraphStorage and further testing." This reverts commit 1324d92
@sw96411 Please examine the latest branch: https://github.com/gchq/Gaffer/tree/gh-2457-double-caching-issue
Just leaving a quick note based on our offline conversations to say that, following an E2E test, this code in itself looks good, but performance issues arise, at least partly due to #2478. I think we suggested minimising how often we call .getGraph() on GraphSerialisables in this code as part of the solution, but it's the E2E performance that matters rather than any particular solution.
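One way to read the suggestion about calling .getGraph() less often is to keep cheap metadata (such as the graph ID) readable without building the graph, and to build the graph at most once when it is actually needed. The sketch below is only an illustration of that idea: `LazyGraphHandle`, its `builder` supplier and `getBuildCount()` are hypothetical names, not part of Gaffer's GraphSerialisable API.

```java
import java.util.Objects;
import java.util.function.Supplier;

/** Hypothetical holder (not Gaffer's GraphSerialisable): builds the graph lazily, at most once. */
class LazyGraphHandle<G> {
    private final String graphId;      // cheap metadata, available without building the graph
    private final Supplier<G> builder; // stands in for the expensive deserialise-and-build step
    private G graph;                   // memoised result of the first build
    private int buildCount;            // exposed only so the sketch can show how often we built

    LazyGraphHandle(String graphId, Supplier<G> builder) {
        this.graphId = Objects.requireNonNull(graphId);
        this.builder = Objects.requireNonNull(builder);
    }

    String getGraphId() {
        return graphId; // never triggers a build
    }

    synchronized G getGraph() {
        if (graph == null) {
            graph = Objects.requireNonNull(builder.get(), "builder returned null");
            buildCount++;
        }
        return graph;
    }

    synchronized int getBuildCount() {
        return buildCount;
    }
}

class LazyGraphHandleDemo {
    public static void main(String[] args) {
        LazyGraphHandle<String> handle = new LazyGraphHandle<>("graphA", () -> "built:graphA");
        System.out.println(handle.getGraphId());    // listing graph IDs does not build anything
        System.out.println(handle.getBuildCount());
        handle.getGraph();
        handle.getGraph();                          // second call reuses the memoised graph
        System.out.println(handle.getBuildCount());
    }
}
```

Operations that only need IDs or access metadata can then avoid the build entirely, which is where most of the saving would be expected to come from.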
The branch has been refactored, and the JobTracker cache has been changed to accept a naming suffix in the same way as FederatedStorageCache. Please examine https://github.com/gchq/Gaffer/compare/gh-2457-double-caching-issue
This contains many breaking changes and will need to be reviewed before merging into V1 Gaffer.
Just a quick note to say that the code looks good and I've had a quick test against my team's dev environment. Although I wasn't able to do a full E2E test due to dependency issues, I was able to validate that the performance issues mentioned above no longer occur. I think this code is ready to enter the formal merge/test pipeline at the author's convenience.
P.S. Thanks very much @GCHQDev404 for the excellent work on this issue :-)
…nto gh-2457-double-caching-issue-merge-alpha1 # Conflicts: # store-implementation/federated-store/src/main/java/uk/gov/gchq/gaffer/federatedstore/FederatedGraphStorage.java # store-implementation/federated-store/src/test/java/uk/gov/gchq/gaffer/federatedstore/FederatedGraphStorageTest.java # store-implementation/federated-store/src/test/java/uk/gov/gchq/gaffer/federatedstore/FederatedGraphStorageTraitsTest.java # store-implementation/federated-store/src/test/java/uk/gov/gchq/gaffer/federatedstore/FederatedStoreTest.java # store-implementation/federated-store/src/test/java/uk/gov/gchq/gaffer/federatedstore/FederatedStoreToFederatedStoreTest.java # store-implementation/federated-store/src/test/java/uk/gov/gchq/gaffer/federatedstore/integration/FederatedAdminIT.java # store-implementation/federated-store/src/test/java/uk/gov/gchq/gaffer/federatedstore/operation/handler/impl/FederatedAddGraphWithHooksHandlerTest.java # store-implementation/federated-store/src/test/java/uk/gov/gchq/gaffer/federatedstore/operation/handler/impl/FederatedAggregateHandlerTest.java
…' into gh-2457-double-caching-issue-merge-alpha3 # Conflicts: # core/graph/src/main/java/uk/gov/gchq/gaffer/graph/Graph.java # core/graph/src/main/java/uk/gov/gchq/gaffer/graph/GraphSerialisable.java # core/graph/src/test/java/uk/gov/gchq/gaffer/graph/GraphSerialisableTest.java # store-implementation/federated-store/src/main/java/uk/gov/gchq/gaffer/federatedstore/FederatedGraphStorage.java # store-implementation/federated-store/src/main/java/uk/gov/gchq/gaffer/federatedstore/FederatedStore.java # store-implementation/federated-store/src/main/java/uk/gov/gchq/gaffer/federatedstore/operation/handler/FederatedAddGraphHandlerParent.java # store-implementation/federated-store/src/test/java/uk/gov/gchq/gaffer/federatedstore/FederatedGraphStorageTest.java # store-implementation/federated-store/src/test/java/uk/gov/gchq/gaffer/federatedstore/FederatedStoreAuthTest.java # store-implementation/federated-store/src/test/java/uk/gov/gchq/gaffer/federatedstore/FederatedStoreCacheTest.java # store-implementation/federated-store/src/test/java/uk/gov/gchq/gaffer/federatedstore/FederatedStoreGetTraitsTest.java # store-implementation/federated-store/src/test/java/uk/gov/gchq/gaffer/federatedstore/FederatedStoreGraphVisibilityTest.java # store-implementation/federated-store/src/test/java/uk/gov/gchq/gaffer/federatedstore/FederatedStorePublicAccessTest.java # store-implementation/federated-store/src/test/java/uk/gov/gchq/gaffer/federatedstore/FederatedStoreTest.java # store-implementation/federated-store/src/test/java/uk/gov/gchq/gaffer/federatedstore/FederatedStoreToFederatedStoreTest.java # store-implementation/federated-store/src/test/java/uk/gov/gchq/gaffer/federatedstore/FederatedStoreWrongGraphIDsTest.java # store-implementation/federated-store/src/test/java/uk/gov/gchq/gaffer/federatedstore/integration/FederatedAdminIT.java # store-implementation/federated-store/src/test/java/uk/gov/gchq/gaffer/federatedstore/operation/handler/impl/FederatedAddGraphHandlerTest.java # 
store-implementation/federated-store/src/test/java/uk/gov/gchq/gaffer/federatedstore/operation/handler/impl/FederatedAddGraphWithHooksHandlerTest.java # store-implementation/federated-store/src/test/java/uk/gov/gchq/gaffer/federatedstore/operation/handler/impl/FederatedAggregateHandlerTest.java
…ha' into gh-2457-double-caching-issue-merge-alpha3 # Conflicts: # store-implementation/federated-store/src/main/java/uk/gov/gchq/gaffer/federatedstore/FederatedGraphStorage.java # store-implementation/federated-store/src/main/java/uk/gov/gchq/gaffer/federatedstore/FederatedStore.java # store-implementation/federated-store/src/main/java/uk/gov/gchq/gaffer/federatedstore/operation/FederatedOperationChainValidator.java # store-implementation/federated-store/src/main/java/uk/gov/gchq/gaffer/federatedstore/operation/handler/impl/FederatedOperationHandler.java # store-implementation/federated-store/src/test/java/uk/gov/gchq/gaffer/federatedstore/FederatedGraphStorageTest.java # store-implementation/federated-store/src/test/java/uk/gov/gchq/gaffer/federatedstore/FederatedStoreDefaultGraphsTest.java # store-implementation/federated-store/src/test/java/uk/gov/gchq/gaffer/federatedstore/FederatedStoreTest.java # store-implementation/federated-store/src/test/java/uk/gov/gchq/gaffer/federatedstore/integration/FederatedAdminIT.java # store-implementation/federated-store/src/test/java/uk/gov/gchq/gaffer/federatedstore/integration/FederatedStoreRecursionIT.java # store-implementation/federated-store/src/test/java/uk/gov/gchq/gaffer/federatedstore/operation/handler/FederatedOperationHandlerTest.java # store-implementation/federated-store/src/test/java/uk/gov/gchq/gaffer/federatedstore/operation/handler/impl/FederatedAddGraphHandlerTest.java # store-implementation/federated-store/src/test/java/uk/gov/gchq/gaffer/federatedstore/operation/handler/impl/FederatedAddGraphWithHooksHandlerTest.java
…ha' into gh-2457-double-caching-issue
…hanging GraphSerialisable causes backwards compatability issues.
Migration: the old cache will not work, because the name of the cache will no longer be found or used.
* gh-2457-double-caching-issue weak initial step, requires synchronisation with FederatedGraphStorage and further testing.
* gh-2457-double-caching-issue remove FederatedGraphStorage local map, using cache only.
* gh-2457-double-caching-issue remove FederatedGraphStorage test fixes
* gh-2457-double-caching-issue remove FederatedGraphStorage review.
* gh-2457-double-caching-removing-graphstorage minimising use of GraphSerialisable.getGraph()
* gh-2457-double-caching-removing-graphstorage gh-2478 JobTracker cache can have Suffix name.
* gh-2457 double caching issue fix for persisting graph names in tests.
* Merge remote-tracking branch 'origin/v2-alpha' into gh-2357-federatedstore-federated-operation-merge-alpha2 !!!With 1 failing class of Tests!!!
* gh-2457 GraphSerialisable not being able to Mock has failing tests. changing GraphSerialisable causes backwards compatability issues.
* gh-2457 Fixed GraphSerialisable equals.
* gh-2457 checkstyle
* gh-2457 PR requests.
* gh-2457 PR requests.
* gh-2457 PR requests.
Closed by #2595
…re-RemoveGraphAndDeleteAccumuloTable # Conflicts: # store-implementation/federated-store/src/main/java/uk/gov/gchq/gaffer/federatedstore/FederatedGraphStorage.java
…re-federated-operation-merge-mapping # Conflicts: # store-implementation/federated-store/src/main/java/uk/gov/gchq/gaffer/federatedstore/FederatedGraphStorage.java # store-implementation/federated-store/src/main/java/uk/gov/gchq/gaffer/federatedstore/FederatedStore.java # store-implementation/federated-store/src/main/java/uk/gov/gchq/gaffer/federatedstore/FederatedStoreProperties.java # store-implementation/federated-store/src/main/java/uk/gov/gchq/gaffer/federatedstore/operation/handler/impl/FederatedOperationHandler.java # store-implementation/federated-store/src/test/java/uk/gov/gchq/gaffer/federatedstore/FederatedStoreDefaultGraphsTest.java # store-implementation/federated-store/src/test/java/uk/gov/gchq/gaffer/federatedstore/FederatedStoreTest.java # store-implementation/federated-store/src/test/java/uk/gov/gchq/gaffer/federatedstore/operation/handler/FederatedOperationHandlerTest.java
FederatedGraphStorage maintains both a uk.gov.gchq.gaffer.cache.Cache of the various Graphs managed by a federated store and its own private Map<FederatedAccess, Set> named 'storage' (the latter storing deserialised objects).
There is no mechanism to ensure that the Map reflects any changes in the Cache. A desync could arise, for instance, in a deployment where the customer is running multiple instances of Gaffer and using a shared CacheLoader implementation to share data between them. This is demonstrated (mostly, sort of) by the unit test in sw96411@15a2cdd
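To make the failure mode concrete, here is a minimal, self-contained sketch of the double-caching pattern. It does not use any Gaffer classes: `DoubleCachedStorage` is a hypothetical stand-in for FederatedGraphStorage, the shared `Map` stands in for a Cache backed by a common persistent store, and the local `Set` stands in for the private 'storage' Map.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for one instance's graph storage; not the real FederatedGraphStorage. */
class DoubleCachedStorage {
    private final Map<String, String> sharedCache;             // assumed: cache shared by all instances
    private final Set<String> localGraphIds = new HashSet<>(); // assumed: per-instance private copy

    DoubleCachedStorage(Map<String, String> sharedCache) {
        this.sharedCache = sharedCache;
        localGraphIds.addAll(sharedCache.keySet()); // populated once at start-up, never refreshed
    }

    void addGraph(String graphId, String serialisedGraph) {
        sharedCache.put(graphId, serialisedGraph); // visible to every instance
        localGraphIds.add(graphId);                // visible to this instance only
    }

    Set<String> getAllGraphIds() {
        return new HashSet<>(localGraphIds); // reads the private copy, not the shared cache
    }
}

class DesyncDemo {
    public static void main(String[] args) {
        Map<String, String> shared = new ConcurrentHashMap<>();
        DoubleCachedStorage instanceA = new DoubleCachedStorage(shared);
        DoubleCachedStorage instanceB = new DoubleCachedStorage(shared);

        instanceA.addGraph("graphA", "serialised-graph");

        System.out.println(instanceA.getAllGraphIds()); // contains graphA
        System.out.println(instanceB.getAllGraphIds()); // empty: B never learns about graphA
        System.out.println(shared.keySet());            // the shared cache does contain graphA
    }
}
```

Instance B keeps answering from its private copy, so the graph created on A stays invisible to B's users even though the shared cache holds it.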
The trivial solution is simply to abandon the use of the private Map and always deserialise the objects from the Cache on demand. I'm reluctant to suggest this, though, as it would greatly increase the cost of calling any of the methods on FederatedStore.
Other options available include refreshing the Map from the Cache at a maximum frequency (even once per second would in theory save a lot of deserialisation), writing a serial number to the cache on modification and then testing this serial number on read, or creating a specialisation of Cache which can record the last time the Cache (or part thereof) was modified and then delegating the handling of this issue off to that somehow.
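As a rough illustration of the serial-number option, the sketch below bumps a shared version counter on every write and rebuilds the local deserialised view only when the counter has moved. All names here are hypothetical: `VersionedLocalView` is not a Gaffer class, the shared `Map` stands in for the Cache, and the `AtomicLong` stands in for a serial number that a real deployment would have to store in the shared cache itself.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Function;

/** Hypothetical sketch: a local deserialised view, refreshed only when the shared version changes. */
class VersionedLocalView<V> {
    private final Map<String, String> sharedCache;  // assumed: serialised graphs, shared by all instances
    private final AtomicLong sharedVersion;         // assumed: serial number kept alongside the cache
    private final Function<String, V> deserialiser; // the expensive step we want to avoid repeating
    private final Map<String, V> local = new HashMap<>();
    private long seenVersion = -1L;                 // forces a refresh on first read

    VersionedLocalView(Map<String, String> sharedCache, AtomicLong sharedVersion,
                       Function<String, V> deserialiser) {
        this.sharedCache = sharedCache;
        this.sharedVersion = sharedVersion;
        this.deserialiser = deserialiser;
    }

    void put(String graphId, String serialised) {
        sharedCache.put(graphId, serialised); // write the data first...
        sharedVersion.incrementAndGet();      // ...then publish the change
    }

    void remove(String graphId) {
        sharedCache.remove(graphId);
        sharedVersion.incrementAndGet();
    }

    synchronized Map<String, V> view() {
        long current = sharedVersion.get();
        if (current != seenVersion) {
            // Rebuild wholesale, so changed or re-created entries replace stale ones.
            local.clear();
            for (Map.Entry<String, String> e : sharedCache.entrySet()) {
                local.put(e.getKey(), deserialiser.apply(e.getValue()));
            }
            seenVersion = current;
        }
        return new HashMap<>(local);
    }
}

class VersionedLocalViewDemo {
    public static void main(String[] args) {
        Map<String, String> shared = new ConcurrentHashMap<>();
        AtomicLong version = new AtomicLong();
        VersionedLocalView<String> a = new VersionedLocalView<>(shared, version, String::toUpperCase);
        VersionedLocalView<String> b = new VersionedLocalView<>(shared, version, String::toUpperCase);
        a.put("graphA", "serialised");
        System.out.println(b.view()); // B picks up the graph created on A
    }
}
```

Reads that find an unchanged serial number cost one counter lookup and no deserialisation. Because the view is rebuilt rather than merged, a graph that was changed, or deleted and re-added under the same name, simply replaces the stale entry; a finer-grained design could track per-entry versions instead.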
I'd be happy to PR in a solution based on any of the above suggestions, but before I do, I wanted to check with the collective:
Tagging in @GCHQDev404, as I see they have made commits to related issues, and @m29827, as they were the last editor on FederatedGraphStorage.
Grateful for thoughts from any of the community, of course!