Disaster recovery scenario 3 #153
Conversation
tests/TestHarness/Node.py
Outdated
@@ -558,6 +563,11 @@ def removeReversibleBlks(self):
        reversibleBlks = os.path.join(dataDir, "blocks", "reversible")
        shutil.rmtree(reversibleBlks, ignore_errors=True)

    def removeFinalizersSafetyFile(self):
Probably this should be named removeFinalizersSafetyDir.
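For context, a minimal sketch of what the renamed helper could look like, mirroring the shutil.rmtree pattern of removeReversibleBlks above. The standalone signature and the "finalizers" subdirectory name are assumptions for illustration, not the harness's actual layout:

import os
import shutil

def removeFinalizersSafetyDir(data_dir: str) -> None:
    # Delete the node's finalizer safety data so it forgets its vote lock.
    # Assumption: the safety data lives under a "finalizers" subdirectory of
    # the node's data dir; the real method should resolve the path the same
    # way removeReversibleBlks does.
    finalizers_dir = os.path.join(data_dir, "finalizers")
    shutil.rmtree(finalizers_dir, ignore_errors=True)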
tests/disaster_recovery_3.py
Outdated
###############################################################
# disaster_recovery - Scenario 3
#
# Create integration test with 4 nodes (A, B, C, and D) which each have their own producer and finalizer. The finalizer
Should "each have their" be "each has its"?
I rewrote the paragraph.
for node in [node0, node1, node2, node3]:
    assert not node.waitForLibToAdvance(), "Node advanced LIB after relaunch when it should not"

Print("Shutdown all nodes to remove finalizer safety data")
You don't remove finalizer safety data here.
We do on line 112; that is why we are shutting down.
for node in [node0, node1, node2, node3]:
    assert not node.verifyAlive(), "Node did not shutdown"

for node in [node0, node1, node2, node3]:
Print the "remove finalizer safety data" message here instead.
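A sketch of the reordering being suggested, so the message sits next to the removal it describes. The shutdown call is an assumption; the other names come from the snippets in this thread:

# Shut down all nodes first (exact shutdown call is an assumption).
for node in [node0, node1, node2, node3]:
    node.kill(signal.SIGTERM)

for node in [node0, node1, node2, node3]:
    assert not node.verifyAlive(), "Node did not shutdown"

# Announce the removal right where it happens.
Print("Remove finalizer safety data")
for node in [node0, node1, node2, node3]:
    node.removeFinalizersSafetyFile()  # helper added in the Node.py diff above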
tests/disaster_recovery_3.py
Outdated
# Create integration test with 4 nodes (A, B, C, and D) which each have their own producer and finalizer. The finalizer
# policy consists of the four finalizers with a threshold of 3. The proposer policy involves all four proposers.
#
# - At least two of the four nodes should have a LIB N and a finalizer safety information file that locks on a block
Probably you should say something like "make a condition such that at least two of the four nodes should have ..."
…lso update the wait on node2 and node3 to be on N.
Scenario 3
Create an integration test with 4 nodes (A, B, C, and D), each of which has its own producer and finalizer. The finalizer policy consists of the four finalizers with a threshold of 3. The proposer policy involves all four proposers.
All nodes are shut down. The reversible blocks on all nodes are deleted. Then all nodes are restarted from an earlier snapshot.
All nodes eventually sync up to block N. Some nodes will consider block N to be LIB, but others may not.
Not enough finalizers should be voting because of the lock in their finalizer safety information file. Verify that LIB does not advance on any node.
Cleanly shut down all nodes and delete their finalizer safety information files. Then restart the nodes.
Verify that LIB advances on all nodes and that they all agree on the LIB. In particular, verify that block N has the same ID on all nodes as it did before the nodes were first shut down.
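A rough skeleton of that flow, assuming the harness helpers named in this thread (waitForLibToAdvance, verifyAlive, removeReversibleBlks, removeFinalizersSafetyFile) behave as their names suggest; the shutdown and relaunch calls, and the snapshot argument, are assumptions:

nodes = [node0, node1, node2, node3]

# Shut down, wipe reversible blocks, and restart from an earlier snapshot.
for node in nodes:
    node.kill(signal.SIGTERM)          # assumed shutdown call
for node in nodes:
    node.removeReversibleBlks()
for node in nodes:
    node.relaunch(chainArg="--snapshot ...")  # assumed relaunch-from-snapshot call

# With the safety files still locked, LIB must not advance on any node.
for node in nodes:
    assert not node.waitForLibToAdvance(), "LIB advanced while finalizers were locked"

# Remove the safety data, restart, and LIB should advance again.
for node in nodes:
    node.kill(signal.SIGTERM)
for node in nodes:
    node.removeFinalizersSafetyFile()
for node in nodes:
    node.relaunch()
for node in nodes:
    assert node.waitForLibToAdvance(), "LIB did not advance after safety data removal"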
Resolves #13