Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: Capability Issue on Restart, Backport to v0.42 #9835
fix: Capability Issue on Restart, Backport to v0.42 #9835
Changes from 6 commits
8570d1b
3d8755f
3fefc72
aee978e
58ad66f
5cd6eba
acc659d
acdbca4
041ccc0
346eb0c
cd92c30
61365bd
1dbb174
e8f1885
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will be used if node is restarting without genesis restart (ie. regular restart, upgrade with upgrade module, statesync)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BeginBlocker
will be called on genesis I think (i.e. the 1st block).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes but the initialization logic won't get rerun because the initialization flag in memory is already set
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to confirm, state sync nodes will only run
NewApp
and not InitGenesis or BeginBlocker (before state sync)?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is my understanding, please confirm @alexanderbez
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct, state-synced nodes will NOT run
InitGenesis
and will not runBeginBlock
until the first block after the state-sync height.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will be used if node is restarted from genesis (either a new chain or genesis restart)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so
InitializeAndSeal
no longer initializes any capabilities, it just seals the keeper and ensures the mem store is a mem store. Should the godoc be updated?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct and yes, I break this API on the breaking-changes pr
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function doesn't do a full initialization - it doesn't call
InitMemStore
. Maybe let's make a note, that this function should be called afterInitMemStore
. So we should update the comments of this function andInitMemStore
function. And in 0.43 we can maybe rework it to make it more clear - by renaming this function tok.Seal(ctx)
BTW: we don't need to check the store type - this should be done in
InitMemStore
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see that you are sealing in the
InitMemStore
. So we should update the comments of this function andInitMemStore
. And in0.43
we can maybe rework it to make it more clear.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, this is a breaking change of the state machine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No this shouldn't be breaking the state machine.
InitializeCapability
only writes to the memstores and in-memory go map.The memory store does not get committed along with the rest of the app state. It is in-memory local storage for each node. So changing things inside of it does not break the state machine.
https://github.com/cosmos/cosmos-sdk/blob/master/store/mem/store.go
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know. Breaking state machine doesn't necessary mean to break the written state. To illustrate this issue:
Node A will have all capabilities initialized, Node B will not have.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No this PR just changes where/when the capabilities are initialized.
In 0.42.8, the capabilities are initialized in app start
NewApp
, and then reinitialized on the first transaction that callsGetCapability
.In 0.42.9, the capabilities are initialized at the first abci method after startup (either
InitChain
orBeginBlock
).In either case, the capabilities are initialized before they are requested in a transaction, which is what is necessary for keeping nodes in sync.
There is a bug in 0.42.7 and 0.42.8, so there are situations where its state becomes out of sync with a node from 0.42.6.
However, comparing a node on 0.42.6 to a node on 0.42.9. They may initialize the in-memory capabilities in different places; however both will be fully initialized by the time the capability module gets used and thus will authenticate and reject the same transactions and be perfectly in sync.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The behavior of
InitializeCapability
is different - so an app will need to make more changes in the code than bumping a version. IMHO this is a breaking change - we signal to the app that bumping a version is not enough.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes just bumping the version will not be enough. You will also have to make changes to
app.go
. But it should be clear that you can run gaia with0.42.6
and0.42.9
on the same network and there will be no state divergenceThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shall we use a keeper variable to cache the state? Now we are doing an extra state hit all the time. Once the response is true, we can cache it in the Keeper private variable (eg
k.initialized
).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, but note this is a read to an in-memory store, so it's not nearly as expensive as an I/O read. Maybe caching with private variable will be a bit faster so I can do this if desired
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
right, I made this automatic by looking at the code. I think it's "nice to have"