-
Notifications
You must be signed in to change notification settings - Fork 504
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ingest/ledgerbackend: Differentiate between isPrepared and isClosed in captive core #4088
Conversation
68d6e53
to
d3b858f
Compare
return xdr.LedgerCloseMeta{}, errors.New("session is closed") | ||
} | ||
|
||
if c.prepared == nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if the right thing to do here is to use c.prepared
or c.isPrepared(BoundedRange(sequence, sequence))
.
It seems like the range checks follow this line would yield a more accurate error (eg. requested ledger ## is behind captive core stream
vs session not prepared
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that error detail if available, sounds valuable when reading the logs to get that narrow(er) context of the issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should use isPrepared
. Using c.prepared
alone is not enough because it just means the backend is prepared for any range. As an example, let's say backend was prepared for bounded range 100-200 but someone is requesting 201.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, we can't use isPrepared()
here, because this is called from PrepareRange
, so the range may not be prepared yet. A better approach would be to have a separate awaitRange()
function that PrepareRange
could use, but that's a much bigger change here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What Paul said. I'm leaving this check as c.prepared
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work, I'm probably not qualified to approve yet as still pretty new on the uptake of code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the most important thing we need to fix as explained in #3705 (and also a comment for CaptiveStellarCore.cancel
) is that once Close()
is called is unusable but we don't enforce it leaving users with cryptic errors and/or not working backend.
We need to distinguish 3 states of the backend:
NOT PREPARED
- initial state, user need to callPrepareRange
to do anything.PREPARING/PREPARED
- this is de facto the same state because the backend is not thread-safe and is blocking when callingPrepareRange
orGetLedger
.CLOSED
- backend closed.
return xdr.LedgerCloseMeta{}, errors.New("session is closed") | ||
} | ||
|
||
if c.prepared == nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should use isPrepared
. Using c.prepared
alone is not enough because it just means the backend is prepared for any range. As an example, let's say backend was prepared for bounded range 100-200 but someone is requesting 201.
@@ -600,11 +608,11 @@ func (c *CaptiveStellarCore) GetLatestLedgerSequence(ctx context.Context) (uint3 | |||
} | |||
|
|||
func (c *CaptiveStellarCore) isClosed() bool { | |||
return c.prepared == nil || c.stellarCoreRunner == nil || c.stellarCoreRunner.context().Err() != nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should check c.closed
here only and because isClosed
is only used internally we can remove it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But it might not be intentionally closed. c.closed
only gets set when we call Close()
, but it could also be closed due to a timeout or some other issue with the c.stellarCoreRunner.context()
, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exactly and this is confusing. closed
means that user called Close()
and CaptiveStellarCore
can no longer be used. However in case of stellarCoreRunner
issues, crashes, etc. it just isn't prepared and can be started again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on this discussion, it sounds like you expect the errored state to be the same as the initial NOT PREPARED
state. This is counter to the behavior in a couple of the tests that check for the core to be closed after these two scenarios:
- the core gets an error getting a ledger
- after the last ledger has been consumed.
I've pulled out the second half of this check for a core error, and prompt the user to call PrepareRange
when there is one. I've also updated the tests to reflect that we now expect stellar-core
to not be closed in these scenarios.
538dc27
to
547de99
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Just one comment: we don't need coreHasError
helper.
…lled (#4192) After refactoring in #4088 (and as a result of my wrong comment: #4088#discussion_r764394296) the `CaptiveCoreBackend.isPrepared` method returned `true` if `stellarCoreRunner` process was shutdown without calling `close()` - so in case of binary update but also in case of Stellar-Core crash. This commit fixes this bug by checking if `stellarCoreRunner` context was cancelled (meaning Stellar-Core is closed or closing but not as a result of `close` call). I also removed `isClose` method because it was simply checking `close` variable. All test changes are only adding an extra `context()` call mocks because `isPrepared` now calls it.
PR Checklist
PR Structure
otherwise).
services/friendbot
, orall
ordoc
if the changes are broad or impact manypackages.
Thoroughness
.md
files, etc... affected by this change). Take a look in the
docs
folder for a given service,like this one.
Release planning
needed with deprecations, added features, breaking changes, and DB schema changes.
semver, or if it's mainly a patch change. The PR is targeted at the next
release branch if it's not a patch change.
What
CaptiveCoreBackend
should be closed when the last ledger the core is prepared for has been consumed, or if it is explicitly closed. After the core is closed, andGetLedger
is called on it, it returns the errorsession is closed, call PrepareRange first
.Why
The caller should not be able to prepare a core after it is closed.
Known limitations
N/A