ingest/ledgerbackend: Differentiate between isPrepared and isClosed in captive core #4088

erika-sdf · 2021-11-18T01:37:04Z

PR Checklist

PR Structure

This PR has reasonably narrow scope (if not, break it down into smaller PRs).
This PR avoids mixing refactoring changes with feature changes (split into two PRs
otherwise).
This PR's title starts with name of package that is most changed in the PR, ex.
services/friendbot, or all or doc if the changes are broad or impact many
packages.

Thoroughness

This PR adds tests for the most critical parts of the new functionality or fixes.
I've updated any docs (developer docs, .md
files, etc... affected by this change). Take a look in the docs folder for a given service,
like this one.

Release planning

I've updated the relevant CHANGELOG (here for Horizon) if
needed with deprecations, added features, breaking changes, and DB schema changes.
I've decided if this PR requires a new major/minor version according to
semver, or if it's mainly a patch change. The PR is targeted at the next
release branch if it's not a patch change.

What

CaptiveCoreBackend should be closed when the last ledger the core is prepared for has been consumed, or when it is explicitly closed. If GetLedger is called after the core is closed, it returns the error `session is closed, call PrepareRange first`.
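
As a rough illustration of the behavior described above, here is a minimal sketch. The `sketchBackend` type and its fields are hypothetical and are not the real `CaptiveStellarCore` implementation; only the error string is taken from the description.

```go
package main

import (
	"errors"
	"fmt"
)

// sketchBackend is a hypothetical stand-in for the captive core backend.
type sketchBackend struct {
	open     bool
	from, to uint32 // inclusive bounded range (assumption for this sketch)
}

// PrepareRange opens a session for the given bounded range.
func (b *sketchBackend) PrepareRange(from, to uint32) {
	b.open, b.from, b.to = true, from, to
}

// Close ends the session explicitly.
func (b *sketchBackend) Close() { b.open = false }

// GetLedger returns the requested sequence, and closes the session once
// the last ledger of the prepared range has been consumed.
func (b *sketchBackend) GetLedger(seq uint32) (uint32, error) {
	if !b.open {
		return 0, errors.New("session is closed, call PrepareRange first")
	}
	if seq < b.from || seq > b.to {
		return 0, fmt.Errorf("ledger %d is outside the prepared range", seq)
	}
	if seq == b.to {
		b.open = false // last prepared ledger consumed
	}
	return seq, nil
}

func main() {
	b := &sketchBackend{}
	b.PrepareRange(10, 11)
	for _, seq := range []uint32{10, 11, 11} {
		if _, err := b.GetLedger(seq); err != nil {
			fmt.Println("error:", err)
		}
	}
}
```

In this sketch the third call fails because the second one consumed the last prepared ledger.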

Why

The caller should not be able to prepare a core after it is closed.

Known limitations

N/A

erika-sdf · 2021-11-18T18:58:23Z

ingest/ledgerbackend/captive_core_backend.go

+		return xdr.LedgerCloseMeta{}, errors.New("session is closed")
+	}
+
+	if c.prepared == nil {


I'm not sure if the right thing to do here is to use c.prepared or c.isPrepared(BoundedRange(sequence, sequence)).

It seems like the range checks following this line would yield a more accurate error (e.g. requested ledger ## is behind captive core stream vs. session not prepared).

That error detail, if available, sounds valuable when reading the logs to get that narrower context of the issue.

You should use isPrepared. Using c.prepared alone is not enough because it only means the backend is prepared for some range. For example, say the backend was prepared for the bounded range 100-200 but someone requests ledger 201.

Ah, we can't use isPrepared() here, because this is called from PrepareRange, so the range may not be prepared yet. A better approach would be to have a separate awaitRange() function that PrepareRange could use, but that's a much bigger change here.

What Paul said. I'm leaving this check as c.prepared
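
The difference between the two checks discussed in this thread can be sketched as follows. The `ledgerRange` and `backend` types and the `hasAnyRange` / `isPreparedFor` helpers are hypothetical names for illustration only; they are not the signatures used in `captive_core_backend.go`.

```go
package main

import "fmt"

// ledgerRange is a hypothetical inclusive bounded range.
type ledgerRange struct{ from, to uint32 }

// contains reports whether other lies entirely inside r.
func (r ledgerRange) contains(other ledgerRange) bool {
	return other.from >= r.from && other.to <= r.to
}

// backend is a hypothetical stand-in holding the prepared range, if any.
type backend struct{ prepared *ledgerRange }

// hasAnyRange mirrors a bare `c.prepared != nil` check.
func (b *backend) hasAnyRange() bool { return b.prepared != nil }

// isPreparedFor mirrors a range-aware check.
func (b *backend) isPreparedFor(r ledgerRange) bool {
	return b.prepared != nil && b.prepared.contains(r)
}

func main() {
	b := &backend{prepared: &ledgerRange{from: 100, to: 200}}
	req := ledgerRange{from: 201, to: 201}
	fmt.Println(b.hasAnyRange(), b.isPreparedFor(req)) // prints "true false"
}
```

The bare nil check accepts the request for ledger 201, while the range-aware check rejects it. The trade-off raised in the thread still applies: a range-aware check cannot be used on a code path that runs before the range has finished preparing.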

ingest/ledgerbackend/captive_core_backend.go

ingest/ledgerbackend/captive_core_backend_test.go

sreuland

Nice work. I'm probably not qualified to approve yet, as I'm still pretty new to the code.

ingest/CHANGELOG.md

bartekn

I think the most important thing we need to fix, as explained in #3705 (and also in a comment for CaptiveStellarCore.cancel), is that once Close() is called the backend is unusable, but we don't enforce it, leaving users with cryptic errors and/or a non-working backend.

We need to distinguish 3 states of the backend:

NOT PREPARED - initial state; the user needs to call PrepareRange to do anything.
PREPARING/PREPARED - de facto the same state, because the backend is not thread-safe and blocks when calling PrepareRange or GetLedger.
CLOSED - the backend is closed.
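
The three states above can be sketched as a small state machine. The `state` and `machine` types and the error strings are hypothetical; the sketch only assumes, as the comment suggests, that CLOSED is terminal and that NOT PREPARED requires a PrepareRange call.

```go
package main

import (
	"errors"
	"fmt"
)

// state enumerates the three backend states named in the comment above.
type state int

const (
	notPrepared state = iota // initial state
	prepared                 // PREPARING and PREPARED are treated as one state
	closed                   // terminal in this sketch
)

// machine is a hypothetical stand-in for the backend.
type machine struct{ s state }

// PrepareRange moves to prepared unless the backend was closed.
func (m *machine) PrepareRange() error {
	if m.s == closed {
		return errors.New("backend is closed")
	}
	m.s = prepared
	return nil
}

// GetLedger succeeds only in the prepared state.
func (m *machine) GetLedger() error {
	switch m.s {
	case closed:
		return errors.New("backend is closed")
	case notPrepared:
		return errors.New("session is not prepared, call PrepareRange first")
	}
	return nil
}

// Close moves to the terminal closed state.
func (m *machine) Close() { m.s = closed }

func main() {
	m := &machine{}
	fmt.Println(m.GetLedger())    // not prepared yet
	fmt.Println(m.PrepareRange()) // <nil>
	fmt.Println(m.GetLedger())    // <nil>
	m.Close()
	fmt.Println(m.PrepareRange()) // backend is closed
}
```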

ingest/ledgerbackend/captive_core_backend.go

bartekn · 2021-11-22T21:05:10Z


bartekn · 2021-11-22T21:17:32Z

ingest/ledgerbackend/captive_core_backend.go

@@ -600,11 +608,11 @@ func (c *CaptiveStellarCore) GetLatestLedgerSequence(ctx context.Context) (uint3
 }

 func (c *CaptiveStellarCore) isClosed() bool {
-	return c.prepared == nil || c.stellarCoreRunner == nil || c.stellarCoreRunner.context().Err() != nil


I think we should check only c.closed here, and because isClosed is only used internally, we can remove it.

But it might not be intentionally closed. c.closed only gets set when we call Close(), but it could also be closed due to a timeout or some other issue with the c.stellarCoreRunner.context(), no?

Exactly, and this is confusing. closed means that the user called Close() and CaptiveStellarCore can no longer be used. However, in case of stellarCoreRunner issues, crashes, etc., it just isn't prepared and can be started again.

Based on this discussion, it sounds like you expect the errored state to be the same as the initial NOT PREPARED state. This is counter to the behavior in a couple of the tests that check for the core to be closed after these two scenarios:

the core gets an error getting a ledger

the last ledger has been consumed.

I've pulled out the second half of this check for a core error, and prompt the user to call PrepareRange when there is one. I've also updated the tests to reflect that we now expect stellar-core to not be closed in these scenarios.

bartekn

LGTM! Just one comment: we don't need coreHasError helper.

ingest/ledgerbackend/captive_core_backend.go

…lled (#4192) After the refactoring in #4088 (and as a result of my wrong comment: #4088#discussion_r764394296), the `CaptiveCoreBackend.isPrepared` method returned `true` if the `stellarCoreRunner` process was shut down without calling `close()`, so in case of a binary update but also in case of a Stellar-Core crash. This commit fixes the bug by checking whether the `stellarCoreRunner` context was cancelled (meaning Stellar-Core is closed or closing, but not as a result of a `close` call). I also removed the `isClosed` method because it simply checked the `closed` variable. All test changes only add extra `context()` call mocks because `isPrepared` now calls it.
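
The fix described in this commit message can be sketched roughly as follows. The `runner` and `backend` types here are hypothetical stand-ins, not the real `stellarCoreRunner` or `CaptiveStellarCore`; the only idea taken from the message is that `isPrepared` also checks whether the runner's context was cancelled.

```go
package main

import (
	"context"
	"fmt"
)

// runner is a hypothetical stand-in for the Stellar-Core process wrapper.
type runner struct {
	ctx    context.Context
	cancel context.CancelFunc
}

func newRunner() *runner {
	ctx, cancel := context.WithCancel(context.Background())
	return &runner{ctx: ctx, cancel: cancel}
}

// backend is a hypothetical stand-in for the captive core backend.
type backend struct {
	closed   bool // set only by an explicit close in this sketch
	prepared bool
	runner   *runner
}

// isPrepared reports false when the runner's context has been cancelled,
// even if no explicit close happened.
func (b *backend) isPrepared() bool {
	if b.closed || !b.prepared || b.runner == nil {
		return false
	}
	// A cancelled context means the process is closed or closing.
	return b.runner.ctx.Err() == nil
}

func main() {
	b := &backend{prepared: true, runner: newRunner()}
	fmt.Println(b.isPrepared()) // prints "true"
	b.runner.cancel()           // simulate a crash or binary update
	fmt.Println(b.isPrepared()) // prints "false"
}
```

Because `closed` stays false after the simulated crash, a caller in this sketch could still prepare the backend again with a fresh runner.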

erika-sdf force-pushed the captive branch 2 times, most recently from 68d6e53 to d3b858f Compare November 18, 2021 18:01

erika-sdf linked an issue Nov 18, 2021 that may be closed by this pull request

CaptiveCoreBackend.isClosed can be confused with isPrepared #3705

Closed

erika-sdf force-pushed the captive branch from d3b858f to 1a03b0b Compare November 18, 2021 18:42

erika-sdf marked this pull request as ready for review November 18, 2021 18:45

erika-sdf requested review from bartekn and paulbellamy November 18, 2021 18:45

erika-sdf commented Nov 18, 2021

erika-sdf requested a review from a team November 18, 2021 19:00

sreuland reviewed Nov 18, 2021

sreuland reviewed Nov 18, 2021

tamirms reviewed Nov 19, 2021

Differentiate between isPrepared and isClosed in captive core.

6fa3587

erika-sdf force-pushed the captive branch from cd7e7d8 to 6fa3587 Compare November 19, 2021 19:42

Merge branch 'master' into captive

1e34917

erika-sdf requested a review from tamirms November 22, 2021 17:17

tamirms approved these changes Nov 22, 2021

bartekn reviewed Nov 22, 2021

erika-sdf and others added 3 commits November 30, 2021 09:21

Address comments

eaa7245

Merge branch 'stellar:master' into captive

6b3c2e8

Merge branch 'captive' of github.com:erika-sdf/go into captive

d713c20

erika-sdf requested a review from bartekn November 30, 2021 17:27

erika-sdf force-pushed the captive branch 2 times, most recently from 538dc27 to 547de99 Compare December 1, 2021 22:11

add tests

a5ba608

erika-sdf force-pushed the captive branch from 547de99 to a5ba608 Compare December 1, 2021 22:13

Merge branch 'master' into captive

508e27b

bartekn approved these changes Dec 7, 2021

remove coreHasError

21a0e59

Merge branch 'master' into captive

a39933e

erika-sdf merged commit 6d63189 into stellar:master Dec 8, 2021

bartekn mentioned this pull request Jan 23, 2022

ingest/ledgerbackend: Restart Stellar-Core when it's context is cancelled #4192

Merged

7 tasks
