Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Dynamic Protocol State] Protocol state storage #4559

Conversation

durkmurder
Copy link
Member

Context

This PR implements a storage system for protocol state along with auxiliary data structures, storage access layer and caching strategy. Some of important concepts:

  • flow.ProtocolStateEntry is stored in DB and is designed in a way to minimize storage overhead.
  • flow.RichProtocolStateEntry decorates basic flow.ProtocolStateEntry with additional data that has to be queried from storage layer (such as EpochSetup events).
  • storage.ProtocolState operates with flow.ProtocolStateEntry for storing and flow.RichProtocolStateEntry for querying
  • badger.ProtocolState uses a cache with no-op storing capability, in this case cache is populated only on queries. Cache itself operates with flow.RichProtocolStateEntry to avoid querying extra data.

durkmurder added 25 commits July 3, 2023 20:02
…nding on context. Updated usages. Fixed tests
Co-authored-by: Jordan Schalm <jordan@dapperlabs.com>
Base automatically changed from yurii/6232-static-identity-model to feature/dynamic-protocol-state August 23, 2023 17:58
engine/access/rpc/engine.go Outdated Show resolved Hide resolved
model/flow/protocol_state.go Show resolved Hide resolved
Comment on lines +65 to +69
// Common situation during the epoch setup phase for epoch N+1
// * we are currently in Epoch N
// * previous epoch N-1 is known (specifically EpochSetup and EpochCommit events)
// * network is currently in the setup phase for the next epoch, i.e. EpochSetup event (starting setup phase) has already been observed
t.Run("setup-phase", func(t *testing.T) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the staking phase we test both scenarios:

  1. previous epoch unknown: first epoch after spork
  2. previous epoch unknown: common case for later epochs in spork

However, for the setup phase we don't test the scenario where no prior epoch is known. I think we should extend the test here.

same for testing the epoch commit phase

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added respective TODO to issue #4649

Comment on lines 92 to 97
// ByBlockID returns the protocol state by block ID.
func (s *ProtocolState) ByBlockID(blockID flow.Identifier) (*flow.RichProtocolStateEntry, error) {
tx := s.db.NewTransaction(false)
defer tx.Discard()
return s.byBlockID(blockID)(tx)
}
Copy link
Member

@AlexHentschel AlexHentschel Aug 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

performance concern

I am would suggest to also include a cache for the secondary index here as well. The Headers implementation provides a nice example, where we have a secondary index by height that is also backed by a cache

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added as TODO to issue #4649

)

// ProtocolState implements persistent storage for storing entities of protocol state.
// Protocol state uses an embedded cache without storing capabilities(store happens on first retrieval) to avoid unnecessary
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am wondering how much of a challenge it would be to actually cache the RichProtocolStateEntry on store. At the moment, we insert a ProtocolStateEntry (non-rich) -- presumably this is the reason for not storing the RichProtocolStateEntry. However, to me this seems to be an artifact that could indicate room for improvement in the software design:

  • I think for about 95% of the blocks that we store, we need the RichProtocolStateEntry state later. This is because we finalize 95% of all proposed blocks, i.e. the blocks have children; I would guess we that we query the RichProtocolStateEntry when processing a block's child.
  • To me it would be most natural if the process evolving the identity table produced the RichProtocolStateEntry. The RichProtocolStateEntry would ideally provide methods to generate/return its condensed version.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added respective TODO to issue #4649

Copy link
Member

@AlexHentschel AlexHentschel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tried to address as many comments as I could and added most of the remaining TODOs to #4649

model/flow/protocol_state.go Outdated Show resolved Hide resolved
storage/protocol_state.go Outdated Show resolved Hide resolved
Comment on lines +65 to +69
// Common situation during the epoch setup phase for epoch N+1
// * we are currently in Epoch N
// * previous epoch N-1 is known (specifically EpochSetup and EpochCommit events)
// * network is currently in the setup phase for the next epoch, i.e. EpochSetup event (starting setup phase) has already been observed
t.Run("setup-phase", func(t *testing.T) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added respective TODO to issue #4649

)

// ProtocolState implements persistent storage for storing entities of protocol state.
// Protocol state uses an embedded cache without storing capabilities(store happens on first retrieval) to avoid unnecessary
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added respective TODO to issue #4649

jordanschalm and others added 2 commits August 25, 2023 14:19
Co-authored-by: Alexander Hentschel <alex.hentschel@axiomzen.co>
@jordanschalm
Copy link
Member

Failing test:

=== RUN   TestNewRichProtocolStateEntry
=== RUN   TestNewRichProtocolStateEntry/staking-root-protocol-state
=== RUN   TestNewRichProtocolStateEntry/staking-phase
=== RUN   TestNewRichProtocolStateEntry/setup-phase
    protocol_state_test.go:84: 
        	Error Trace:	/home/runner/work/flow-go/flow-go/model/flow/protocol_state_test.go:84
        	Error:      	Received unexpected error:
        	            	inconsistent EpochCommit for constucting RichProtocolStateEntry, next protocol state states ID 0000000000000000000000000000000000000000000000000000000000000000 while input event has ID f18f47848fb293468f641c33863dca9e5278fa8e9690f77f7dc96e954ef9221b
        	Test:       	TestNewRichProtocolStateEntry/setup-phase
--- FAIL: TestNewRichProtocolStateEntry (0.03s)
    --- PASS: TestNewRichProtocolStateEntry/staking-root-protocol-state (0.01s)
    --- PASS: TestNewRichProtocolStateEntry/staking-phase (0.01s)
    --- FAIL: TestNewRichProtocolStateEntry/setup-phase (0.02s)
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
	panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x28 pc=0xef6a5d]

goroutine 167 [running]:
testing.tRunner.func1.2({0x109aa20, 0x1be4fb0})
	/opt/hostedtoolcache/go/1.20.7/x64/src/testing/testing.go:1526 +0x24e
testing.tRunner.func1()
	/opt/hostedtoolcache/go/1.20.7/x64/src/testing/testing.go:1529 +0x39f
panic({0x109aa20, 0x1be4fb0})
	/opt/hostedtoolcache/go/1.20.7/x64/src/runtime/panic.go:884 +0x213
github.com/onflow/flow-go/model/flow_test.TestNewRichProtocolStateEntry.func3(0xc000201380?)
	/home/runner/work/flow-go/flow-go/model/flow/protocol_state_test.go:86 +0xfd
testing.tRunner(0xc0004389c0, 0x131c0f0)
	/opt/hostedtoolcache/go/1.20.7/x64/src/testing/testing.go:1576 +0x10b
created by testing.(*T).Run
	/opt/hostedtoolcache/go/1.20.7/x64/src/testing/testing.go:1629 +0x3ea
FAIL	github.com/onflow/flow-go/model/flow	0.380s

@durkmurder durkmurder merged commit 620d901 into feature/dynamic-protocol-state Sep 7, 2023
@durkmurder durkmurder deleted the yurii/5529-dynamic-protocol-state-storage-layer branch September 7, 2023 21:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants