Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ETCM-1048] Remove bestKnownBlockAndLatestCheckpoint cache #1092

Conversation

leo-bogastry
Copy link
Contributor

@leo-bogastry leo-bogastry commented Aug 9, 2021

Description

Continuing the effort of removing the desynchronization between caches and RocksDB the AtomicReference[BestBlockLatestCheckpointNumbers] used as cache was removed and replaced by persistence in RocksDB.

Testing

This branch has been tested against mainnet, deployed in Staging and tested against the ERC20 tests. Report here: https://buildkite.com/input-output-hk/mantis-automation/builds/7468#_

@leo-bogastry leo-bogastry force-pushed the feature/ETCM-1048-remove-bestKnownBlockAndLatestCheckpoint-cache branch from 782f49b to e2bc270 Compare August 9, 2021 07:13
@@ -197,61 +197,6 @@ class ConsensusSpec extends AnyFlatSpec with Matchers with ScalaFutures {
blockQueue.isQueued(oldBlock3.header.hash) shouldBe true
}

it should "get best stored block after reorganisation of the longer chain to a shorter one if desync state happened between cache and db" in new EphemBlockchain {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test was removed because testing synchonization between cache and DB does not make sense anymore

Copy link
Contributor

@jvdp jvdp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice simplification 👍

@@ -68,7 +68,7 @@ class SyncController(
def start(): Unit = {
import syncConfig.doFastSync

appStateStorage.putSyncStartingBlock(appStateStorage.getBestBlockNumber())
appStateStorage.putSyncStartingBlock(appStateStorage.getBestBlockNumber()).commit
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's this change for? (Also elsewhere.)

Copy link
Contributor Author

@leo-bogastry leo-bogastry Aug 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

without the commit the data is not saved in the DB

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah so this was actually a bugfix?

Also, doesn't it require the ()?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added the (), but it works both ways.
I don't know if an actual bug resulted from that missing commit(), but I know that these put method return a DataSourceBatchUpdate. And this allows to chain several put calls and then commit then atomically. In this case the commit() was clearly forgotten

(startBlock to ((startBlock - blocksToDiscard).max(1)) by -1).foreach { n =>
blockchainReader.getBlockHeaderByNumber(n).foreach { headerToRemove =>
blockchain.removeBlock(headerToRemove.hash, withState = false)
blockchain.removeBlock(headerToRemove.hash)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to remark: perhaps this parameter was also added to prevent issues with write amplification, but we won't need to actually remove blocks anymore (or at least from sync logic) when we can save tentative chains / blocks. (Just need to make sure the cleanup is done in a decently efficient way.)

@leo-bogastry leo-bogastry force-pushed the feature/ETCM-1048-remove-bestKnownBlockAndLatestCheckpoint-cache branch from e2bc270 to 1a21548 Compare August 9, 2021 15:17
@leo-bogastry leo-bogastry force-pushed the feature/ETCM-1048-remove-bestKnownBlockAndLatestCheckpoint-cache branch from 1a21548 to ad62f4c Compare August 9, 2021 15:18
@@ -33,6 +33,8 @@ trait ObjectGenerators {

def intGen(min: Int, max: Int): Gen[Int] = Gen.choose(min, max)

def posIntGen(min: Int, max: Int): Gen[Int] = Gen.choose(min, max).suchThat(_ > 0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need this ? would intGen(1, max: Int) not work ? Or did it use 0 during shrinking ? 😞

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I forgot about this issue. I don't know why but this test started failing consistently because it was generating negative numbers

// TODO (maybe ETCM-77): Manage last checkpoint number too
appStateStorage.putBestBlockNumber((startBlock - blocksToDiscard - 1).max(0)).commit()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is saving to the storage removed here?🤔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After I did all the changes I had a bunch of tests failing in FastSyncItSpec, because the best blocknumber was always a bit inferior than expected, and the issue was that line.
Now that we don't have the caches anymore and always save to storage that instructions was bringing issues.

) extends Blockchain
with Logger {

override def getChainWeightByHash(blockhash: ByteString): Option[ChainWeight] = chainWeightStorage.get(blockhash)

override def getLatestCheckpointBlockNumber(): BigInt =
blockchainMetadata.bestKnownBlockAndLatestCheckpoint.get().latestCheckpointNumber
override def getLatestCheckpointBlockNumber(): BigInt = appStateStorage.getLatestCheckpointBlockNumber()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if getBestBlockNumber() moved to the blockchainReader, then I think getLatestCheckpointBlockNumber() should also go there

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right. I created this ticket https://jira.iohk.io/browse/ETCM-1092

@leo-bogastry leo-bogastry merged commit 5dc67e6 into develop Aug 10, 2021
@leo-bogastry leo-bogastry deleted the feature/ETCM-1048-remove-bestKnownBlockAndLatestCheckpoint-cache branch August 10, 2021 07:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants