feat: serve forest CAR file as Blockstore #3365

Merged: 62 commits into main from hm/use-snapshot-file-as-db, Aug 31, 2023

Conversation

@hanabi1224 (Contributor) commented Aug 14, 2023

Summary of changes

As part of #3334

Changes introduced in this pull request:

  • Serve multiple forest CAR files as a Blockstore. Design discussion on Slack.

This has been running on mainnet on a DigitalOcean droplet for about a week, and everything, including GC, seems to work fine.
(With the mmap change in #3414, GC time drops by ~50%.)

Reference issue to close (if applicable)

Closes

Other information and links

Change checklist

  • I have performed a self-review of my own code,
  • I have made corresponding changes to the documentation,
  • I have added tests that prove my fix is effective or that my feature works (if possible),
  • I have made sure the CHANGELOG is up-to-date. All user-facing changes should be reflected in this document.

@hanabi1224 marked this pull request as ready for review August 16, 2023 07:36
@hanabi1224 requested a review from a team as a code owner August 16, 2023 07:36
@hanabi1224 requested review from creativcoder and LesnyRumcajs and removed request for a team August 16, 2023 07:36
@hanabi1224 force-pushed the hm/use-snapshot-file-as-db branch from 990732f to f434632, August 16, 2023 08:59
src/db/car/forest.rs (outdated, resolved)
@@ -709,14 +708,168 @@ fn create_password(prompt: &str) -> std::io::Result<String> {
.interact_on(&term)
}

pub async fn open_forest_car_union_db(
Member

I'd try to not bloat this file any further, it's already quite big. Maybe let's have a dedicated one?

Contributor Author

Moved to db_util.rs

@@ -709,14 +708,168 @@ fn create_password(prompt: &str) -> std::io::Result<String> {
.interact_on(&term)
}

pub async fn open_forest_car_union_db(
Member

Is this method unit-tested?

Contributor Author

test_prepare_and_open_forest_car_union_db added

Contributor Author

And then removed, as this function no longer exists

src/daemon/mod.rs (outdated, resolved)
src/daemon/mod.rs (outdated, resolved)
@@ -695,14 +694,170 @@ fn create_password(prompt: &str) -> std::io::Result<String> {
.interact_on(&term)
}

pub async fn open_forest_car_union_db(
Contributor

I don't understand what this function is supposed to do. It appears that it does several unrelated things (like opening the current database AND handling snapshot importing). Could you write some documentation for it?

Contributor Author

In the current implementation, snapshot import happens before DB initialization. This avoids introducing a Mutex<ManyCar> wrapper type just so that APIs like fn read_only would not need &mut self. As a result, everything that requires a mutable ManyCar (snapshot import, automatic snapshot download) is moved into this function.
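
For readers following the thread, here is a minimal sketch of the trade-off described above: do every mutating step while the store is still exclusively owned, then share it immutably so no Mutex is required. LayeredStore, add_layer, and open_store are hypothetical stand-ins, not Forest's ManyCar API.

```rust
use std::collections::HashMap;
use std::sync::Arc;
use std::thread;

/// Hypothetical stand-in for a store made of several read-only layers.
#[derive(Default)]
struct LayeredStore {
    layers: Vec<HashMap<String, Vec<u8>>>,
}

impl LayeredStore {
    /// Adding a layer needs exclusive access (`&mut self`).
    fn add_layer(&mut self, layer: HashMap<String, Vec<u8>>) {
        self.layers.push(layer);
    }

    /// Lookups only need shared access (`&self`).
    fn get(&self, key: &str) -> Option<&Vec<u8>> {
        self.layers.iter().find_map(|layer| layer.get(key))
    }
}

/// Do all mutation up front, then hand out an immutable, shareable handle.
fn open_store(snapshots: Vec<HashMap<String, Vec<u8>>>) -> Arc<LayeredStore> {
    let mut store = LayeredStore::default();
    for snapshot in snapshots {
        store.add_layer(snapshot); // the "import" step, done before sharing
    }
    Arc::new(store) // no Mutex: nothing mutates the store from here on
}

fn main() {
    let snapshot = HashMap::from([("cid-a".to_string(), vec![1, 2, 3])]);
    let store = open_store(vec![snapshot]);

    // Readers on other threads only ever see `&LayeredStore`.
    let reader = {
        let store = Arc::clone(&store);
        thread::spawn(move || store.get("cid-a").cloned())
    };
    assert_eq!(reader.join().unwrap(), Some(vec![1, 2, 3]));
}
```

The cost of this design, as the review comment notes, is that the opening function ends up owning every step that needs mutation.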

Contributor Author

Doc added

@lemmih (Contributor) left a comment

This PR contains a lot of changes that I need help to understand. Let's hold off merging this until the code is well-documented and intuitive.

src/daemon/mod.rs (outdated, resolved)
src/daemon/mod.rs (outdated, resolved)
src/daemon/mod.rs (outdated, resolved)
src/daemon/mod.rs (outdated, resolved)
src/daemon/mod.rs (outdated, resolved)
src/daemon/db_util.rs (resolved)
src/daemon/db_util.rs (outdated, resolved)
src/daemon/db_util.rs (outdated, resolved)
src/daemon/db_util.rs (outdated, resolved)
src/daemon/mod.rs (resolved)
src/utils/io/mmap.rs (outdated, resolved)
src/daemon/mod.rs (outdated, resolved)
@hanabi1224 force-pushed the hm/use-snapshot-file-as-db branch from 8657b05 to 8e5b137, August 30, 2023 09:23
.context("Failed miserably while importing chain from snapshot")?;
info!("Imported snapshot in: {}s", stopwatch.elapsed().as_secs());
.await?;
consume_snapshot_file = true;
Contributor

No need to set this flag as true. All downloaded files are always consumed.

Contributor Author

Removed

@lemmih (Contributor) left a comment

db_util.rs can be refactored away, but let's leave that for another PR.

@LesnyRumcajs (Member) left a comment

Do we have good coverage for this feature? E.g., scenarios with multiple CAR files?

RetryArgs {
timeout: None,
max_retries: Some(3),
delay: Some(Duration::from_secs(60)),
Member

Do we want to have it somehow parametrized?

@hanabi1224 (Contributor Author), Aug 31, 2023

I feel it's better to not ask the function caller to decide these parameters, for simplicity. How about changing it to

RetryArgs {
    timeout: None,
    ..Default::default()
}
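
A small, self-contained sketch of what the struct update syntax in this suggestion does, for anyone skimming the thread. The RetryArgs definition and its Default values below are placeholders invented for illustration; the real struct and defaults in the codebase may differ.

```rust
use std::time::Duration;

/// Hypothetical mirror of the struct under review.
#[derive(Debug, Clone, PartialEq)]
struct RetryArgs {
    timeout: Option<Duration>,
    max_retries: Option<usize>,
    delay: Option<Duration>,
}

impl Default for RetryArgs {
    fn default() -> Self {
        // Placeholder defaults, assumed for this sketch only.
        Self {
            timeout: Some(Duration::from_secs(30)),
            max_retries: Some(3),
            delay: Some(Duration::from_secs(60)),
        }
    }
}

/// Override only `timeout`; every other field comes from `Default`.
fn download_retry_args() -> RetryArgs {
    RetryArgs {
        timeout: None,
        ..Default::default()
    }
}

fn main() {
    let args = download_retry_args();
    assert_eq!(args.timeout, None);
    assert_eq!(args.max_retries, RetryArgs::default().max_retries);
    println!("{args:?}");
}
```

One consequence worth keeping in mind: after this change the effective max_retries and delay are whatever Default defines, so they only match the previously hard-coded values if the defaults happen to be the same.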

Member

Sounds good

}

pub async fn download_file_with_retry(
url: Url,
Member

Does it need to be owned?

Contributor Author

Not here, fixed
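
As an illustration of the point raised here, a function that only reads its argument can borrow it, which lets the caller reuse the value across retries without cloning. This is a sketch with a hypothetical Url stand-in and invented helper names, not the signature that landed in the PR.

```rust
/// Hypothetical stand-in for a parsed URL type.
#[derive(Debug, Clone, PartialEq)]
struct Url(String);

/// Taking `&Url` is enough when the function only reads the value.
fn describe_download(url: &Url, attempt: u32) -> String {
    format!("attempt {attempt}: GET {}", url.0)
}

/// The caller keeps ownership and reuses the same URL for every attempt.
fn plan_attempts(url: &Url, max_retries: u32) -> Vec<String> {
    (1..=max_retries).map(|n| describe_download(url, n)).collect()
}

fn main() {
    let url = Url("https://example.com/snapshot.car".to_string());
    let plan = plan_attempts(&url, 3);
    assert_eq!(plan.len(), 3);
    // `url` was only borrowed, so it is still usable here.
    println!("{url:?}: {plan:?}");
}
```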

// Import chain if needed
if !opts.skip_load.unwrap_or_default() {
if let Some(path) = &config.client.snapshot_path {
// TODO: respect `--consume-snapshot` CLI option once it's implemented
Member

Do we have a tracking issue for this?

@hanabi1224 (Contributor Author), Aug 31, 2023

Yes, it's part of #3334:

Add flag --consume-snapshot. It should do the same as --import-snapshot, but it'll move or delete the snapshot.

Member

Let's have the link in the comment.

Contributor Author

Done

@hanabi1224 (Contributor Author)

Do we have good coverage for this feature? E.g., scenarios with multiple CAR files?

@LesnyRumcajs I think the CI tests should already cover most of the important scenarios. Multiple CARs should be covered by the ManyCar tests, but I can extend the CI check to load multiple CARs from both vendors we support if that's preferable. What do you think?

@LesnyRumcajs (Member)

Do we have good coverage for this feature? E.g., scenarios with multiple CAR files?

@LesnyRumcajs I think the CI tests should already cover most of the important scenarios. Multiple CARs should be covered by the ManyCar tests, but I can extend the CI check to load multiple CARs from both vendors we support if that's preferable. What do you think?

It wouldn't hurt to have such an integration test.

Another thing that may be interesting to test (if I understand the properties of this feature correctly) is to add an artificially crafted CAR with a single entry (that would normally not exist in the current blockchain) and see if we can retrieve it via e.g., forest-cli chain read-obj.
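
The property this test would exercise can be sketched in miniature: a lookup over a union of read-only stores should find an entry that exists only in an extra, artificially crafted store. The snippet below uses in-memory maps and a hypothetical union_get helper, assuming first-hit-wins lookup; it is not the ManyCar implementation or the forest-cli test itself.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for one read-only CAR-backed store.
type Store = HashMap<String, Vec<u8>>;

/// Look a key up across several stores, returning the first hit.
fn union_get<'a>(stores: &'a [Store], key: &str) -> Option<&'a [u8]> {
    stores.iter().find_map(|s| s.get(key).map(Vec::as_slice))
}

fn main() {
    let chain: Store = HashMap::from([("block-1".to_string(), b"chain data".to_vec())]);
    // A crafted store holding a single entry that the chain store lacks.
    let crafted: Store = HashMap::from([("not-on-chain".to_string(), b"crafted".to_vec())]);

    // With only the chain store, the crafted key is absent.
    let only_chain = vec![chain.clone()];
    assert_eq!(union_get(&only_chain, "not-on-chain"), None);

    // With both stores, the union serves entries from either one.
    let stores = vec![chain, crafted];
    assert_eq!(union_get(&stores, "not-on-chain"), Some(&b"crafted"[..]));
    assert_eq!(union_get(&stores, "block-1"), Some(&b"chain data"[..]));
}
```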

@hanabi1224 (Contributor Author)

Another thing that may be interesting to test (if I understand the properties of this feature correctly) is to add an artificially crafted CAR with a single entry (that would normally not exist in the current blockchain) and see if we can retrieve it via e.g., forest-cli chain read-obj.

@LesnyRumcajs How about loading test-snapshots/chain4.car? Its root CID does not exist in a calibnet DB, so that would test both multi-CAR loading and read-obj.

@hanabi1224 force-pushed the hm/use-snapshot-file-as-db branch from d1c5c0a to 8d6e339, August 31, 2023 11:10
@hanabi1224 (Contributor Author)

@LesnyRumcajs forest-cli chain read-obj test added

@hanabi1224 added this pull request to the merge queue Aug 31, 2023
Merged via the queue into main with commit e9ce087 Aug 31, 2023
@hanabi1224 deleted the hm/use-snapshot-file-as-db branch August 31, 2023 12:41