Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

adds incremental verification to CAR files. #3

Merged
merged 9 commits into from
Sep 6, 2023

Conversation

guanzo
Copy link
Collaborator

@guanzo guanzo commented Aug 25, 2023

Rough first pass at incremental verification.

The intended usage is to transform a CAR stream into a "flat file" stream, while incrementally verifying each CAR block against the expected traversal order. This is immediately useful for the service worker.

ipfs-unixfs-exporter is used as the traversal client which will dictate the correct order of blocks. Current assumptions are depth-first-search and that the CAR file will contain duplicate blocks. Each time the client asks for a block, we read the next block from the CAR stream and check if it matches the requested block.

I'm sure there's lots of cases that are not being handled atm, feel free to point them out.

Recommended spots to review are extractVerifiedContent and car-block-getter.js

export async function * extractVerifiedContent (cidPath, carStream) {
const getter = await CarBlockGetter.fromStream(carStream)

for await (const child of recursive(cidPath, getter)) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's another annoying thing that you're going to need to test for - you can't assume all your data will be unixfs, and I'm not sure if recursive() here is going to be happy when you feed it non-unixfs data.

When building out your test suite for this, you need to include some other DAG forms that are not unixfs. Some set of dag-cbor blocks that you can path through and verify that it works transparently of whether you're dealing with unixfs or not. The annoying subtlety is that we've first-classed unixfs in this protocol, but allowed for non-unixfs DAGs. So your pathing mechanisms need to change depending on what you're faced with. recursive here should be fine with the unixfs special-cased pathing rules, but when you have non-unixfs data, you need to follow standard ipld pathing through the nodes of each block. Some details on the differences are covered here.

Thankfully, in most cases, non-unixfs pathing is just a naive ordered walk through the JavaScript forms to find the links to follow, like the various operations in https://github.com/multiformats/js-multiformats/blob/master/src/block.js that take a path (specifically get might do the trick).

Copy link
Collaborator Author

@guanzo guanzo Aug 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this the tyranny of unixfs that I've been hearing about 😆

Is there an existing lib I should be using, besides the js-multiformats you linked? Maybe helia? All I need to do here is walk the dag and export the content blocks, right? I feel like that generic logic should exist somewhere in the js ipfs ecosystem.

How do I check if a CID is unixfs or not?

Copy link
Collaborator Author

@guanzo guanzo Sep 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

recursive seems to work fine with non-unixfs dag-cbor CARs. I tested it with this CAR: https://github.com/ipfs/gateway-conformance/blob/main/fixtures/path_gateway_dag/dag-cbor-traversal.car.

That's a non-unixfs DAG.. right?

In fact I don't think the dag-cbor logic even implements unixfs, it looks like a basic traversal implementation. There's no logic that handles directories or files. https://github.com/ipfs/js-ipfs-unixfs/blob/master/packages/ipfs-unixfs-exporter/src/resolvers/dag-cbor.ts

@reidlw reidlw assigned reidlw and guanzo and unassigned reidlw Sep 1, 2023
@guanzo guanzo force-pushed the incremental-verification branch from 19cc98b to 7cb6999 Compare September 5, 2023 23:28
@guanzo guanzo merged commit 54588f2 into main Sep 6, 2023
@guanzo
Copy link
Collaborator Author

guanzo commented Sep 6, 2023

I want to merge this to begin using it ASAP. Tracking missing features here #4

@guanzo guanzo deleted the incremental-verification branch September 6, 2023 19:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants