-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: ipld: restore e2e online deal and retrievals w/ identity CIDs #9259
Conversation
Sorry, I forgot to mention the second commit in here is related but not directly—undoing the identity CID option for unixfs creation on |
b := cidutil.InlineBuilder{ | ||
Builder: prefix, | ||
Limit: 126, | ||
} | ||
return b, nil | ||
return prefix, nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if we should include this change in this PR; Yes, there were bugs in ID cid handling, but that doesn't negate the fact that it's an useful feature.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm proposing that we don't do this in Lotus and don't really encourage it for Filecoin, especially for the online deal flow. For two reasons:
- In Filecoin, there's a relationship between a block and the piece it's stored in. This is complicated for identity CIDs and is a problem when we have retrieval modes where you're able to ask for an arbitrary (not necessarily deal root) "payload CID" and have to figure out what piece it's in so in order to make decisions about availability and price. Illustrated by the difficulty in dealing with identity CIDs as roots in feat: handle retrieval queries for unindexed identity payload CIDs go-fil-markets#747 (this is not really a "fix" for that problem, just a patch to make it workable, it'd not be a problem if we didn't have identity CIDs that can fit CIDs) but the same applies with non-root identity CIDs unless we make sure to store them as blocks and index them as blocks, thereby reducing their utility and increasing the complexity on all of the layers we manage. This gets even more complicated when we index the content in a piece and send that to an indexer. Should the indexer ignore them? Does the CID:Piece relationship matter at that level?
- The break-even cost of storing inline CIDs is pretty low with the way we're having to store them. For an online deal, we end up at only 32-bytes break-even and that's without counting the extra CARv2 index that's generated (I think the break-even for SPs with the CARv2 index is just ~20-bytes, far from the 126 max default here).
(I wrote about this in more detail over here)
So, maybe we just avoid it entirely? At least in lotus—we can't stop them entirely but we can set best practice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. Yeah avoiding those is probably reasonable.
I wonder if we could allow them, but only for leaf nodes, where we wouldn't need to worry about traversal edge-cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe, but in the current form CidBuilder
only gets the bytes so it'd need to decode them to look for links to determine that: https://github.com/ipfs/go-cidutil/blob/ac549eaa2ca06b13a1e5b37ee2696cd72b6481a7/inline.go#L21-L26, that's probably not very efficient.
Alternatively we could introduce an optional LeafCidBuilder
into DagBuilderHelper
and special-case use it when it's set for leaves: https://github.com/ipfs/go-unixfs/blob/2c23c3ea6fae3ef1b487cfc0c606a4ffc7893676/importer/helpers/dagbuilder.go#L154
Used to work prior to the CARv2 blockstore interface was introduced to back Graphsync retrievals and client retrieval CAR export. When that was done, the StoreIdentityCIDs option was omitted in some key places, and the CARv2 blockstore interface always returned true from Has() for identity CIDs which prevents Graphsync from storing them. Since launch, car.SelectiveCar has been the canonical way to produce CARs for online deals to calculate CommP. This is still being done on the client side in Lotus for online deals, but it's no longer done on the server side, and neither is it done for retrieval export on the client side. Ref: filecoin-project/go-fil-markets#749 Ref: ipld/go-car#332
This leads to all sorts of problems with the full e2e flow, so let's just not do it and not encourage it
@magik6k @jennijuju I know we're dealing with some deprecated components here, but boost is still relying on some parts of lotus this touches, particularly the itest kit, and I'm running into this problem over there! Which is what reminded me I still had this open. This is updated now to latest |
Related Issues
Depends on:
Related:
Proposed Changes
This used to work prior to the CARv2 blockstore interface was introduced to back
Graphsync retrievals and client retrieval CAR export. When that was done, the
StoreIdentityCIDs option was omitted in some key places, and the CARv2
blockstore interface always returned true from Has() for identity CIDs which
prevents Graphsync from storing them.
Since launch, car.SelectiveCar has been the canonical way to produce CARs for
online deals to calculate CommP. This is still being done on the client side
in Lotus for online deals, but it's no longer done on the server side, and
neither is it done for retrieval export on the client side.
Additional Info
As per discussion in ipld/go-car#332, we do have the
option of changing online deals to not include identity CIDs, therefore
ditching use of car.SelectiveCar (replacing with a CARv2 alternative); but that
would be a breaking change—like the blockstore changes were a breaking change,
but that was accidental. There's also other places where car.SelectiveCar
is simulated, e.g. node/impl/client/client.go#outputCAR.
In addition, I think we should probably be documenting how identity CIDs are handled throughout Lotus - for the online deal flow as well as for retrievals (filecoin-project/go-fil-markets#747 is another example of problems with identity CIDs that is only indirectly related to this PR).
Checklist
Before you mark the PR ready for review, please make sure that:
<PR type>: <area>: <change being made>