Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change Pointer CBOR representation to a kinded union (breaking change) #60

Merged
merged 1 commit into from
Dec 3, 2020

Conversation

rvagg
Copy link
Member

@rvagg rvagg commented Aug 11, 2020

From:

type Pointer union {
	&Node "0"
	Bucket "1"
} representation keyed

i.e. {"0": CID} or {"1": [KV...]}

To:

type Pointer union {
	&Node link
	Bucket list
} representation kinded

i.e. CID or [KV...]

Also removes redundant refmt tags.

Diff is hard to see because this bundles unmerged PRs #56 and #59. Only the last commit contains the main change here, currently that's 7597825.

Because TestMalformedHamt has some manually generated CBOR you can see the change in the block layout there and the simplification this introduces.

Closes: #53

From:

	type Pointer union {
		&Node "0"
		Bucket "1"
	} representation keyed

i.e. {"0": CID} or {"1": [KV...]}

To:

	type Pointer union {
		&Node link
		Bucket list
	} representation kinded

i.e. CID or [KV...]

Also removes redundant refmt tags

Closes: https://github.com/ipfs/go-hamt-ipld/issues/53
@Stebalien Stebalien force-pushed the rvagg/pointer-kinded branch from 7597825 to a3efb56 Compare August 17, 2020 19:18
Copy link
Member

@Stebalien Stebalien left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM but I'd like a 👍 from @whyrusleeping.

Copy link
Member

@whyrusleeping whyrusleeping left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its a bit too late to make breaking changes to the formats, especially for a 'not security critical' reason. I'm supportive of making this change, but it will be quite disruptive to a lot of other groups.

@ZenGround0
Copy link
Contributor

@whyrusleeping since we are likely going to be tuning HAMT widths in actors v3 upgrade and migrating most state tree HAMTs anyway it seems like this is a good time to get this change in. Is the disruption to other groups you predicted above going to be tolerable if we ship this in early 2021?

@rvagg
Copy link
Member Author

rvagg commented Nov 25, 2020

Some notes about size savings from this that I neglected to put in the original notes: Each "pointer" to <thing> (bucket or child node link) is currently surrounded by a {"X":<thing>}, where X is a single character string of "0" or "1". This change removes it to just make it <thing>. The saving is 3 bytes per pointer, either 0xa16130 for "0" or 0xa16131 for "1" - CBOR map signifier + string signifier + character. For a bitWidth of 5, that's 32 x 3 = 96 bytes for a maximally full HAMT node. For HAMTs with non-trivial amounts of data, most of the non-leaf nodes will be maximally full and leaf nodes will average out in the range of ~4 to 96.

Then there's also the minor decode savings of not having to deal with these 2 extra CBOR constructs.

As an interesting aside, we came up with a way to describe both of these forms (before and after this PR) in IPLD schemas which translate directly to go-ipld-prime's ability to handle a HAMT that can read both of them in a single codegen'd form: https://github.com/ipld/specs/pull/338/files

Copy link
Contributor

@ZenGround0 ZenGround0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am going to speculatively override @whyrusleeping's block on this PR and merge to HAMT v3. If this change doesn't make it past the fip process I will then revert.

@ZenGround0 ZenGround0 merged commit 5cdbb51 into filecoin-project:master Dec 3, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Proposal: change block layout for Pointers
4 participants