-
Notifications
You must be signed in to change notification settings - Fork 117
feat(unixfs/mfs): support MaxLinks and MaxHAMTFanout #906
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This comment was marked as outdated.
This comment was marked as outdated.
WithMaxWidth() -------------- The new option sets effectively a maximum width to the directories. Currently, on a dynamic directory, the switching to-from Basic to HAMT is controlled by HAMTShardingSize, and the width of the HAMT shards by DefaultShardWidth. When WithMaxLinks() is specified, the switching is additionally controlled by the number of links exceeding the limit. In that case, MaxLinks is used as ShardWidth. The directory can only be converted back to BasicDirectory when the total number of links is below the limit. Backwards compatibility is kept and tests have been added. Note that when setting MaxLinks to a high number, it is possible that we still suffer automatic conversion to HAMT before hitting MaxLinks, if the estimated directory size is above 256KiB (as before). WithStat() ---------- Allows to set Stat on a new directory. WithCidBuilder() --------------- Allows to set the CidBuilder on a new directory.
Directory metadata like mode or mtime is serialized and added to the dag-pb protobuf, so it is persisted with it and reloaded every time a unixfs folder is loaded in into an MFS directory or cached as a child inside one. With MaxLinks and MaxHAMTWidth we don't have this luxury. The idea is that, if we have an MFS root, it is wrapping an unixfs dynamic directory, and at the same time it is wrapping some protobuf (either basic or hamt). Now, when we use it to operate, add subfolders, flush it, sync it etc, we must still remember the max-links configuration and the only way we can do that is to set it to what the parent has (while leaving Stat alone). This should work when we are adding things to MFS and everything expects the same options. When dealing with a long-lived MFS system and we import existing DAGs into it, this is likely going to cause undesired effects (I'm unsure about the current behavour but it may trigger HAMT conversions where before things were left alone etc.). In any case, previous behavour is kept when the options are zero'ed, which happens unless they are explicitally set (so we should be safe wrt Kubo MFS DAG). Implementation required public setters and getters in unixfs/io.Directory. When adding files, for example it is a normal operation to create a new MFS root. Currently this requires creating an empty dag-pb node configured with a Cid builder etc. and then using it to initialize a new root. The new NewEmptyRoot() method exposes the unixfs directory options at the MFS level. It simplifies making new roots and it lets us pass the unixfs options that we wish, including setting MaxWidth or MaxHAMTFanout.
3b2c298
to
d8ce74d
Compare
Kubo is now happy |
This is probably missing MFS tests that verify that the MaxLinks and MaxHAMTFanout are working. There's a gray area when creating a directory with There is no nice way of dealing with this I think, given we cannot serialize this option, and normal usage means all folders in an mfs dag should be getting the same configuration. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added end-to-end tests to ./test/cli/add_test.go
in ipfs/kubo#10774 and found no issues, works as expected.
Merging, we will include it in next release, and dogfood in Kubo 0.35.0-rc1
This exposes MaxLinks and MAXHAMTFanout options in unixfs directories and MFS.
The idea is to give Kubo the chance to configure these two options when adding.
For this to work it has:
Made the necessary changes to unixfs/io directories. Since basic and HAMT directories can be switched to each other as part of a dynamic directory, we have added the options to each. We have added methods to create new directories as well, with functional options. This is meant to simplify previous usage, where a merkledag-pb folder is created, then wrapped in the unixfs directory implementation, the modified with the right CidBuilder etc... this now can happen in one go with the
New...()
methods.Made the necessary changes to MFS. Unlike mtime and mode, MaxLinks is not serialized in the dag-pb and thus cannot be remembered when writing and reloading a unixfs node (sync/cache refresh etc)... this means we need to set the MaxLinks and MaxHAMTFanout options from the parent in such cases. In such cases, the directories are created from an existing node that is wrapped, so we had to add Setters for the options. To avoid causing side-effects when reading, the Setters do not trigger HAMT conversions etc. even if the loaded folder does not comply with MaxLinks. The new options have been made part of
MkdirOpts
where they fit well.MFS additionally has received a
NewEmptyDirectory
andNewEmptyRoot
methods, which allow setting options for empty directories/roots.All this has been primarily wired into Kubo at ipfs/kubo#10774, and it seems to work.