Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove the device and inode numbers from the API. #81

Merged
merged 4 commits into from
Jul 20, 2023

Conversation

sunfishcode
Copy link
Member

As discussed here, remove the fields which correspond to st_dev, st_ino, and d_ino in POSIX from the stat and directory entry structs.

  • Device numbers assume the existence of a global device number space, which creates implicit relationships between otherwise unrelated components.

  • Not all filesystem implementations have these numbers. And some that do have these numbers require extra implementation cost to retrieve them.

  • These numbers leak potentially sensitive or identifying information from the underlying filesystem implementation.

In their place, provide two functions, is-same-file and is-same-mountpoint, for explicitly testing whether two handles are the same file or are on the same mountpoint, respectively. This doesn't cover all possible use cases for device and inode numbers, but we can add more functions as need arises.

@sunfishcode sunfishcode mentioned this pull request Jan 13, 2023
@codefromthecrypt
Copy link

q: If this merges, would the behavior change on wasi_snapshot_preview1 in wasi-sdk 17 that triggered the discussion go back to before?

@sunfishcode
Copy link
Member Author

What we might do is change wasi-libc to remove st_dev and st_ino from its struct stat, even though it's still using preview1 which provides that info. And maybe add is_same_file and is_same_mountpoint functions that work using preview1 st_dev/st_ino values, to help start the transition. I expect we can't make this completely painless, but we'll do what we can.

@yamt
Copy link

yamt commented Jan 14, 2023

is-same-file makes sense as wasi has a concept of hard links.

but i'm not sure about is-same-mountpoint.
does wasi have a concept of mount point?

for porting existing apps, i guess a get-file-id style api, which returns a "wasi file id", which a host can compose as (dev_t, ino_t), might be more convenient. unlike ino_t, an implementation can make it fail saying "we don't have id for this file" and in that case the app can assume this file is not same as any other files. how do you think?

@sunfishcode
Copy link
Member Author

sunfishcode commented Jan 14, 2023

is-same-mountpoint is about whether it's possible to do a rename from one to the other, since rename only works within a mountpoint. It's indeed much less interesting than is-same-file, but I added it because it shouldn't be hard to implement, and because if we're going to make this change work, I want to do everything we can, even little things, to avoid as much breakage as we can.

A file id containing a (dev_t, ino_t) would still have the global device number space problem, would still be inefficient and/or awkward to implement on platforms that don't have inodes (eg. mentioned here), and it would still have the information leak problem. Another problem that I don't remember having mentioned before is that Windows' ReFS requires 128 bits for a full file identifier. Another is the the ID for a file can change over time and if we return the ID as integer values, those values could become stale.

This is a big change, to be sure. I didn't initially think this would be a good idea and I still have reservations. I don't know for sure that we can do this without hitting some compatibility constraint that we won't be willing to break. But if we can manage to make this change, it'll make the platform significantly simpler, and make it easier to implement WASI in more interesting ways.

@yamt
Copy link

yamt commented Jan 14, 2023

is-same-mountpoint is about whether it's possible to do a rename from one to the other, since rename only works within a mountpoint. It's indeed much less interesting than is-same-file, but I added it because it shouldn't be hard to implement, and because if we're going to make this change work, I want to do everything we can, even little things, to avoid as much breakage as we can.

i think standards allow rename and hard links among filesystems.
i have seen such implementations.

A file id containing a (dev_t, ino_t) would still have the global device number space problem, would still be inefficient and/or awkward to implement on platforms that don't have inodes (eg. mentioned here), and it would still have the information leak problem.

that's why i mentioned the "this file doesn't have id" case.

Another problem that I don't remember having mentioned before is that Windows' ReFS requires 128 bits for a full file identifier.

we can make "wasi fileid" big enough.

Another is the the ID for a file can change over time and if we return the ID as integer values, those values could become stale.

do you mean the notes on FAT in the page?

This is a big change, to be sure. I didn't initially think this would be a good idea and I still have reservations. I don't know for sure that we can do this without hitting some compatibility constraint that we won't be willing to break. But if we can manage to make this change, it'll make the platform significantly simpler, and make it easier to implement WASI in more interesting ways.

i feel a bit ambivalent too.

@codefromthecrypt
Copy link

@jerbob92 since you are an ultimate end user here, any thoughts on above?

@sunfishcode
Copy link
Member Author

is-same-mountpoint is about whether it's possible to do a rename from one to the other, since rename only works within a mountpoint. It's indeed much less interesting than is-same-file, but I added it because it shouldn't be hard to implement, and because if we're going to make this change work, I want to do everything we can, even little things, to avoid as much breakage as we can.

i think standards allow rename and hard links among filesystems. i have seen such implementations.

It's a good point. And since it wasn't that useful anyway, I've now just removed is-same-mountpoint. If we think of any other predicates that would be useful, we can always add them.

While thinking about this, I also renamed is-same-file to is-same-object, since it works for things that aren't regular files too.

A file id containing a (dev_t, ino_t) would still have the global device number space problem, would still be inefficient and/or awkward to implement on platforms that don't have inodes (eg. mentioned here), and it would still have the information leak problem.

that's why i mentioned the "this file doesn't have id" case.

Ah, right. So we'd only have two of those three problems :-).

Another problem that I don't remember having mentioned before is that Windows' ReFS requires 128 bits for a full file identifier.

we can make "wasi fileid" big enough.

That is true; 128 bits for an ino_t would be a little unwieldy, but it is doable.

Another is the the ID for a file can change over time and if we return the ID as integer values, those values could become stale.

do you mean the notes on FAT in the page?

Yes; I should have linked to #remaks in that page. I've now fixed the link in my comment above.

@codefromthecrypt
Copy link

another aside: wazero team are starting to try to pass the wasi-testsuite, which would be maybe affected by any decision here. Since the test filesystem is a real FS, we can leak the real inodes to pass it. If we get to a point where wasm is rather expensive on account of inodes we may need a flag to allow folks to opt into "extended stat" or something while this topic settles.

$ wazero run -hostlogging filesystem -mount=/Users/adrian/oss/wasi-testsuite/tests/c/testsuite:/ fdopendir-with-access.wasm
==> wasi_snapshot_preview1.fd_prestat_get(fd=3)
<== (prestat={pr_name_len=1},errno=ESUCCESS)
==> wasi_snapshot_preview1.fd_prestat_dir_name(fd=3)
<== (path=/,errno=ESUCCESS)
==> wasi_snapshot_preview1.fd_prestat_get(fd=4)
<== (prestat=,errno=EBADF)
==> wasi_snapshot_preview1.fd_fdstat_get(fd=3)
<== (stat={filetype=DIRECTORY,fdflags=,fs_rights_base=,fs_rights_inheriting=},errno=ESUCCESS)
==> wasi_snapshot_preview1.path_open(fd=3,dirflags=SYMLINK_FOLLOW,path=fs-tests.dir/fopendir.dir,oflags=DIRECTORY,fs_rights_base=,fs_rights_inheriting=,fdflags=)
<== (opened_fd=4,errno=ESUCCESS)
==> wasi_snapshot_preview1.fd_readdir(fd=4,buf=70048,buf_len=4096,cookie=0,result.bufused=70028)
<== errno=ESUCCESS
==> wasi_snapshot_preview1.path_filestat_get(fd=4,flags=,path=file-1)
<== (filestat={filetype=REGULAR_FILE,size=0,mtim=1671602425719797759},errno=ESUCCESS)
==> wasi_snapshot_preview1.path_filestat_get(fd=4,flags=,path=file-0)
<== (filestat={filetype=REGULAR_FILE,size=0,mtim=1671602425719712300},errno=ESUCCESS)
==> wasi_snapshot_preview1.fd_close(fd=4)
<== errno=ESUCCESS
Assertion failed: inodes[0] != inodes[1] (testsuite/fdopendir-with-access.c: main: 47)
error instantiating wasm binary: module[] function[_start] failed: wasm error: unreachable
wasm stack trace:
	.abort()
	.__assert_fail(i32,i32,i32,i32)
	.main(i32,i32) i32
	.__main_void() i32

@sunfishcode sunfishcode force-pushed the sunfishcode/no-dev-ino branch from a4d79c8 to 54ade91 Compare January 18, 2023 16:32
@whentze
Copy link

whentze commented Jan 18, 2023

Another data point: Linux' overlayfs can exhibit changing st_ino and also st_dev for the same filesystem object. You can mitigate this somewhat by mounting with xino=on but that has a (small) performance cost and can also not 100% guarantee stable inodes.

edit: That is to say, because overlayfs might be used by wasi implementations (it is a very popular option for fs virtualization), this is perhaps an argument to not expose these ids to WASI.

@sunfishcode
Copy link
Member Author

From Mastodon, this reply points out that there are applications which use the inode number to supplement their "has this file changed" checks:

To address this use case, I've now added a metadata-hash function to the proposal, which takes a file descriptor and returns a 128-bit hash code of whichever of the dev, inode, mtime, ctime, birth-time, size, and other relevant metadata the host may have. That's not a drop-in replacement for dev/ino, but in theory it should support the same kinds of use cases. And it doesn't require inode numbers, doesn't imply a global namespace, and with a sufficient hash function doesn't leak information.

@bjorn3
Copy link

bjorn3 commented Jan 19, 2023

Another data point: Linux' overlayfs can exhibit changing st_ino and also st_dev for the same filesystem object. You can mitigate this somewhat by mounting with xino=on but that has a (small) performance cost and can also not 100% guarantee stable inodes.

And as yet another data point, Btrfs as mounted over NFS can reuse the same device,inode pairs. Each Btrfs subvolume independently handles inode numbering. While it locally presents a different device number for each subvolume, when mounted over NFS they all get the same device number, thus resulting in potential inode duplication. It is also not allowed to rename across subvolumes. https://lwn.net/Articles/866582/

Has removing hardlinks entirely been considered? Or is that too much of a break with POSIX?

@sunfishcode
Copy link
Member Author

And as yet another data point, Btrfs as mounted over NFS can reuse the same device,inode pairs. Each Btrfs subvolume independently handles inode numbering. While it locally presents a different device number for each subvolume, when mounted over NFS they all get the same device number, thus resulting in potential inode duplication. It is also not allowed to rename across subvolumes. https://lwn.net/Articles/866582/

I do think that there are times where we just have to say to users "if you care about what happens, consider not using NFS" rather than trying to have WASI or WASI applications try to superheroically fix it.

Has removing hardlinks entirely been considered? Or is that too much of a break with POSIX?

I don't recall anyone having proposed this yet. I don't know that we need to prohibit it, but it would be good to at least document that one generally shouldn't depend on it succeeding. I've filed #83 to track this.

@yamt
Copy link

yamt commented Jan 20, 2023

do you mean the notes on FAT in the page?

Yes; I should have linked to #remaks in that page. I've now fixed the link in my comment above.

it's a bit moot as FAT doesn't have hardlinks.

wasi-filesystem.wit.md Outdated Show resolved Hide resolved
## `metadata-hash`
```wit
/// Return a hash of the metadata associated with a filesystem object referred
/// to by a descriptor.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my honest impression is that it belongs to applications.
with the current description, i suspect an application will need to combine it with other properties like timestamp etc by itself anyway.
i feel it's simpler to provide get-fileid-at.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would indeed be simpler, but it'd have the problems discussed above: not all desirable impls have a meaningful fileid, and an exposed fileid leaks host information. Also, a fileid by itself isn't unique, and is often expected to be paired with a deviceid, but that would require a deviceid namespace. The nice thing about metadata-hash is that it avoids assigning any meaning to the number, so it can be computed in different ways depending on what the host implementation has.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, a fileid by itself isn't unique, and is often expected to be paired with a deviceid, but that would require a deviceid namespace.

I was interpeting the proposal as fileid being a combination of the device id and inode number.

The nice thing about metadata-hash is that it avoids assigning any meaning to the number, so it can be computed in different ways depending on what the host implementation has.

A fileid could be a hash of everything that stays fixed for the lifetime of a file too, or assigned through any other consistent scheme.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the fileid is the device id and inode number, then how would it avoid the problems at the top of this issue?

If the fileid is a hash of everything that stays fixed for the lifetime of a file, how is that different from the metadata-hash I just proposed?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the fileid is the device id and inode number, then how would it avoid the problems at the top of this issue?

Hashing them together combined with a secret value specified by the wasi runtime that is unknown to the wasm module (such that brute-forcing a map from hash to input is not possible) would avoid metadata leakage. As for some filesystems not having inodes, that is just as much an issue with metadata-hash as all other metadata can be non-unique between two different files. Also in both cases it would be possible to have the wasi runtime assign the id/hash based on eg the file path.

If the fileid is a hash of everything that stays fixed for the lifetime of a file, how is that different from the metadata-hash I just proposed?

The metadata hash you specified includes the mtime, which can change without changing the identity of the file.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've now added documentation mentioning that a secret value known only to the implementation may be used.

@sunfishcode sunfishcode force-pushed the sunfishcode/no-dev-ino branch 2 times, most recently from 52a1297 to 6599875 Compare May 15, 2023 20:25
@sunfishcode
Copy link
Member Author

I've now rebased this, including updating it to the new proposal repo format. I believe I've addressed all the feedback above, and that there were no show-stopping concerns. This PR is a tradeoff, and some things will be harder, but I think overall it's the right approach.

I'll leave this open a little longer here in case anyone has any further feedback.

@pchickey
Copy link
Contributor

Style nit: tuple<u64, u64> isn't immediately apparent to me as representing one single hash. Could we instead use a record metadata-hash { upper: u64, lower: u64 } and put the doc comments on the record definition explaining that morally this is a 128 bit hash?

@sunfishcode
Copy link
Member Author

Sure, I've now added a patch for that.

@codefromthecrypt
Copy link

Is there any way to know how this is likely to be used in wasi-libc? For example, the root thing was trying to avoid a fan-out of stat, and how this doesn't turn into a fan-out of hash based metadata fetching.

It is both hard to process the impact from ABI (mentally compile this into the canonical one) and also integration, except folks who know exactly how these signatures will end up and any change in the cardinality of host callbacks. So, if you've already mentally compiled this, can you please share?

@sunfishcode
Copy link
Member Author

Is there any way to know how this is likely to be used in wasi-libc? For example, the root thing was trying to avoid a fan-out of stat, and how this doesn't turn into a fan-out of hash based metadata fetching.

I expect wasi-libc will remove its st_ino and st_dev fields from it stat structure. We may also add an imperfect POSIX compatibility mechanism on top using C macros that allows users to write .st_ino and .st_dev and have them map to fields containing zeros.

It is both hard to process the impact from ABI (mentally compile this into the canonical one) and also integration, except folks who know exactly how these signatures will end up and any change in the cardinality of host callbacks. So, if you've already mentally compiled this, can you please share?

I haven't mentally compiled an ABI for this. When doing API design with Wit, I find it most useful to just think in terms of Wit, not in terms of the various ABIs that may be generated from it.

As discussed [here], remove the fields which correspond to `st_dev`, `st_ino`,
and `d_ino` in POSIX from the stat and directory entry structs.

 - Device numbers assume the existence of a global device number space,
   which creates implicit relationships between otherwise unrelated
   components.

 - Not all filesystem implementations have these numbers. And some that
   do have these numbers require extra implementation cost to retrieve them.

 - These numbers leak potentially sensitive or identifying information from the
   underlying filesystem implementation.

In their place, provide some functions, `is-same-object`,
`metadata-hash`, and `metadata-hash-at`, for explicitly testing whether two
handles are the same file or have the same metadata, respectively. This doesn't
cover all possible use cases for device and inode numbers, but we can
add more functions as need arises.

[here]: #65 (comment)
As discussed [here], remove the fields which correspond to `st_dev`, `st_ino`,
and `d_ino` in POSIX from the stat and directory entry structs.

 - Device numbers assume the existence of a global device number space,
   which creates implicit relationships between otherwise unrelated
   components.

 - Not all filesystem implementations have these numbers. And some that
   do have these numbers require extra implementation cost to retrieve them.

 - These numbers leak potentially sensitive or identifying information from the
   underlying filesystem implementation.

In their place, provide some functions, `is-same-object`,
`metadata-hash`, and `metadata-hash-at`, for explicitly testing whether two
handles are the same file or have the same metadata, respectively. This doesn't
cover all possible use cases for device and inode numbers, but we can
add more functions as need arises.

[here]: #65 (comment)
@sunfishcode sunfishcode force-pushed the sunfishcode/no-dev-ino branch 3 times, most recently from 72ace35 to 1728295 Compare July 20, 2023 18:24
@sunfishcode sunfishcode force-pushed the sunfishcode/no-dev-ino branch from 1728295 to 58e1ea5 Compare July 20, 2023 18:28
@sunfishcode
Copy link
Member Author

Since there seem to be no major outstanding objections, I think it makes sense to merge this. Wasi-filesystem is still in Phase 2, so there's still time to make changes if any concerns come up; please file new issues!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants