Relax requirements regarding FileChannel #681

Open
tbrunsch opened this issue Feb 8, 2025 · 2 comments

tbrunsch commented Feb 8, 2025

Current State

jHDF imposes two notable requirements regarding FileSystem implementations:

  1. The constructor HdfFile(Path) assumes that FileSystemProvider.newFileChannel(...) (implicitly called via FileChannel.open(Path, OpenOption...)) is implemented and returns a FileChannel.

  2. HdfFileChannel.mapNoOffset(long, long, MapMode) assumes that FileChannel.map(MapMode, long, long) throws an UnsupportedOperationException if it does not support memory mapping.

Not all FileSystem implementations meet these requirements:

  1. The default implementation of FileSystemProvider.newFileChannel(...) throws an UnsupportedOperationException. There are FileSystemProvider implementations that don't override this method because they don't or can't provide a FileChannel. The in-memory file system Jimfs provides a way to simulate the absence of FileChannel support (cf. Feature.FILE_CHANNEL); see the sketch after this list.

  2. There are FileChannel implementations that don't support memory mapping and locking. Some of them throw an IOException in map(...), lock(...), and tryLock(...). One example is the S3FileChannel of the de-facto standard NIO file system implementation for the S3 file system.
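
For illustration, the first of these failure modes can be reproduced with Jimfs. The following is a minimal sketch (the class name and file name are my own, and it assumes Jimfs is on the classpath) that configures a Unix-style file system without Feature.FILE_CHANNEL; the call that HdfFile(Path) relies on then fails, while a SeekableByteChannel is still available:

```java
import com.google.common.jimfs.Configuration;
import com.google.common.jimfs.Feature;
import com.google.common.jimfs.Jimfs;

import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.SeekableByteChannel;
import java.nio.file.FileSystem;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class NoFileChannelDemo {

    public static void main(String[] args) throws IOException {
        // Unix-style Jimfs configuration with every feature except FILE_CHANNEL
        Configuration config = Configuration.unix().toBuilder()
                .setSupportedFeatures(Feature.LINKS, Feature.SYMBOLIC_LINKS, Feature.SECURE_DIRECTORY_STREAM)
                .build();

        try (FileSystem fs = Jimfs.newFileSystem(config)) {
            Path file = fs.getPath("/example.hdf5"); // hypothetical test file
            Files.write(file, new byte[] {0});

            try {
                // Roughly what HdfFile(Path) does today
                FileChannel.open(file, StandardOpenOption.READ).close();
            } catch (UnsupportedOperationException e) {
                // Providers without FileChannel support end up here
                System.out.println("FileChannel not supported: " + e);
            }

            // A SeekableByteChannel can still be obtained and used as a fallback
            try (SeekableByteChannel channel = Files.newByteChannel(file, StandardOpenOption.READ)) {
                System.out.println("SeekableByteChannel size: " + channel.size());
            }
        }
    }
}
```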

Proposal

Neither requirement is necessary for jHDF to operate:

  1. When one cannot obtain a FileChannel from the FileSystemProvider, one has to fall back to the SeekableByteChannel returned by FileSystemProvider.newByteChannel(...). A SeekableByteChannel offers essentially the same functionality as a FileChannel, except that it does not support memory mapping and locking. Although jHDF exploits memory mapping where possible, it already has a fallback for the case that the FileChannel does not support memory mapping (see next point). Hence, in principle, jHDF is already able to work with a SeekableByteChannel. I propose to continue working with FileChannel; however, if the creation of a FileChannel fails, then jHDF should retrieve and work with a SeekableByteChannel. I would further propose to wrap this in a new FileChannel implementation that delegates whatever it can to the SeekableByteChannel, similar to the implementation of the S3FileChannel (a sketch of such a wrapper follows this list). That way, jHDF could continue working with FileChannel, the jHDF code would only need to be adjusted in a single place, and file systems with FileChannel implementations would not incur any performance penalty (e.g. from instanceof checks or casts).

  2. As already mentioned, jHDF provides a fallback if memory mapping is not supported. However, this fallback is only triggered if the FileChannel.map(...) implementation throws an UnsupportedOperationException. I propose to use the fallback also if that method throws an IOException (see the second sketch after this list): this covers the S3FileChannel use case, and possibly other cases in which the IOException has a different meaning. I think that using the fallback in these cases won't do any harm.
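
To illustrate the first proposal, below is a minimal sketch of a FileChannel wrapper that delegates to a SeekableByteChannel. The class name, the positional read/write emulation, and the decision to report transfers, memory mapping, and locking as unsupported are assumptions of this sketch, not jHDF's or the PR's actual code:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.SeekableByteChannel;
import java.nio.channels.WritableByteChannel;

// Sketch of a FileChannel that delegates everything it can to a SeekableByteChannel.
// Memory mapping and locking are reported as unsupported so that jHDF's existing
// fallback path is taken.
public class SeekableByteChannelFileChannel extends FileChannel {

    private final SeekableByteChannel delegate;

    public SeekableByteChannelFileChannel(SeekableByteChannel delegate) {
        this.delegate = delegate;
    }

    @Override public int read(ByteBuffer dst) throws IOException { return delegate.read(dst); }
    @Override public int write(ByteBuffer src) throws IOException { return delegate.write(src); }
    @Override public long position() throws IOException { return delegate.position(); }
    @Override public FileChannel position(long newPosition) throws IOException { delegate.position(newPosition); return this; }
    @Override public long size() throws IOException { return delegate.size(); }
    @Override public FileChannel truncate(long size) throws IOException { delegate.truncate(size); return this; }
    @Override public void force(boolean metaData) { /* nothing to flush for a plain SeekableByteChannel */ }

    @Override
    public long read(ByteBuffer[] dsts, int offset, int length) throws IOException {
        long total = 0;
        for (int i = offset; i < offset + length; i++) {
            int read = delegate.read(dsts[i]);
            if (read < 0) {
                return total == 0 ? -1 : total;
            }
            total += read;
        }
        return total;
    }

    @Override
    public long write(ByteBuffer[] srcs, int offset, int length) throws IOException {
        long total = 0;
        for (int i = offset; i < offset + length; i++) {
            total += delegate.write(srcs[i]);
        }
        return total;
    }

    // Positional reads/writes are emulated by saving and restoring the position.
    @Override
    public int read(ByteBuffer dst, long position) throws IOException {
        long previous = delegate.position();
        try {
            delegate.position(position);
            return delegate.read(dst);
        } finally {
            delegate.position(previous);
        }
    }

    @Override
    public int write(ByteBuffer src, long position) throws IOException {
        long previous = delegate.position();
        try {
            delegate.position(position);
            return delegate.write(src);
        } finally {
            delegate.position(previous);
        }
    }

    // Not needed for jHDF's read path, so this sketch leaves them unsupported.
    @Override public long transferTo(long position, long count, WritableByteChannel target) { throw new UnsupportedOperationException(); }
    @Override public long transferFrom(ReadableByteChannel src, long position, long count) { throw new UnsupportedOperationException(); }

    // Exactly the operations a SeekableByteChannel cannot provide.
    @Override public MappedByteBuffer map(MapMode mode, long position, long size) { throw new UnsupportedOperationException("Memory mapping is not supported"); }
    @Override public FileLock lock(long position, long size, boolean shared) { throw new UnsupportedOperationException("Locking is not supported"); }
    @Override public FileLock tryLock(long position, long size, boolean shared) { throw new UnsupportedOperationException("Locking is not supported"); }

    @Override protected void implCloseChannel() throws IOException { delegate.close(); }
}
```

With such a wrapper, HdfFile(Path) could catch the UnsupportedOperationException from FileChannel.open(...), open a SeekableByteChannel via Files.newByteChannel(...) instead, and wrap it, so the rest of jHDF keeps seeing a FileChannel.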
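
To illustrate the second proposal, the change essentially amounts to catching IOException in addition to UnsupportedOperationException around the map(...) call. The helper below is a hypothetical stand-in for HdfFileChannel.mapNoOffset(long, long, MapMode); names and details are illustrative, not jHDF's actual implementation:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileChannel.MapMode;

final class MapFallbackSketch {

    // Hypothetical stand-in for HdfFileChannel.mapNoOffset(long, long, MapMode)
    static ByteBuffer mapNoOffset(FileChannel fileChannel, long position, long length, MapMode mode) {
        try {
            return fileChannel.map(mode, position, length);
        } catch (UnsupportedOperationException | IOException e) {
            // Also reached for channels like S3FileChannel, which throw IOException from map(...)
            try {
                ByteBuffer buffer = ByteBuffer.allocate(Math.toIntExact(length));
                // A real implementation would loop until the buffer is completely filled
                fileChannel.read(buffer, position);
                buffer.rewind();
                return buffer;
            } catch (IOException readFailure) {
                throw new UncheckedIOException(readFailure);
            }
        }
    }
}
```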

tbrunsch (Author) commented Feb 8, 2025

I have created this pull request containing my proposal. The PR also contains several changes to the test classes:

  • I have modified NioPathTest to work with Jimfs instead of a ZipFileSystem. That way I can test that jHDF works both with file systems that support FileChannel and with file systems that don't.

  • Not all tests passed locally. The reason was that I had checked out jHDF into a directory whose path contained spaces. All tests that called this.getClass().getResource("...").getFile() failed because the space encoded in the URL ("%20") was not decoded when calling URL.getFile(), at least with my JDK. As a remedy, I have introduced the utility methods getTestFile(String), getTestPath(String), and getTestUri(String) in TestUtils to reference test files correctly and consistently (a sketch of the idea follows below).
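
For illustration, one common way to avoid the "%20" problem is to convert the resource URL to a URI before turning it into a Path, instead of calling URL.getFile(). The helper below is only a sketch of that idea; the method name matches the one mentioned above, but the body is my assumption, not necessarily the PR's implementation:

```java
import java.net.URISyntaxException;
import java.net.URL;
import java.nio.file.Path;
import java.nio.file.Paths;

final class TestUtilsSketch {

    // Hypothetical implementation of a getTestPath(String) helper. Assumes the
    // resource lives on the regular file system (i.e. not inside a jar).
    static Path getTestPath(String resource) {
        URL url = TestUtilsSketch.class.getResource(resource);
        try {
            // URL.getFile() keeps escapes such as "%20"; going through URI decodes them
            return Paths.get(url.toURI());
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Cannot resolve test resource " + resource, e);
        }
    }
}
```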

jamesmudd (Owner) commented

This sounds like a very useful improvement, and your proposal seems sensible.

Thanks for putting together a PR. I will take a look as soon as possible.
