Multi series and ZCT support #28

Merged: 29 commits, Apr 21, 2020

Member

@chris-allan chris-allan commented Apr 20, 2020

Implementation of multi-series and ZCT support in accordance with #25

  • Using a generic data.n5 or similar name for the N5 data
  • Dispensing with the secondary JPEG compressed LABELIMAGE.jpg, <series_no>.jpg, etc. files and encoding all series from the source file in ascending stringified order
  • Expanding the number of dimensions to 5 (X, Y, C, Z, T), following the dimension order declared by Bio-Formats and recorded in the OME-XML, and setting the size of any dimension that is missing entirely to 1
  • Adding version metadata in a defined location to aid downstream consumers
  • Adding an option to force the dimension order
  • Using data.n5, absent any dissenting opinions
  • Option to force dimension order is --dimension-order
  • Layout version is currently set to 1 and is available on the root (/) group at the bioformats2raw.layout key
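
The five-dimension rule in the list above can be illustrated with a small sketch. It is written in Python for brevity and is not the converter's Java code; the function name `expand_to_5d` and the example sizes are hypothetical.

```python
from typing import Dict, List

ALL_DIMENSIONS = "XYZCT"


def expand_to_5d(sizes: Dict[str, int], dimension_order: str) -> List[int]:
    """Return five dimension sizes in `dimension_order`, using 1 for any missing axis."""
    # The order must name each of X, Y, Z, C and T exactly once.
    if sorted(dimension_order) != sorted(ALL_DIMENSIONS):
        raise ValueError(f"not a 5D dimension order: {dimension_order!r}")
    return [sizes.get(axis, 1) for axis in dimension_order]


# A source with no Z or T information still yields five dimensions.
print(expand_to_5d({"X": 512, "Y": 512, "C": 3}, "XYCZT"))  # [512, 512, 3, 1, 1]
```

Validating the order up front means a partial order such as "XYZ" is rejected instead of silently producing fewer than five dimensions.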

@chris-allan
Member Author

@joshmoore: Ready for a quick review pass. Probably needs to have bytewise tests added to ensure the right data is going to the right place now.

Member

@melissalinkert melissalinkert left a comment

Running a simple test with an output directory that hasn't been created:

$ touch "test&sizeZ=20&sizeT=10&sizeC=5.fake"
$ bin/bioformats2raw test\&sizeZ\=20\&sizeT\=10\&sizeC\=5.fake multi-zct-test
2020-04-20 09:40:30,457 [main] INFO  loci.formats.ImageReader - FakeReader initializing test&sizeZ=20&sizeT=10&sizeC=5.fake
Exception in thread "main" picocli.CommandLine$ExecutionException: Error while calling command (com.glencoesoftware.bioformats2raw.Converter@32502377): java.nio.file.NoSuchFileException: multi-zct-test/METADATA.ome.xml
	at picocli.CommandLine.execute(CommandLine.java:1180)
	at picocli.CommandLine.access$800(CommandLine.java:141)
	at picocli.CommandLine$RunLast.handle(CommandLine.java:1367)
	at picocli.CommandLine$RunLast.handle(CommandLine.java:1335)
	at picocli.CommandLine$AbstractParseResultHandler.handleParseResult(CommandLine.java:1243)
	at picocli.CommandLine.parseWithHandlers(CommandLine.java:1526)
	at picocli.CommandLine.call(CommandLine.java:1786)
	at picocli.CommandLine.call(CommandLine.java:1710)
	at com.glencoesoftware.bioformats2raw.Converter.main(Converter.java:871)
Caused by: java.nio.file.NoSuchFileException: multi-zct-test/METADATA.ome.xml
	at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
	at sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:214)
	at java.nio.file.spi.FileSystemProvider.newOutputStream(FileSystemProvider.java:434)
	at java.nio.file.Files.newOutputStream(Files.java:216)
	at java.nio.file.Files.write(Files.java:3292)
	at com.glencoesoftware.bioformats2raw.Converter.convert(Converter.java:410)
	at com.glencoesoftware.bioformats2raw.Converter.call(Converter.java:322)
	at com.glencoesoftware.bioformats2raw.Converter.call(Converter.java:81)
	at picocli.CommandLine.execute(CommandLine.java:1173)
	... 8 more

Running the same test after mkdir multi-zct-test seems to work.
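
For illustration, the failure above is the usual "write into a directory that does not exist yet" pattern. The sketch below shows the general remedy in Python, not the converter's actual fix; in Java the corresponding standard call would be `java.nio.file.Files.createDirectories`. The helper name `write_metadata` is hypothetical.

```python
import tempfile
from pathlib import Path
from typing import Union


def write_metadata(output_dir: Union[str, Path], xml: str,
                   name: str = "METADATA.ome.xml") -> Path:
    """Write `xml` into `output_dir`, creating the directory first if needed."""
    target_dir = Path(output_dir)
    # Create the directory and any missing parents; no error if it already exists.
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / name
    target.write_text(xml, encoding="utf-8")
    return target


# The output directory does not exist before the call, as in the failing run.
with tempfile.TemporaryDirectory() as scratch:
    written = write_metadata(Path(scratch) / "multi-zct-test", "<OME/>")
    print(written.name)  # METADATA.ome.xml
```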

Testing with https://downloads.openmicroscopy.org/images/OME-TIFF/2016-06/tubhiswt-4D/, which has an original dimension order of XYZTC, 2 channels, 10 Z, 43 timepoints:

$ bin/bioformats2raw ~/data/bf-data-repo/automated-tests/curated/ome-tiff/public/2016-06/tubhiswt-4D/tubhiswt_C0_TP0.ome.tif ~/data/tubhiswt-test/
$ cd ~/data/tubhiswt-test
$ xmllint --format METADATA.ome.xml | grep DimensionOrder
    <Pixels BigEndian="true" DimensionOrder="XYZTC" ID="Pixels:0" Interleaved="false" SignificantBits="8" SizeC="2" SizeT="43" SizeX="512" SizeY="512" SizeZ="10" Type="uint8">
$ cd data.n5/0/0
$ cat attributes.json
{"dataType":"uint8","compression":{"type":"blosc","clevel":5,"blocksize":0,"cname":"lz4","nthreads":1,"shuffle":1},"blockSize":[512,512,1,1,1],"dimensions":[512,512,10,2,43]}
$ ls 0/0
0  1  2  3  4  5  6  7  8  9
$ ls 0/0/0
0  1
$ ls 0/0/0/0
0  1  10  11  12  13  14  15  16  17  18  19  2  20  21  22  23  24  25  26  27  28  29  3  30  31  32  33  34  35  36  37  38  39  4  40  41  42  5  6  7  8  9

My concern is that the layout on disk is XYZCT, but the dimension order in the METADATA.ome.xml is XYZTC as expected based upon the original dataset's metadata.
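
The mismatch can be checked mechanically. The following sketch (Python; the helper name `dimensions_for_order` is hypothetical) takes the sizes from the `Pixels` element above and compares them with the `dimensions` array in `attributes.json`, assuming that array lists the axes fastest-varying first.

```python
from typing import Dict, List


def dimensions_for_order(sizes: Dict[str, int], order: str) -> List[int]:
    """Return the sizes arranged in the given dimension order."""
    return [sizes[axis] for axis in order]


# Sizes from METADATA.ome.xml and dimensions from data.n5/0/0/attributes.json.
sizes = {"X": 512, "Y": 512, "Z": 10, "C": 2, "T": 43}
on_disk = [512, 512, 10, 2, 43]

print(dimensions_for_order(sizes, "XYZTC") == on_disk)  # False
print(dimensions_for_order(sizes, "XYZCT") == on_disk)  # True
```

Because X and Y have the same size here, the comparison cannot tell those two axes apart; it only distinguishes the Z, C and T positions, which is where the discrepancy lies.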

Contributor

@joshmoore joshmoore left a comment

Barring the same mkdir issue Melissa saw, the basic layout looks good. I'm looking into nested chunks with the n5-zarr library, but that may need to be postponed. I can also look into adding multiscale metadata in this PR or as a follow-up.

@chris-allan
Member Author

Directory issues should be resolved now. I just have a couple more tests I want to run to ensure that the tile encoding is correct.

@chris-allan
Member Author

Handling of tiles along the rightmost and bottommost edges, where the dimensions are not evenly divisible by the tile size, is currently broken. Opened saalfeldlab/n5-zarr#2 upstream as a fix.
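
To make the edge case concrete, here is a minimal sketch in Python of clipping the last tile along one axis. It only illustrates the arithmetic and is not the change proposed in saalfeldlab/n5-zarr#2; the function name `tile_extents` and the example sizes are hypothetical.

```python
from typing import Iterator, Tuple


def tile_extents(size: int, tile_size: int) -> Iterator[Tuple[int, int]]:
    """Yield (offset, length) for each tile along one axis, clipping the last tile."""
    if size <= 0 or tile_size <= 0:
        raise ValueError("size and tile_size must be positive")
    for offset in range(0, size, tile_size):
        # The final tile is shorter when size is not a multiple of tile_size.
        yield offset, min(tile_size, size - offset)


# 1100 is not divisible by 512, so the last tile is only 76 pixels wide.
print(list(tile_extents(1100, 512)))  # [(0, 512), (512, 512), (1024, 76)]
```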

/cc @joshmoore, @melissalinkert

@chris-allan
Member Author

Dimension orders are now preserved, with a command line option for overriding them.

@chris-allan
Member Author

Layout version is now available in the metadata. This is now largely feature complete against the specification in #25 and ready for final review before merging.
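
A downstream consumer could check the layout version along these lines. This is a sketch in Python under a stated assumption: that the root group's attributes are stored as a flat JSON object in an `attributes.json` file at the top of `data.n5`, with `bioformats2raw.layout` as a literal top-level key. The function name `read_layout_version` is hypothetical.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional, Union

LAYOUT_KEY = "bioformats2raw.layout"


def read_layout_version(attributes_file: Union[str, Path]) -> Optional[int]:
    """Return the layout version from a root attributes file, or None if absent."""
    with open(attributes_file, encoding="utf-8") as handle:
        attributes = json.load(handle)
    return attributes.get(LAYOUT_KEY)


# Demonstration against a stand-in attributes file (assumed layout, see above).
with tempfile.TemporaryDirectory() as scratch:
    root_attributes = Path(scratch) / "attributes.json"
    root_attributes.write_text(json.dumps({LAYOUT_KEY: 1}), encoding="utf-8")
    print(read_layout_version(root_attributes))  # 1
```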

@chris-allan
Member Author

Our version of n5-zarr (0.0.3-SNAPSHOT) is now going to repo.glencoesoftware.com; this PR depends on it, and all tests are now enabled.

@melissalinkert
Member

All looks much better. It might make sense to save the value of dimensionOrder (if not null) to the METADATA.ome.xml file for consistency, but I realize that might need further discussion so not a blocker to merging.
