store dat header and footer blocks as datasets instead of attributes #9

Open
trautmane opened this issue Feb 16, 2023 · 2 comments

@trautmane

I get the following error when trying to convert a "padded" v9 dat:

  File "/Users/trautmane/opt/miniconda3/envs/janelia_emrp/lib/python3.9/site-packages/jeiss_convert/hdf5.py", line 66, in dat_to_hdf5
    g.attrs.update(meta)
  File "/Users/trautmane/opt/miniconda3/envs/janelia_emrp/lib/python3.9/_collections_abc.py", line 941, in update
    self[key] = other[key]
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "/Users/trautmane/opt/miniconda3/envs/janelia_emrp/lib/python3.9/site-packages/h5py/_hl/attrs.py", line 103, in __setitem__
    self.create(name, data=value)
  File "/Users/trautmane/opt/miniconda3/envs/janelia_emrp/lib/python3.9/site-packages/h5py/_hl/attrs.py", line 196, in create
    attr = h5a.create(self._id, self._e(tempname), htype, space)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5a.pyx", line 50, in h5py.h5a.create
RuntimeError: Unable to create attribute (object header message is too large)

I believe the error occurs because jeiss_convert tries to store the footer as an HDF5 attribute, and there is a 64 KiB limit on attribute values. In my prototype HDF5 code, I stored both the footer and header as datasets instead of attributes to work around this issue. I suggest that jeiss_convert do the same.
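A minimal sketch of the dataset-based approach, assuming h5py and numpy are available. The helper name `write_raw_blocks`, the group name `dat`, the `_header`/`_footer` dataset names, and the block sizes are all placeholders for illustration, not what jeiss_convert actually does:

```python
import h5py
import numpy as np


def write_raw_blocks(group, header: bytes, footer: bytes) -> None:
    """Store the raw header and footer bytes as 1-D uint8 datasets."""
    # The leading underscore is a hypothetical naming convention that
    # makes these easy to skip when iterating over channel datasets.
    group.create_dataset("_header", data=np.frombuffer(header, dtype=np.uint8))
    group.create_dataset("_footer", data=np.frombuffer(footer, dtype=np.uint8))


# In-memory file so the sketch leaves nothing on disk.
with h5py.File("example.h5", "w", driver="core", backing_store=False) as f:
    g = f.create_group("dat")
    header = bytes(1024)       # placeholder size
    footer = bytes(200_000)    # placeholder size, well over 64 KiB
    write_raw_blocks(g, header, footer)

    # Read the footer back and list everything that is not a raw block.
    footer_roundtrip = g["_footer"][:].tobytes()
    channel_names = [name for name in g if not name.startswith("_")]
```

Datasets are not subject to the attribute size limit, so the footer can grow with the padding without hitting the error above.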

The "padded" dat files are a special case we discovered at Janelia last year when we happened to get a data set with tiles that were 8250 pixels in height. In an August 17, 2022 email, Shan explained:

The YResolution is calculated based on the Y dimension (in µm) and pixel size (in nm). However, it is rounded to multiples of 4 during acquisition due to some issues to synchronize the high speed NI cards. The dat to tif script will exclude those additional lines.

If the padded lines get included in the footer block (which makes sense to me), the footer is too big to store as an hdf5 attribute.

To help with testing, I have uploaded the following two v9 dat files (from different data sets) to HHMI's OneDrive:
Merlin-6284_22-07-15_000050_0-0-0.dat : 96MB, 5000 x 5000, height divisible by 4, works with jeiss_convert
Merlin-6262_22-06-15_155134_0-0-0.dat : 288MB, 9125 x 8250, height NOT divisible by 4, breaks jeiss_convert

I'm not sure if the links are permanent or if they expire at some point - so you may want to download the files sooner rather than later. Let me know if you need me to generate new links.

I'm also guessing that the padding occurs in v8 as well - but I'm not sure about that.

@clbarnes
Owner

Good point, I was tossing up whether to include the header/footer as attributes or datasets. I settled on attributes only because it made it marginally more convenient to iterate through channels; but if the dataset names are prepended with _ then they're easy enough to exclude, and this iteration behaviour would be broken by the CSV-derived additional_metadata anyway.

That padding is not something I'd come across, and good to know about. It should be added to the jeiss-specs README. I'm not entirely clear on where the rounding comes in - do the YResolution and XResolution attributes still correctly describe the true image size, with no padding visible if you read ChannelNum * YResolution * XResolution bytes? But the length of the reserved memory block between the end of the header and the start of the recipe is ChannelNum * round_up_to_multiple_of_4(YResolution) * round_up_to_multiple_of_4(XResolution)?

If we do include the padding in the footer, it would also be nice to include the offset into that footer at which the recipe starts (which presumably can be calculated with ChannelNum, YResolution, XResolution, and FileLength), just in case anyone has a need/method for decoding it.
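As a rough sketch of the arithmetic in question, the following compares the unpadded and padded sample counts under the hypothesis above (both dimensions rounded up to a multiple of 4), which has not been confirmed. The function names are hypothetical, `channel_num=1` is a placeholder, and counts are in samples, so they would need multiplying by the sample size to get bytes:

```python
def round_up_to_multiple(n: int, multiple: int = 4) -> int:
    """Smallest multiple of `multiple` that is >= n."""
    return -(-n // multiple) * multiple


def sample_counts(channel_num: int, x_res: int, y_res: int) -> tuple[int, int]:
    """Return (unpadded, padded) sample counts for the image block.

    The padded formula is the unconfirmed hypothesis from this thread.
    """
    unpadded = channel_num * y_res * x_res
    padded = channel_num * round_up_to_multiple(y_res) * round_up_to_multiple(x_res)
    return unpadded, padded


# Dimensions of the failing test file; the channel count is a placeholder.
unpadded, padded = sample_counts(channel_num=1, x_res=9125, y_res=8250)
padding_samples = padded - unpadded
```

If the footer is taken to begin right after the unpadded image data, `padding_samples` (scaled to bytes) would be the offset of the recipe within the footer, again only under this hypothesis.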

clbarnes added a commit that referenced this issue Feb 16, 2023
This fixes a problem with very long footers caused by padded image
channels; see #9.

Also stores the names of present channels as "AINames", see #10
@clbarnes
Owner

Fixed in 6f912da
