Generating multiple readers with Bio-Formats can be very slow #865

Closed
petebankhead opened this issue Dec 15, 2021 · 1 comment · Fixed by #867

Comments

@petebankhead
Member

Bug report

Describe the bug
BioFormatsImageServer lazily creates a new reader for each thread as required. This is ok if the readers are fast to initialize, but can be a major bottleneck if they are not.

This was noticed when working with large CZI images (>30 GB). Initializing a reader took ~3-5s. However, because the method was synchronized and 32 threads were requesting pixels for the viewer, most of those threads were blocked waiting for their turn. This meant the image could not be viewed properly for well over a minute.

Once the readers were created, performance was fine.
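
To make the bottleneck concrete, here is a minimal sketch of the pattern described above, assuming one reader per thread created lazily inside a synchronized method. `PerThreadReaders` and `getReaderForThread` are hypothetical names for illustration, not the actual BioFormatsImageServer code, and the `Supplier` stands in for a slow Bio-Formats reader initialization.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical illustration only; not the real QuPath implementation.
class PerThreadReaders<R> {

    private final Map<Thread, R> readers = new HashMap<>();
    private final Supplier<R> factory; // stand-in for slow reader initialization

    PerThreadReaders(Supplier<R> factory) {
        this.factory = factory;
    }

    // The lock is held for the whole call, including factory.get().
    // If initialization takes seconds, every other thread waits here,
    // even a thread whose own reader was created long ago.
    synchronized R getReaderForThread() {
        return readers.computeIfAbsent(Thread.currentThread(), t -> factory.get());
    }
}
```

With N threads and an initialization time of T seconds, the last thread to get its reader waits roughly N × T seconds under this pattern, which is consistent with the delay of over a minute reported here.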

To Reproduce
Unfortunately, I'm not aware of any public images that can be used to test this. It might be evident with any large Axioscan images (I'm not entirely sure).

Once the image is open, zoom in and wait for tiles to appear. If experiencing the problem, this will take an unreasonable amount of time. VisualVM indicates that the bottleneck is initializing readers.

Expected behavior
No major delay in requesting tiles once the image has been opened.

Desktop (please complete the following information):

  • OS: macOS (but likely to be all)
  • QuPath Version 0.3.0

Screenshots
Compare the 'total time' (>100s) with the actual time spent using the CPU (3.6s) for a thread requesting image tiles.

Screenshot 2021-12-15 at 12 49 12

Additional context
A few things could help:

  • Reduce the maximum number of tile request threads
  • Reduce synchronization when creating readers
  • Limit the number of readers Bio-Formats can create, independently of the number of threads making tile requests
  • Reduce the calls to isThisType() when creating a reader (when the class of the reader can be known)
@petebankhead petebankhead added this to the v0.3.1 milestone Dec 15, 2021
@petebankhead petebankhead self-assigned this Dec 15, 2021
petebankhead added a commit to petebankhead/qupath that referenced this issue Dec 17, 2021
Aims to fix qupath#865
This takes a different approach to parallelization, managing a pool of ImageReaders with each tile-requesting thread taking the next available reader.
If there are no readers available, and the total number is less than some maximum value (based upon the number of available processors), a new reader is generated on another thread and added to the queue when ready.

This should
* avoid generating more readers than needed, with a limit separate from the number of tile requesting threads
* avoid attempting to initialize multiple readers simultaneously, which can be a bottleneck

In addition, more tests have been added.
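
The pooling approach described in this commit message could look roughly like the sketch below. This is not the code from the linked commit: `ReaderPool`, `acquire`, `release` and `createdCount` are hypothetical names, the `Supplier` stands in for opening a Bio-Formats reader, and how the maximum is derived from the processor count is left to the caller as an assumption.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical bounded reader pool; a sketch, not QuPath's implementation.
class ReaderPool<R> {

    private final BlockingQueue<R> available;
    private final Supplier<R> factory; // stand-in for slow reader initialization
    private final int maxReaders;
    private final AtomicInteger created = new AtomicInteger();

    // A single worker thread, so at most one reader is initializing at a time.
    private final ExecutorService initializer = Executors.newSingleThreadExecutor(task -> {
        Thread t = new Thread(task, "reader-init");
        t.setDaemon(true);
        return t;
    });

    ReaderPool(int maxReaders, Supplier<R> factory) {
        this.maxReaders = maxReaders;
        this.factory = factory;
        this.available = new ArrayBlockingQueue<>(maxReaders);
    }

    // Take an idle reader if there is one; otherwise request a new reader
    // (if the cap allows) and wait for whichever reader becomes free first.
    R acquire() throws InterruptedException {
        R reader = available.poll();
        if (reader != null) {
            return reader;
        }
        if (tryReserveSlot()) {
            initializer.execute(() -> {
                try {
                    available.add(factory.get());
                } catch (RuntimeException e) {
                    // Free the slot so a later call can retry. Reporting the
                    // failure to waiting callers is omitted in this sketch.
                    created.decrementAndGet();
                }
            });
        }
        return available.take();
    }

    // Return a reader obtained from acquire() so other threads can reuse it.
    void release(R reader) {
        available.offer(reader);
    }

    int createdCount() {
        return created.get();
    }

    // Atomically claim one of the maxReaders slots, if any remain.
    private boolean tryReserveSlot() {
        while (true) {
            int n = created.get();
            if (n >= maxReaders) {
                return false;
            }
            if (created.compareAndSet(n, n + 1)) {
                return true;
            }
        }
    }
}
```

A tile-requesting thread would call `acquire()`, read its tile, and call `release(reader)` in a `finally` block. One possible cap, purely as an assumption, is `Runtime.getRuntime().availableProcessors()`; the point of the design is that this limit is independent of the number of tile-requesting threads.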
@petebankhead
Member Author

Upon further investigation, memoization can greatly reduce the severity of the problem - which is probably why it hasn't generated more complaints.
