Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

A mathematical morphology framework for Imglib2. #2

Merged
merged 25 commits into from
Mar 6, 2015

Conversation

tinevez
Copy link
Contributor

@tinevez tinevez commented Jan 23, 2015

This PR supersedes #1.

This PR ships a small mathematical morphology framework for Imglib2.

Rather than documenting what it does and how to use it, I put documentation and benchmarks on Imglib2 wiki.

Though mathematical morphology extends far beyond what is contained in this PR, I wish to subject it to the team scrutiny early.

For instance, this PR contains several calls to deprecated classes, I did not know how to workaround.

Derived from the Rectangle Shape, but that can be non-square.
Signed-off-by: Jean-Yves Tinevez <jean-yves.tinevez@pasteur.fr>
Signed-off-by: Jean-Yves Tinevez <jean-yves.tinevez@pasteur.fr>
Signed-off-by: Jean-Yves Tinevez <jean-yves.tinevez@pasteur.fr>
Shape implementation for neighborhoods that are lines parallel to one of
the image axis.
Signed-off-by: Jean-Yves Tinevez <jean-yves.tinevez@pasteur.fr>
as defined by Jones and Soilles. Periodic lines: Definition, cascades,
and application to granulometries. Pattern Recognition Letters (1996)
vol. 17 (10) pp. 1057-1063
Signed-off-by: Jean-Yves Tinevez <jean-yves.tinevez@pasteur.fr>
Signed-off-by: Jean-Yves Tinevez <jean-yves.tinevez@pasteur.fr>
Morphological operations greatly benefit from structuring elements
decomposition. This class provides utility methods that generates
classical structuring elements as a List<Shape>. Whenever possible, the
list is made of a decomposition to speed up morphological operations.
Signed-off-by: Jean-Yves Tinevez <jean-yves.tinevez@pasteur.fr>
Signed-off-by: Jean-Yves Tinevez <jean-yves.tinevez@pasteur.fr>
Signed-off-by: Jean-Yves Tinevez <jean-yves.tinevez@pasteur.fr>
With a JUnit test.
Signed-off-by: Jean-Yves Tinevez <jean-yves.tinevez@pasteur.fr>
Signed-off-by: Jean-Yves Tinevez <jean-yves.tinevez@pasteur.fr>
Signed-off-by: Jean-Yves Tinevez <jean-yves.tinevez@pasteur.fr>
Signed-off-by: Jean-Yves Tinevez <jean-yves.tinevez@pasteur.fr>
Signed-off-by: Jean-Yves Tinevez <jean-yves.tinevez@pasteur.fr>
Signed-off-by: Jean-Yves Tinevez <jean-yves.tinevez@pasteur.fr>
@ctrueden
Copy link
Member

ctrueden commented Feb 4, 2015

From my perspective, this PR is awesome and should be merged ASAP. It is well structured, provides many new features and lots of corresponding unit tests. Since the imglib2-algorithm component is still in beta in general, it seems like merging early makes a lot of sense.

@tpietzsch Do you agree?

@tinevez
Copy link
Contributor Author

tinevez commented Feb 5, 2015

Thanks @ctrueden! But arrgh you edited your comment!
If you point me to where the design could be improved, you would give me a chance to improve, which might be a good investment for the project ;)

@tinevez tinevez closed this Feb 5, 2015
@tinevez
Copy link
Contributor Author

tinevez commented Feb 5, 2015

Wrong button sorry.

@tinevez tinevez reopened this Feb 5, 2015
@ctrueden
Copy link
Member

ctrueden commented Feb 6, 2015

arrgh you edited your comment!

I just wanted to clarify that I didn't see any problems with the PR. I was afraid that my previous wording somehow implied that I did.

That said, I did not scrutinize the code in great detail. I thought probably @tpietzsch would be the best person to approve the design.

@tpietzsch
Copy link
Member

I reviewed the code today. Looks great. I will comment in detail tomorrow

@tpietzsch
Copy link
Member

@tinevez Fantastic work! I have no problem to merge this as-is. I have gone through the code pretty thoroughly and I have some suggestions for improving it, respectively consolidating with existing stuff.
However, these can be done before or after merging this into master. In some cases it definitely makes sense to postpone until after the merge. I will just list my comments below and then you can decide whether and which you want to tackle before we merge. We can make issues of the remaining ones.

  1. Static inner classes CenteredRectangleShape.NeighborhoodsIterableInterval and CenteredRectangleShape.NeighborhoodsAccessible are exactly the same as in RectangleShape, so just use these. (I would suggest to address this before the merge.)
  2. RectangleShape and CenteredRectangleShape should be unified and support for non-symmetric neighborhood should be added. This can all be the same class with a bunch of more constructors. (I would suggest to postpone this to after the merge.)
  3. Neighborhoods should all move to a common place. It makes no sense to have some live in net.imglib2.algorithm.region.localneighborhood and some of them in net.imglib2.algorithm.morphology.neighborhoods. How about moving all of them to net.imglib2.algorithm.neighborhood?
  4. The xxxNeighborhoodLocalizableSampler classes share a lot of code. There is potential for unification there. We could make xxxNeighborhoodFactory implementations non-static and have them contain all the parameters required to construct the neighborhoods. Then there could be only one NeighborhoodFactory and one generic typed NeighborhoodLocalizableSampler that is used everywhere. (I would suggest to postpone this to after the merge.)
  5. The xxxShape.NeighborhoodsAccessible.randomAccess( final Interval interval ) should make an effort to compute the interval of the source that is required (i.e., output interval plus border depending on the strel bounding box) and request a source RandomAccess for that required interval. This will allow Views to make better choices of whether it can strip away extensions in transform chains. Also in some cases it is required to actually enforce the extension for extended-and-then-cropped views. See javadoc added in imglib/imglib2@3569f58 for additional explanation. (I would suggest to postpone this until after (4) has been addressed because all required logic is in the RectangleNeighborhoodLocalizableSampler already.)
  6. In Dilation and Erosion: Note, that it is not necessary to use neighborhoodsRandomAccessibleSafe() for the multi-threaded case. This is a misconception in what safe and unsafe mean for neighborhood-accessibles: Also for unsafe neighborhood-accessibles, all RandomAccess<Neighborhood> obtained from the accessible are independent of each other and can safely be used in multithreaded code. The "unsafe-ness" is one level deeper, on the Neighborhood that you get() from the RandomAccess<Neighborhood>. For the unsafe case, Neighborhood.cursor() will always get you the same cursor, so the following is a problem:
    final Cursor< T > ncfirst = neighborhood.cursor();
    ncfirst.fwd();
    final Cursor< T > nc = neighborhood.cursor();
    while ( nc.hasNext() )
    {
        if ( nc.next().compareTo( ncfirst.get() ) < 0 )
        {

        }
    }

Here, you try to use ncfirst as a cursor that stays positioned at the first element of the neighborhood, while nc is a cursor that traverses the neighborhood. For neighborhoods obtained from an unsafe neighborhood-accessible, this is going to go wrong because ncfirst and nc are the same cursor. However cursors obtained from different RandomAccess<Neighborhood> will always be independent, in the safe as well as the unsafe case. You're not doing the problematic thing. So, in dilate() etc it is perfectly fine to use unsafe accessibles also in the multithreaded case. (I would suggest to address this before the merge.)

  1. In Opening and Closing: It would be nice if no intermediate image would need to be allocated, but probably this is not possible because the second operation (dilate-after-erode resp erode-after-dilate) needs pixels beyond the image boundary that the first operation generated? I assume, you thought this through already...
  2. Should Thresholder be in the morphology package? Maybe it should rather move to separate package net.imglib2.algorithm.threshold?
  3. For MorphologyUtils some of the methods near-duplicate existing functionality or might be split out to more general places:
    • createVariable(). There is something similar in net.imglib2.util.Util.getTypeFromInterval().
    • getSuitableFactory(). Util has getArrayOrCellImgFactory() which is similar. Maybe getSuitableFactory() should move to Util as well.
    • copy(), copy2(), copyCropped(). These are of more general use and need not be hidden in MorphologyUtils.
    • size() we have net.imglib2.util.Intervals.numElements() for this already.
  4. Fix thread names in MorphologyUtils.copy() and copy2(). (I would suggest to address this before the merge.)

Finally, a general remark: Maybe the concept of Shape not having a dimensionality was a bit misguided. At the hackathon, with Christian Dietz I already discussed the distinction between Region and Shape, and that maybe this distinction should go away and every Region can besides being an actual Region also be a "Region-Template", so more or less what shape is intented to be. A bit of awkwardness would go away (such as the need for MorphologyUtils.getNeighborhood() if Shape would have a fixed dimensionality). However, it is also useful to have something like a general arbitrary-dimension hypersphere shape. I'm not sure what the best way is. Suggestions are welcome.

@dietzc
Copy link
Member

dietzc commented Feb 8, 2015

also some comments from my side:

In tinevez@9488d66 for example, you pass the number of threads for multi-threading. It would be nicer if you could pass a ExecutorService instead of the number of threads and creating threads on your own. Using an ExecutorService allows the application which actually calls the algorithm to manage the creation of the threads. However, you can introduce helper methods which create an ExecutorService if a user really wants to pass the number of threads (see https://github.com/imglib/imglib2-algorithm/blob/master/src/main/java/net/imglib2/algorithm/gauss3/Gauss3.java for example).

Additionally, in tinevez@9488d66, the name of the threads are still "Erode XY", even if the concept is more general. Maybe you want to change the names of the created threads?

Concerning the "package" names/location of neighborhoods (@tpietzsch, @tinevez):
What about moving Neighborhoods & Co to imglib2-roi or is it too early?

@tinevez
Copy link
Contributor Author

tinevez commented Feb 8, 2015

Hi all

Thank you very much for the time you spent reviewing this. I hope that
where you live the weather on this Sunday is extremely hostile.

1. Static inner classes

CenteredRectangleShape.NeighborhoodsIterableInterval and
CenteredRectangleShape.NeighborhoodsAccessible are exactly the same as in
RectangleShape, so just use these. (I would suggest to address this
before the merge.)

Will do

*2. *RectangleShape and CenteredRectangleShape should be unified and

support for non-symmetric neighborhood should be added. This can all be the
same class with a bunch of more constructors. (I would suggest to postpone
this to after the merge.)

Will do.

*3. *Neighborhoods should all move to a common place. It makes no sense to

have some live in net.imglib2.algorithm.region.localneighborhood and some
of them in net.imglib2.algorithm.morphology.neighborhoods. How about
moving all of them to net.imglib2.algorithm.neighborhood?

Agreed. I will do this *before *merge if this is fine.
My rationale was that I thought people would have little use for a
Neighborhood that iterate through two points, and would instead access
structuring elements through the StructuringElements class. In practice, it
is not up to me to decide so let's move them all to neighborhood. The
'region' subpackage is not useful, let us remove it.

*4. *The xxxNeighborhoodLocalizableSampler classes share a lot of code.
There is potential for unification there. We could make
xxxNeighborhoodFactory implementations non-static and have them contain
all the parameters required to construct the neighborhoods. Then there
could be only one NeighborhoodFactory and one generic typed
NeighborhoodLocalizableSampler that is used everywhere. (I would suggest
to postpone this to after the merge.)

Will do.
Most likely it will take me some time.

*5. *The xxxShape.NeighborhoodsAccessible.randomAccess( final Interval
interval ) should make an effort to compute the interval of the source
that is required (i.e., output interval plus border depending on the strel
bounding box) and request a source RandomAccess for that required
interval. This will allow Views to make better choices of whether it can
strip away extensions in transform chains. Also in some cases it is
required to actually enforce the extension for extended-and-then-cropped
views. See javadoc added in imglib/imglib2@3569f58
imglib/imglib2@3569f58
for additional explanation. (I would suggest to postpone this until after
(4) has been addressed because all required logic is in the
RectangleNeighborhoodLocalizableSampler already.)

Will do.

*6. *In Dilation and Erosion: Note, that it is not necessary to use

neighborhoodsRandomAccessibleSafe() for the multi-threaded case. This is
a misconception in what safe and unsafe mean for neighborhood-accessibles:
Also for unsafe neighborhood-accessibles, all RandomAccess
obtained from the accessible are independent of each other and can safely
be used in multithreaded code. The "unsafe-ness" is one level deeper, on
the Neighborhood that you get() from the RandomAccess. For
the unsafe case, Neighborhood.cursor() will always get you the same
cursor, so the following is a problem:

final Cursor< T > ncfirst = neighborhood.cursor();
ncfirst.fwd();
final Cursor< T > nc = neighborhood.cursor();
while ( nc.hasNext() )
{
    if ( nc.next().compareTo( ncfirst.get() ) < 0 )
    {

    }
}

Here, you try to use ncfirst as a cursor that stays positioned at the
first element of the neighborhood, while nc is a cursor that traverses
the neighborhood. For neighborhoods obtained from an unsafe
neighborhood-accessible, this is going to go wrong because ncfirst and nc
are the same cursor. However cursors obtained from different
RandomAccess will always be independent, in the safe as
well as the unsafe case. You're not doing the problematic thing. So, in
dilate() etc it is perfectly fine to use unsafe accessibles also in the
multithreaded case. (I would suggest to address this before the merge.)

Understood, will do.

*7. *In Opening and Closing: It would be nice if no intermediate image
would need to be allocated, but probably this is not possible because the
second operation (dilate-after-erode resp erode-after-dilate) needs pixels
beyond the image boundary that the first operation generated? I assume, you
thought this through already...

Yes, there is two cases for despair here.

The second operation needs to access the pixels beyond the source image.
Plus, the decomposition of a structuring element (strel) in several ones
does NOT WORK (gives a different results if you do not decompose) if you do
not carry these extended pixels as you iterate through the decomposition.
This problem made me withdraw the PR a first time. This one offers the only
solution I could find.

*8. *Should Thresholder be in the morphology package? Maybe it should
rather move to separate package net.imglib2.algorithm.threshold?

Boarf is it that useful? Please tell me.

    1. *For MorphologyUtils some of the methods near-duplicate existing
      functionality or might be split out to more general places:
    • createVariable(). There is something similar in
      net.imglib2.util.Util.getTypeFromInterval().

Will do.

  • getSuitableFactory(). Util has getArrayOrCellImgFactory() which is
    similar. Maybe getSuitableFactory() should move to Util as well.

I needed something that could handle non-NativeType. If you are fine, I
will move to Util. The problem is that Util belongs to imglib2 core.

  • copy(), copy2(), copyCropped(). These are of more general use and
    need not be hidden in MorphologyUtils.

They are so ugly that I thought I would hide them. I needed the different
kind of copy flavors because sometimes the target is an IterableIterval and
the source a RandomAccessible, or the converse, and so on. The same for the
sub* methods.

  • size() we have net.imglib2.util.Intervals.numElements() for this
    already.

Will do.

*10. *Fix thread names in MorphologyUtils.copy() and copy2(). (I would

suggest to address this before the merge.)

Will do.

Finally, a general remark: Maybe the concept of Shape not having a
dimensionality was a bit misguided. At the hackathon, with Christian Dietz
I already discussed the distinction between Region and Shape, and that
maybe this distinction should go away and every Region can besides being an
actual Region also be a "Region-Template", so more or less what shape is
intented to be. A bit of awkwardness would go away (such as the need for
MorphologyUtils.getNeighborhood() if Shape would have a fixed
dimensionality). However, it is also useful to have something like a
general arbitrary-dimension hypersphere shape. I'm not sure what the best
way is. Suggestions are welcome.

I am in favor of unifying Region and Shape, and give a dimensionality in
Shape.
This is less cool than the actual Shape interface, because, yes, an
arbitrary-dimension shape is Imglib2-esque. But in practice, we have
little data structures in Imglib2 that do not have a numDimensions() method.

Ok, so thank you again, I now have my TODO list done. And I will summarize
it below.

@tinevez
Copy link
Contributor Author

tinevez commented Feb 8, 2015

TODO before merge:

  • Use the same static inner classes in RectangleShape and CenteredRectangleShape.
  • Move all neighborhoods to the net.imglib2.algorithm.neighborhood. (Possibly later: move them all to the imglib2-roi repo).
  • Fix the inadequate use of neighborhoodsRandomAccessibleSafe() for multithreading in the Erosion and Dilation classes.
  • Discuss and decide on the usefulness of a package like net.imgbli2.algorithm.binary that would ship classes useful for the generation or manipulation of BooleanType images.
  • Revamp the MorphologyUtils by relying more on some methods in the Util or 'Intervals` classes.
  • Fix thread names in the copy() and copy2() methods of MorphologyUtils class.
  • Fix thread names in the Thresholder class.

@tinevez
Copy link
Contributor Author

tinevez commented Feb 8, 2015

@dietzc : About the ExecutorService.
I am not convinced if this is the 100% right solution.
I need to know the number of workers the service ships. Of I cannot access it, it is not useful. In the Gauss3 example you cite, it links to SeparableSymmetricConvolution that indeed takes an ExecutorService, but then relies on Runtime.getRuntime().availableProcessors() to get this number. Check https://github.com/imglib/imglib2-algorithm/blob/master/src/main/java/net/imglib2/algorithm/gauss3/SeparableSymmetricConvolution.java#L267

This is the reason why I stick to SimpleMultithreading. But I understand that it cannot last forever....

@dietzc
Copy link
Member

dietzc commented Feb 8, 2015

Have a closer look at the method you are mentoining. This method doesn't use the number of threads at all (just to determine the number of tasks). The ExecutorService is used to create threads and therefore the thread creation can be controlled by the method caller (if desired). SimpleMultithreading does not allow at all that the algorithm caller controls how threads are generated (or from which pool). For example KNIME wants to control the number of and execution of threads on its own, by implementing their own ExecutorService. There are two solutions from my point of view: Use the ExecutorService pattern as in Gauss3 or https://github.com/imglib/imglib2-algorithm/blob/master/src/main/java/net/imglib2/algorithm/dog/DifferenceOfGaussian.java or you use a ThreadService. The latter is not possible, as imglib2-algorithm has no dependency to SciJava (yet?).

About number of threads: actually you only need this to determine an "optimal" number of tasks. The number of total threads is controlled by the ExecutorService.

Signed-off-by: Jean-Yves Tinevez <jeanyves.tinevez@gmail.com>
Signed-off-by: Jean-Yves Tinevez <jeanyves.tinevez@gmail.com>
Signed-off-by: Jean-Yves Tinevez <jeanyves.tinevez@gmail.com>
…ngleShape.

Signed-off-by: Jean-Yves Tinevez <jeanyves.tinevez@gmail.com>
Signed-off-by: Jean-Yves Tinevez <jeanyves.tinevez@gmail.com>
Signed-off-by: Jean-Yves Tinevez <jeanyves.tinevez@gmail.com>
Signed-off-by: Jean-Yves Tinevez <jeanyves.tinevez@gmail.com>
@ctrueden
Copy link
Member

In which situations does the numThreads = Runtime.getRuntime().availableProcessors() heuristic fail?

It fails as soon as your machine is significantly multi-tasking, I think. It is also too greedy, if you know you want to perform two simultaneous tasks: the first task will take 4/4 processors, and the second task will claim only 1 (since that value is never less than 1), for a total of 5/4 processors allocated to the tasks.

It seems crystal clear that we need a way for calling code to explicitly control the multithreading details. At bare minimum, we need the power to say "use X threads", but that alone is still not ideal.

The advice in the javadoc is: "This value may change during a particular invocation of the virtual machine. Applications that are sensitive to the number of available processors should therefore occasionally poll this property and adjust their resource usage appropriately." That approach would mitigate the greedy problem mentioned above. But my intuition is still that it is too limited in scope.

@dietzc
Copy link
Member

dietzc commented Feb 10, 2015

Sure, but which part should be handled by the ExecutorService and which part by the "job-submitter"? I think my problem is, that the "submitter" will never be smarter than the ExecutorService and therefore the "number of tasks" (not threads) only depends on the input data.

@ctrueden
Copy link
Member

I guess it depends how obsessive we want to be about optimizing performance. I have not thought about this problem seriously for hours, so my perspective may be naive. But my current intuition is that it would be ideal to be able to adjust the number of workers for a task after that task has already been launched. Otherwise, when one task completes, its freed threads (freed processors, really) will become underutilized.

If we don't want to worry about that for now, and only want to worry about an API that launches tasks with good numbers of threads, then the ExecutorService approach seems pretty flexible to me. But I feel like I'm the wrong person to decide that—you guys have to seriously try using it and see if/how it falls short.

@tpietzsch
Copy link
Member

@tinevez @ctrueden @dietzc is everybody ok with merging this now?

@tinevez
Copy link
Contributor Author

tinevez commented Feb 23, 2015

@tpietzsch Apart from the discussion on multithreading, it's up to you:
I incorporated the changes you requested pre-merge, and there are still the post-merge changes to make.
I deprecated some old neighborhood, and importantly, I move the Shape interface to a common, sensible, package. This might break things out of this repo, so your opinion on e1460f1 would be good,

@dietzc
Copy link
Member

dietzc commented Feb 23, 2015

I would love to see the ability to set the ExecutorService as in Gauss3, FindMaximum etc.

@tinevez
Copy link
Contributor Author

tinevez commented Feb 23, 2015

@dietzc Could we discuss this in details elsewhere?
I'; loosing hair over the ExecutorService thinige now.

@dietzc
Copy link
Member

dietzc commented Feb 23, 2015

Skype tomorrow?

@tinevez
Copy link
Contributor Author

tinevez commented Feb 23, 2015

@dietzc I need to prepare my rant. Can I confirm later?

@dietzc
Copy link
Member

dietzc commented Feb 23, 2015

Sure.

@dietzc
Copy link
Member

dietzc commented Mar 6, 2015

@tinevez and me agreed that we should merge if this PR if everything beside the multi-threading is fine. I will open a PR with my suggestions afterwards and we can have a dedicated discussion in this PR then. @tpietzsch sounds good?

@tpietzsch
Copy link
Member

@dietzc Sounds good. @tinevez Could you merge or rebase on master such that I can merge the pull request?

@tinevez
Copy link
Contributor Author

tinevez commented Mar 6, 2015

@tpietzsch done.

tpietzsch added a commit that referenced this pull request Mar 6, 2015
A mathematical morphology framework for Imglib2.
@tpietzsch tpietzsch merged commit d476db1 into imglib:master Mar 6, 2015
@tpietzsch
Copy link
Member

@tinevez Thanks a lot! This is really cool stuff.

@dietzc
Copy link
Member

dietzc commented Mar 6, 2015

great. should we release imglib2, imglib2-algorithm & imglib2-roi?

@tinevez
Copy link
Contributor Author

tinevez commented Mar 6, 2015

Thanks to you all, it was a fantastic trip.

@tinevez tinevez deleted the morphology-clean branch March 6, 2015 15:37
@tinevez tinevez mentioned this pull request Mar 6, 2015
3 tasks
@tpietzsch
Copy link
Member

@dietzc done. pretty much everything imglib2 was released. current versions are listed in pom-imglib2 5.2.1 (or rather in its parent pom-imagej 5.12.3)

@dietzc
Copy link
Member

dietzc commented Mar 11, 2015

Thank you! One beer on me.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants