Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

multiple way of computing some label measurement #340

Closed
StRigaud opened this issue Sep 17, 2024 · 5 comments
Closed

multiple way of computing some label measurement #340

StRigaud opened this issue Sep 17, 2024 · 5 comments

Comments

@StRigaud
Copy link
Member

Is it normal that some label measurement (e.g. centroid of label) are computed in different ways depending on which function is used?

User can get this information from the statistics of label approach and from the centroids of labels which rely on different code for computing the coordinates.

Mostlikely that the results is the same (hopefully) but this increase code redondance. Wouldn't it be better to rely all on the statistics function computation? We may loose a bit of speed because we would also compute other measurement not requested but it would make more sense i guess.

Asking this for the pyclesperanto implementation development

@haesleinhuepf
Copy link
Member

Wouldn't it be better to rely all on the statistics function computation?

Can you give an insight performance-wise? I could imagine that the one function is much slower than the other.

@StRigaud
Copy link
Member Author

StRigaud commented Sep 17, 2024

Idk if slower or not tbh, but we can easily imagied that statistics is heavier however it would make a lot of sense to rely on it for centroid computation later used for reduce_labels_to_centroids for example, instead of coming with a second way to compute centroids (which is what is done in the prototype).

very dirty speed test using the prototype on a label blob:

195 ms ± 36.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each) // calling statistics
34.8 ms ± 2.06 ms per loop (mean ± std. dev. of 7 runs, 10 loops each) // calling the centroids of labels

We would clearly loose some speed (maybe a bit less in C++ but not certain). But we would consolidate the code and make less stuff for me to develop ...

Edit: Same test but using pyclesperanto with C++, the speed loss is much less

64.3 ms ± 831 μs per loop (mean ± std. dev. of 7 runs, 10 loops each) // calling statistics

I am on my M2 (not sure its relevant)

@thawn
Copy link
Collaborator

thawn commented Sep 18, 2024

How about putting the algorithm that currently calculates the centroid from statistics of label into centroids of label and then calling centroids of label from inside statistics of label? that way, we would have identical results without speed problems?

@haesleinhuepf
Copy link
Member

I think the algorithms are strongly related anyway. @StRigaud you should decide what's most important: Code length or speed/performance. Both are reasonable development goals, and you cannot achieve both commonly.

@StRigaud
Copy link
Member Author

How about putting the algorithm that currently calculates the centroid from statistics of label into centroids of label and then calling centroids of label from inside statistics of label?

That's a way but:

  • it will not speed up the statistics operation as the centroids are computed at the same time as other values on the label
  • it requires to update the statistics function which is quite complex tbh
  • it does not reduce the amount of development I have to do 😭

you should decide what's most important

I am aiming on implementing the missing function for the assistant. So I will focus on making things work, hence reuse the statistics code as it is simple for me right now (no too much extra code and tests, no modification of the statistics function which is complicate, etc.). Nothing stops me later on (e.g. if an issue is raised) to optimise it back (most likely the way @thawn proposed it)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants