RFC: sorting for SVectors #754
I think it's better to define the sort algorithms for tuples and then just call them via dispatch for static arrays. There are a few packages that have sort for tuples (though I guess there is no package that implements bitonic sorting networks?): There is a PR to add sort for tuples in
Thanks, I totally missed the ongoing work for tuples.
In #703 (comment), @c42f was open to including StaticNumbers.jl as a dependency. So my guess is that it's OK to rely on one of those packages that can sort tuples. Though ideally it'd be nice if it lived somewhere like https://github.com/JuliaCollections/SortingAlgorithms.jl, where it can be maintained by multiple people.
Thank you for the pointers, the addition of bitonic sort to There seem to be a number of approaches to sorting networks for tuples and benchmarking methods differ. Personally I'd be fine with hardcoded networks (as in JuliaLang/julia#32710), but they are even less general or maintainable. Really any simple non-allocating implementation will probably be better than what we have now though.
Yeah, we need an entry point for sorting immutables. It'd be nice if there is something like
I don't see why we couldn't host some algorithms for sorting immutable containers here (for now, and switch over to
Personally, I kind of like this approach - thanks @stev47.
It would be really good to support StaticVector instead of SVector.
I'm not sure what's causing allocations on large vectors. It might be best to put in a size cut-off, like defalg(a::StaticArray) = length(a) < 13 ? BitonicSort : SomeOtherSort, especially if bitonic sort isn't O(N log N) - we have a variety of cutoffs for matrix-matrix multiplication, for example. (We do get the occasional person using StaticArrays for very large arrays, even if only experimentally, and it's nice not to explode the compiler in these cases.)
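A minimal sketch of how such a size cut-off could be expressed, written against plain tuples so it runs without StaticArrays. The names `BitonicSortAlg`, `BitonicSort`, `BITONIC_CUTOFF` and `pick_alg` are hypothetical, the threshold 13 is simply the number from the comment and would need benchmarking, and `InsertionSort` only stands in for "some other sort":

```julia
using Base.Sort: Algorithm, InsertionSort

# Hypothetical marker type for a bitonic sorting network (not part of Base).
struct BitonicSortAlg <: Algorithm end
const BitonicSort = BitonicSortAlg()

# Assumed threshold, taken from the comment; a real value needs benchmarks.
const BITONIC_CUTOFF = 13

# Pick the algorithm from the container length alone. For tuples (and
# static arrays) the length is known at compile time, so this branch can
# be resolved by the compiler.
pick_alg(n::Integer) = n < BITONIC_CUTOFF ? BitonicSort : InsertionSort
pick_alg(t::Tuple) = pick_alg(length(t))
```

With this, `pick_alg((3, 1, 2))` selects the network, while a 100-element tuple falls back to the generic algorithm and avoids generating a very large unrolled network.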
Yes, I do agree with that. I feel like I'd be happy to add dependencies only if they provide some pretty core value proposition. I'm not sure sorting does so, but I'd be happy to host an implementation. This one looks quite neat at a quick glance.
I'm not against StaticArrays maintaining its own sort algorithm. The benefit I was considering was rather the other direction. The sort algorithm StaticArrays is going to use is likely to be one of the most well-optimized sorting algorithms for small containers. It'd be nice to be able to use it for other small immutable containers. But code duplication is not always bad. It's probably not so hard to copy code from StaticArrays and use it for tuples.
Thank you all for the valuable feedback. I updated the pull request with your recommendations and used an internal tuple interface as @tkf suggested so it can easily be reused in other code.
Pushed a new update.
Looks great, thanks!
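As a rough illustration of what an internal tuple interface can look like (not the code from this pull request): the algorithm is written once for `NTuple`, and the container method only unwraps, delegates and rewraps. `TinyVec` is a hypothetical stand-in for a static vector and `sort_tuple` is a hypothetical name; insertion sort is used only to keep the sketch short.

```julia
# Hypothetical stand-in for a static vector: a thin wrapper around an NTuple.
struct TinyVec{N,T}
    data::NTuple{N,T}
end

# Tuple-level entry point. Tuples are immutable, so every swap builds a new
# tuple with Base.setindex and rebinds `t`.
function sort_tuple(t::NTuple{N,T}; lt=isless) where {N,T}
    for i in 2:N
        j = i
        # Move element j left until it is no longer smaller than its neighbour.
        while j > 1 && lt(t[j], t[j-1])
            a, b = t[j-1], t[j]
            t = Base.setindex(t, b, j - 1)
            t = Base.setindex(t, a, j)
            j -= 1
        end
    end
    return t
end

# The container method contains no sorting logic of its own.
Base.sort(v::TinyVec; lt=isless) = TinyVec(sort_tuple(v.data; lt=lt))
```

Any other small immutable container can reuse `sort_tuple` the same way, which is the reuse argument made above.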
I did a little performance microbenchmarking. The version for
One possible reason is that the integer version seems to compile down to a lot of
As in my initial post, you may want to try
Oh of course, that should teach me to reach straight for
Haha. Damn IEEE ordering. Sometimes I wonder if we should just give up and have
the julia intrinsic for
    #define fpislt_n(c_type, nbits) \
        static inline int fpislt##nbits(c_type a, c_type b) JL_NOTSAFEPOINT \
        { \
            bits##nbits ua, ub; \
            ua.f = a; \
            ub.f = b; \
            if (!isnan(a) && isnan(b)) \
                return 1; \
            if (isnan(a) || isnan(b)) \
                return 0; \
            if (ua.d >= 0 && ua.d < ub.d) \
                return 1; \
            if (ua.d < 0 && ua.ud > ub.ud) \
                return 1; \
            return 0; \
        }
so I guess it is both. You can also compare
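The ordering encoded by that C code can be checked directly against `isless` and `<` in Julia. This is only a small demonstration of the documented semantics of the two predicates, not a benchmark:

```julia
# isless is a total order on floats: -0.0 sorts before 0.0 and NaN sorts last.
@assert isless(-0.0, 0.0)
@assert isless(1.0, NaN)
@assert !isless(NaN, 1.0)
@assert !isless(NaN, NaN)

# < follows IEEE comparison: the zeros compare equal and any comparison
# involving NaN is false.
@assert !(-0.0 < 0.0)
@assert !(1.0 < NaN)
@assert !(NaN < 1.0)

# The default sort uses isless, so the result is fully determined even with
# NaN and signed zeros in the input.
sorted = sort([NaN, 1.0, 0.0, -0.0])
@assert isequal(sorted, [-0.0, 0.0, 1.0, NaN])
```

The extra NaN and sign handling is what makes `isless` more expensive than `<` in a compare-exchange step, which is why passing `lt=<` can help when the data is known to contain no NaN.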
I didn't find allocation-free, fast sorting for small SVectors, so I implemented bitonic sorting networks, which I'd like to contribute back. Sorting networks perform a fixed, data-independent sequence of compare-exchange operations and thus generally play well with modern processor magic (branch prediction, speculative execution, ...). They also allow for parallel execution.
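For readers unfamiliar with the technique, here is a minimal sketch of a bitonic sorting network on tuples. It is not the implementation from this pull request: it uses the textbook iterative formulation, only handles power-of-two lengths, and the names `compare_exchange` and `bitonic_sort` are made up for illustration.

```julia
# Order positions i and j (1-based) so that t[i] comes before t[j] under `lt`.
# Tuples are immutable, so a new tuple is returned.
function compare_exchange(t::NTuple{N,T}, i::Int, j::Int, lt) where {N,T}
    a, b = t[i], t[j]
    lt(b, a) || return t
    t = Base.setindex(t, b, i)
    return Base.setindex(t, a, j)
end

function bitonic_sort(t::NTuple{N,T}; lt=isless) where {N,T}
    ispow2(N) || throw(ArgumentError("this sketch only handles power-of-two lengths"))
    k = 2                      # size of the bitonic blocks being merged
    while k <= N
        j = k >> 1             # distance between compared elements
        while j >= 1
            for i in 0:N-1     # 0-based index, as in the usual formulation
                l = xor(i, j)
                if l > i
                    # Blocks alternate between ascending and descending order;
                    # the last stage (k == N) is ascending throughout.
                    t = (i & k == 0) ? compare_exchange(t, i + 1, l + 1, lt) :
                                       compare_exchange(t, l + 1, i + 1, lt)
                end
            end
            j >>= 1
        end
        k <<= 1
    end
    return t
end
```

Which pairs get compared depends only on `N`, never on the values, so for a fixed length the loops describe a static network that can in principle be unrolled completely.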
Implementation notes:
sort(SVector()) seems to consistently outperform the allocation-free sort!(MVector())
Questions:
Float64, 20 for Int), can somebody explain this? The generated code is completely static but @code_llvm shows unrelated allocations.