Document map behavior on sparse data structures #44233
Bump -- @JeffBezanson @StefanKarpinski am I correct in assuming that
I don't think either of us are the ones to address that but yes, the assumption that the mapper is consistent in mapping values to zero is intended and should be documented.
Perfect; do you know who I should mention (if I should mention anyone)?
I believe @andreasnoack may be able to help at least by recommending someone who could help more.
base/abstractarray.jl
Outdated
Note that `f` makes no guarantees about the order in which `f` is called on elements. In
addition, `f` is assumed to be [pure](https://en.wikipedia.org/wiki/Pure_function). Using
an impure function together with `f` can cause bugs when working with some data structures,
e.g. sparse or diagonal arrays, as `f` will only be called once on duplicated elements.
It will be called once on the default 0, but if you have duplicates that are non-default zeros it will be called multiple times.
I've edited to say that it "may" call it only once per duplicate element, and to clarify that calling `map` with an impure function is undefined behavior.
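To make the call-count point in this thread concrete, here is a minimal sketch, assuming the `SparseArrays` standard library is available. The `counted` helper is hypothetical (it is not part of Base); it only wraps a function with a call counter. The sketch illustrates that `map` may invoke `f` far fewer times on a sparse vector than on the equivalent dense vector. Exact counts for the sparse case are an implementation detail, so they are printed rather than asserted.

```julia
using SparseArrays

# Hypothetical helper (not part of Base): wrap `f` so every call is counted.
function counted(f)
    calls = Ref(0)
    g = x -> (calls[] += 1; f(x))
    return g, calls
end

# A length-1000 vector with three stored entries, plus its dense copy.
sparse_v = sparsevec([1, 5, 9], [1.0, 2.0, 3.0], 1000)
dense_v = Vector(sparse_v)

g_dense, dense_calls = counted(x -> 2x)
dense_result = map(g_dense, dense_v)

g_sparse, sparse_calls = counted(x -> 2x)
sparse_result = map(g_sparse, sparse_v)

# With a pure `f` the two results agree; only the number of calls differs.
println("results agree:          ", sparse_result == dense_result)
println("calls on dense vector:  ", dense_calls[])
println("calls on sparse vector: ", sparse_calls[])
```

If `f` had a side effect (for example, drawing a random number or appending to a log), the dense and sparse runs could observe a different number of side effects, which is the situation the docstring change warns about.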
The performance rationale behind the proposal of @ParadaCarleton is clear. However, when we accept it we should update docstrings of In general the issue is that users might have assumed that it is OK to use these functions with a non-pure argument. E.g.
and generator equivalent for should not assume purity of
While I think I'd agree we should have some alternative to I agree that we should avoid making any changes to behavior for now to avoid breaking old code, though.
100% agreed. The only problem is exactly what you have commented on later - old code might be affected (fortunately we are not changing the implementation of anything - only the contract). As you know in DataFrames.jl we identified this issue because users reported to us problems when they used non-pure functions with
This pull request clarifies that `map`'s behavior assumes purity when working with sparse data structures, and that such behavior is not considered a bug, as discussed in this thread. If the core devs clarify that this behavior is intentional and `map` requires purity, I can also add an `apply` function which iterates through a given collection and applies a function element-by-element.
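As a rough illustration of what such an `apply` function could look like, here is a minimal sketch. The name comes from the description above, but the signature, return type, and implementation are assumptions; the pull request does not specify them. The sketch uses an explicit loop over the collection so that the function is called once per element, including implicit zeros of a sparse vector.

```julia
using SparseArrays

# Hypothetical sketch of `apply`: visit every element in iteration order and
# call `f` on each one, making no purity assumption about `f`.
function apply(f, itr)
    results = Any[]
    for x in itr
        push!(results, f(x))
    end
    return results
end

v = sparsevec([1, 5, 9], [1.0, 2.0, 3.0], 1000)

# An impure function: it mutates a counter on every call.
calls = Ref(0)
results = apply(x -> (calls[] += 1; 2x), v)

println("number of calls: ", calls[])
println("length of input: ", length(v))
```

The trade-off is that this gives up the sparsity-based shortcut: the cost scales with the full length of the collection rather than with the number of stored entries.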