Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add array order processes #65
feat: add array order processes #65
Changes from 11 commits
3e8ff59
caf5e1e
e451a52
10af7e7
9b4fb2d
cf947f4
7595e61
5266066
388c62a
c1bc839
aebf9ca
5f72ef0
e907521
19123cf
43c62cf
c8c07a3
647bb66
8b36d14
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This only works with numerical types, but will fail if trying to sort string or datetime arrays!
Why not just use
np.flip
on the found indices?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had used flip before, but switched to this approach because of the following side effect:
Using the example in the documentation, order(data = [6,-1,2,null,7,4,null,8,3,9,9], asc = false, nodata = false) should result in [3, 6, 9, 10, 7, 4, 0, 5, 8, 2, 1].
With np.flip our result becomes [6, 3, 10, 9, 7, 4, 0, 5, 8, 2, 1], so the values that are the same are in the opposite order. Switching these after flip was used does not seem too convenient to me, what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, thanks for this - hmm, what we could do would be to count the number of nans before, only flip the array[index_last_nan:] and stitch together again? Then we should have strings/datetimes covered too!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think what you are referring to would result in an array like [3, 6, 10, 9, 7, 4, 0, 5, 8, 2, 1], so the nan values would be handled. If there are values in the data twice, (like the "9" in the example), the order of these is also flipped. (The result in the example should be [3, 6, 9, 10, 7, 4, 0, 5, 8, 2, 1])
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm. @m-mohr what do you think about relaxing the
Ties will be left in their original ordering
requirement in the process spec fororder
? Is this strictly necessary for some usecase? Seems like implementation on our end would be simpler if this didn't have to be guaranteed.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To get comparable and reproducible results, you can't let this undefined.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, that's fair, so we cannot drop this requirement completely - what about changing it such that ties will be in the original ordering if
asc=True
and flipped otherwise?Otherwise I currently fail to see a way to do this such that arrays of strings or datetimes can also be ordered, we'd have to throw an error in these cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@LukeWeidenwalker That's a good point, I think we can clarify that as I think that was the original intention anyway. Please open an issue in openeo-processes and I'll prepare a PR for 2.0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Opened Open-EO/openeo-processes#409
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wouldn't it make sense to also allow len(order.shape) == 1 here? From the examples in the definition https://processes.openeo.org/#rearrange this could also be valid and I think it makes sense to always take the same element of an array over a specific dimension... this could be handle with np.take automatically :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm - the spec seems ambiguous about that to me, since all the examples have an equal number of axes. We could add extra axes and broadcast the
order
array such that it fits thedata
array. I agree that it's a possible enhancement, but adding this would need more tests for higher-dimensional arrays. I think we can simplify our lives here and just throw an error saying that this isn't supported until someone actually needs exactly this?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regardless of what type the elements are, the array is just re-arranged according to the order on the first level.
So if for example you have
rearrange(data = [[2,3], [4,3]], order=[1,0])
, you'd get[[4,3], [2,3]]
. Hope that helps.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this comment - so the
order
array is always expected to only have a single axis?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, indeed. It is of type
array<integer>
so can't be multi-dimensional.