Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Getitems: support
meta_array
#1131Getitems: support
meta_array
#1131Changes from 10 commits
c268c63
c7023f9
89fa599
f05ee3a
0eed377
e578051
03c97a8
8ba463c
dfa731b
8e753d6
f05edb1
1d2b6ea
f87cb60
eea7466
05be1d4
af54f7e
5513d6f
ad46ccc
40ecba0
0f224d4
61d7a03
81549f5
2bfe68a
4e9c39e
a250f64
02fc80d
61981f2
d0afcde
7e01831
d9838ef
c3ee95f
ec5f396
a1d3520
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Know fsspec-based storage layers are using a
dict
return value, but am wondering if we should be doing something different (like returning an iterable). Asking since this would make the read blocking vs. a bit more lazy. The latter can be useful when working with operations that take a bit longer (like reading from the cloud or parallelizing several reads)cc @martindurant (who also may have thoughts)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The point of it being blocking, is that many keys may be being fetched concurrently. If you make it an iterator, you lose that, unless you have something that can wait on an async iterator, which in turn has first-completed working.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with both of you.
The API specifies a return type of
Mapping[str, Any]
thus it is possible to return a lazy mapping that only readsself[k]
when accessed. But let's do that in a follow up PR?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Understood, and will be interested in how that looks.
By the way, I mentioned elsewhere the possibility of passing key-specific metadata to the get function; that would happen in this same signature. I wonder if you have any use for an array where only some of it is destined for the GPU (perhaps because that data already exists there and doesn't need loading at all).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you have an example of what they would look like, @martindurant?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As noted above, think this discussion gets hairy enough it should be broken out into an issue (likely two) and discussed separately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we want to go the
contexts
route, we don't need ameta_array
argumentThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this also imply that
getitem()
should have acontexts
parameter?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, yes
getitem()
should also take acontexts
parameterThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wait a second, there is no
getitem()
and_chunk_getitem()
anymore :)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❤️ for removing
hasattr
hacks in general. 👍There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this so private that no one could have made use of it in a subclass?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, AFAICT.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that's my understanding as well. Or at least by prefixing with
_
we have warned users this is an implementation detail (subject to change) that they shouldn't rely on.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jakirkham: this pre-dates me, so I defer. But if it's not documented somewhere we might want to review if that holds across the board. For me,
_x
is typically valid for subclassing by developers, otherwise it would be a__x
. (i.e. public, protected, private in Java-parlance)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a Python thing generally (not specific to Zarr).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Single underscore if by far the most common thing to do, to show intent rather than enforce any privateness (which double underscore doesn't do either).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jakirkham: see also though https://stackoverflow.com/questions/41761645/when-to-use-one-or-two-underscore-in-python/41761857#comment99492184_41762806