Experiment with interfacing with Dask #8227
Comments
As a first pass, this probably means really simple things, such as removing lines like https://github.com/astropy/astropy/blob/master/astropy/nddata/nddata.py#L235, which cause a dask array to load its values into memory just for the repr. |
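To illustrate the repr point, here is a minimal sketch of a container whose `__repr__` describes lazy data by metadata only. `LazyArray` and `Container` are hypothetical stand-ins invented for this example (they are not dask's or astropy's classes), and the "has a `compute` attribute" check is an assumed heuristic for spotting a lazy array.

```python
import numpy as np


class LazyArray:
    """Hypothetical stand-in for a lazily evaluated array such as a dask array."""

    def __init__(self, data):
        self._data = np.asarray(data)
        self.shape = self._data.shape
        self.dtype = self._data.dtype
        self.compute_calls = 0

    def compute(self):
        # Materializing the values is the expensive step that repr should avoid.
        self.compute_calls += 1
        return self._data


class Container:
    """Hypothetical NDData-like container; not astropy's actual class."""

    def __init__(self, data):
        self.data = data

    def __repr__(self):
        name = type(self).__name__
        if hasattr(self.data, "compute"):
            # Lazy input: report shape and dtype only, never touch the values.
            return (f"{name}(<lazy array, shape={self.data.shape}, "
                    f"dtype={self.data.dtype}>)")
        # Eager input: printing the values is cheap, so show them.
        values = np.array2string(np.asarray(self.data), separator=", ")
        return f"{name}({values})"


lazy = LazyArray([1.0, 2.0, 3.0])
lazy_repr = repr(Container(lazy))
eager_repr = repr(Container(np.array([1.0, 2.0, 3.0])))
```

With this kind of guard, printing the container never triggers a computation; `lazy.compute_calls` stays at zero after `repr` is called.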
Just some notes on some initial hacking: one thing we were interested in was whether it's possible to construct a quantity-like object based on dask. Doing this is pretty simple if we construct a quantity object that is just a container (not subclassing anything) for data-like objects: https://github.com/astrofrog/dasktropy/blob/master/quantities/dask_quantity.py Here's an example of usage: https://github.com/astrofrog/dasktropy/blob/master/quantities/demo_dask_quantity.ipynb Unfortunately these kinds of objects can't be passed to e.g. So one question is whether there is any possibility in future of refactoring Quantity such that it doesn't actually inherit from ndarray, but instead wraps it? If we actually re-implemented most public attributes and methods on ndarray, could we minimize any breakage? Note that this then opens up the option of having e.g. masked arrays or other types of array inside Quantity. @mhvk - I'd be interested whether you have any thoughts on this! |
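As a rough sketch of the "container, not subclass" idea described above: the class below wraps any array-like object and delegates arithmetic to it. This is not the code in the linked dasktropy repository. `ContainerQuantity` is a hypothetical name, and the unit is a plain string purely for illustration, where a real version would presumably hold a unit object and handle conversions.

```python
import numpy as np


class ContainerQuantity:
    """Hypothetical quantity that wraps, rather than subclasses, its data."""

    def __init__(self, value, unit):
        self.value = value  # any array-like: ndarray, masked array, lazy array, ...
        self.unit = unit    # plain string here; an assumption made for brevity

    def __add__(self, other):
        if not isinstance(other, ContainerQuantity):
            return NotImplemented
        if other.unit != self.unit:
            # A real implementation would convert between compatible units.
            raise ValueError(f"cannot add {other.unit!r} to {self.unit!r}")
        # The wrapped objects do the arithmetic, so their type is preserved.
        return ContainerQuantity(self.value + other.value, self.unit)

    def __mul__(self, factor):
        if isinstance(factor, ContainerQuantity):
            # Multiplying units is out of scope for this sketch.
            return NotImplemented
        # Scaling by a plain number keeps the unit unchanged.
        return ContainerQuantity(self.value * factor, self.unit)

    @property
    def shape(self):
        return self.value.shape

    def __repr__(self):
        kind = type(self.value).__name__
        return f"<ContainerQuantity {kind} shape={self.shape} unit={self.unit}>"


plain = ContainerQuantity(np.array([1.0, 2.0]), "m")
masked = ContainerQuantity(
    np.ma.masked_array([3.0, 4.0], mask=[False, True]), "m"
)
total = masked + plain   # the wrapped data stays a masked array
doubled = plain * 2
```

Because nothing here inherits from `ndarray`, the wrapped object could in principle be a masked array or a dask array. The cost, as noted above, is that every `ndarray` attribute and method users rely on would have to be re-implemented on the wrapper.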
⭐️ 🦄 🌈 ❗️ 👏 😍 👍 Are you able to measure performance improvement at this point or is it too early to do so? |
Well, at the moment it's not super useful, as it's not usable in astropy functions, so there's no point in worrying about performance yet :) |
@astrofrog - I have not directly thought about making Another option that may be worth investigating is whether it is possible to make |
I've done some alternative experimentation in spectral-cube (radio-astro-tools/spectral-cube#567), but it's unclear that any of my approaches "work". When I've tested them on other machines (i.e., not my laptop), I've gotten peculiar and unreproducible errors. I'd like to try to understand what it would take to have spectral-cube built on a dask array object from the bottom up. A lot of spectral-cube's machinery would work better in a naturally delayed framework. My main question, which I think is appropriate for this thread but not directly related (maybe it's more in line with sunpy's work), is whether we can get |
Life... finds a way. When frog DNA are involved. We have an astrofrog on the team. See? All good! |
I'm not sure what the progress is on Dask integration so rather than create a new issue, I'll just leave this here. When creating a
>>> import numpy as np
>>> import dask.array
>>> import astropy.units as u
>>> foo = np.random.rand(1)
>>> foo_dask = dask.array.from_array(foo)
>>> foo_dask
dask.array<array, shape=(1,), dtype=float64, chunksize=(1,), chunktype=numpy.ndarray>
>>> u.Quantity(foo_dask)
<Quantity [0.06568409]>
However, creating a
>>> foobar_dask = foo_dask * u.dimensionless_unscaled
>>> foobar_dask
dask.array<mul, shape=(1,), dtype=float64, chunksize=(1,), chunktype=astropy.Quantity>
However, this isn't really an ideal solution as the resulting object is a Dask array (that happens to produce a |
@wtbarnes - thanks for the update. Unfortunately, nothing really has changed on this end - as will have been clear from the thread, this is not trivial. What is changing is that numpy is moving forward with providing |
@mhvk @eteq. I'm just seeing this issue for the first time (mentioned in astropy/astropy-project#283). Proposal astropy/astropy-project#269 will outline similar ideas. Nice to see convergent evolution! I believe Proposal astropy/astropy-project#269 solves (all?) of these issues, based on work done in #12921. |
Gheez, I see indeed that my blue-sky #8227 (comment) is now becoming possible... |
Look into the different submodules to see whether there are blocking issues for adding support for Dask.
cc @Cadair @astrofrog @eteq