Regression: "ValueError: cannot unstack dimensions that do not have a MultiIndex" when unstacking a MultiIndex #5384
Comments
This does look like a bug, specifically affecting MultiIndexes containing only one Index. The issue seems to be that
   4020         non_multi_dims = [
   4021             d for d in dims if not isinstance(self.get_index(d), pd.MultiIndex)
   4022         ]
   4023         if non_multi_dims:
-> 4024             raise ValueError(
   4025                 "cannot unstack dimensions that do not "
   4026                 f"have a MultiIndex: {non_multi_dims}"
   4027             )
   4028
   4029         result = self.copy(deep=False)
   4030         for dim in dims:
ipdb> self.get_index('c')
Index([(0,)], dtype='object') # <- single index
ipdb> self
<xarray.Dataset>
Dimensions: (c: 1)
Coordinates:
  * c (c) MultiIndex # <- multi index
  - b (c) int64 0
    a (c) int64 0
Data variables:
    d (c) int64 0
I'm not sure how common it is for MultiIndexes to have a single index, but we should be general over any number. We'd definitely take a fix for this.
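To make the distinction in the debugger output concrete, here is a minimal sketch using plain pandas only. It is not xarray's internal code: `find_non_multi_dims` is a hypothetical helper that merely mirrors the `isinstance` check shown in the traceback, and the two indexes are built by hand rather than obtained from `get_index`. It assumes that the `Index([(0,)], dtype='object')` seen above behaves like an object-dtype `Index` of 1-tuples.

```python
import pandas as pd

# A genuine MultiIndex that happens to have a single level.
one_level = pd.MultiIndex.from_arrays([[0]], names=["a"])

# A plain object-dtype Index whose only element is the 1-tuple (0,).
# tupleize_cols=False stops pandas from promoting the tuples to a MultiIndex.
flat = pd.Index([(0,)], tupleize_cols=False)


def find_non_multi_dims(indexes):
    """Hypothetical helper mirroring the isinstance check in the traceback."""
    return [
        dim for dim, index in indexes.items()
        if not isinstance(index, pd.MultiIndex)
    ]


print(find_non_multi_dims({"c": one_level}))  # single-level MultiIndex
print(find_non_multi_dims({"c": flat}))       # flattened Index of tuples
```

The single-level MultiIndex passes the check, while the flattened Index of tuples is reported as a non-MultiIndex dimension, which is the condition that triggers the `ValueError` in the traceback.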
That makes sense, and it actually pretty much sums up how I encountered this. My code here is a reduction of a function I wrote that was supposed to work with a fairly general array and subset of its dimensions, and I happened to call it with a one-element dimension list.
In principle, I'd be happy to help, but I haven't gone into the
This has been introduced in #5102. I'm looking into it.
This should be fixed in #5385. Side note: the example you gave here, i.e.,
    ds = Dataset({'d': DataArray(c.data, dims=['c'])}, coords=c.coords)
    ds = ds.unstack(['c'])
should probably be deprecated after the index refactoring in Xarray (currently WIP), which aims to decouple the concepts of coordinates vs. indexes. More specifically, indexes shouldn't be passed implicitly via the
Interesting! Thanks for the heads-up, @benbovy. I'll keep my eye on that.
I'm not sure if this is a bug or I'm not using xarray correctly, but I used to be able to do this without crashing. The new behavior seems to have been introduced some time between 0.16.2 and 0.18.2.
What happened:
What you expected to happen:
The code runs without the ValueError exception.
Minimal Complete Verifiable Example:
Anything else we need to know?:
Here's the full output from the example on 0.18.2:
What confuses me is that the c dimension is shown as a MultiIndex, but it still complains that it doesn't have a MultiIndex. Directly unstacking ds.d rather than the dataset itself also fails with the same exception.
Oddly, it seems to work if I assign the coordinates after constructing the dataset:
With that workaround, or by downgrading to 0.16.2, the example doesn't crash:
Environment:
Output of xr.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.8.0 (default, Feb 25 2021, 22:10:10)
[GCC 8.4.0]
python-bits: 64
OS: Linux
OS-release: 5.4.0-73-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: None
libnetcdf: None
xarray: 0.18.2
pandas: 1.2.4
numpy: 1.20.3
scipy: 1.6.3
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: None
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2021.05.0
distributed: None
matplotlib: 3.4.2
cartopy: None
seaborn: None
numbagg: None
pint: 0.17
setuptools: 39.0.1
pip: 21.1.1
conda: None
pytest: 6.2.4
IPython: 7.23.1
sphinx: None
None