I have written a function, process_stacked_groupby, that stacks all but one dimension of a dataset/dataarray and performs groupby-apply-combine on the stacked dimension. However, after upgrading to 0.15.1, the function ceased to work.
MCVE Code Sample
import numpy as np
import xarray as xr

# Dimensions
N = xr.DataArray(np.arange(100), dims='N', name='N')
reps = xr.DataArray(np.arange(5), dims='reps', name='reps')
horizon = xr.DataArray([1, -1], dims='horizon', name='horizon')
horizon.attrs = {'long_name': 'Horizonal', 'units': 'H'}
vertical = xr.DataArray(np.arange(1, 4), dims='vertical', name='vertical')
vertical.attrs = {'long_name': 'Vertical', 'units': 'V'}

# Variables
x = xr.DataArray(np.random.randn(len(N), len(reps), len(horizon), len(vertical)),
                 dims=['N', 'reps', 'horizon', 'vertical'],
                 name='x')
y = x * 0.1
y.name = 'y'

# Merge x, y
data = xr.merge([x, y])

# Assign coords
data = data.assign_coords(reps=reps, vertical=vertical, horizon=horizon)

# Function that stacks all but one dimension and groups by the stacked dimension.
def process_stacked_groupby(ds, dim, func, *args):

    # Function to apply to stacked groupby
    def apply_fn(ds, dim, func, *args):
        # Get groupby dim
        groupby_dim = list(ds.dims)
        groupby_dim.remove(dim)
        groupby_var = ds[groupby_dim]
        # Unstack groupby dim
        ds2 = ds.unstack(groupby_dim).squeeze()
        # Perform function
        ds3 = func(ds2, *args)
        # Add multi-index groupby_var to result
        ds3 = (ds3
               .reset_coords(drop=True)
               .assign_coords(groupby_var)
               .expand_dims(groupby_dim)
               )
        return ds3

    # Get list of dimensions
    groupby_dims = list(ds.dims)
    # Remove dimension not grouped
    groupby_dims.remove(dim)
    # Stack all but one dimension
    stack_dim = '_'.join(groupby_dims)
    ds2 = ds.stack({stack_dim: groupby_dims})
    # Groupby and apply
    ds2 = ds2.groupby(stack_dim, squeeze=False).map(apply_fn, args=(dim, func, *args))
    # Unstack
    ds2 = ds2.unstack(stack_dim)
    # Restore attrs
    for dim in groupby_dims:
        ds2[dim].attrs = ds[dim].attrs
    return ds2

# Function to apply on groupby
def fn(ds):
    return ds

# Run groupby with applied function
data.pipe(process_stacked_groupby, 'N', fn)
Expected Output
Prior to xarray=0.15.0, the above code produced the result that I wanted.
The function should be able to:
1. Stack the chosen dimensions
2. Group by the stacked dimension
3. Apply a function on each group
   a. The applied function unstacks the group coord and passes the group along to another function
   b. It then adds the multi-index stacked group coord back to the result of that function
4. Combine the groups
5. Unstack the stacked dimension
Problem Description
After upgrading to 0.15.1, the above code stopped working.
The error occurred at the line
# Unstack
ds2 = ds2.unstack(stack_dim)
with ValueError: cannot unstack dimensions that do not have a MultiIndex: ['horizon_reps_vertical'].
This happens at the 5th step, where the combined object is found not to contain any multi-index.
Somewhere in the 4th step, the combination of groups has lost the multi-index on the stacked dimension.
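To narrow down where the multi-index disappears, a small check like the one below could be run on the object before and after the groupby step. This is only a sketch: has_multiindex is a hypothetical helper (not part of xarray), and the tiny dataset and its dimension names are stand-ins for the data above. It relies on the .indexes mapping, which exposes the pandas index behind each indexed dimension.

```python
import numpy as np
import pandas as pd
import xarray as xr


def has_multiindex(obj, dim):
    """Hypothetical helper: True if `dim` is backed by a pandas MultiIndex."""
    return dim in obj.indexes and isinstance(obj.indexes[dim], pd.MultiIndex)


# Small stand-in dataset (illustrative names and sizes only).
ds = xr.Dataset(
    {"x": (("N", "reps", "vertical"), np.zeros((4, 2, 3)))},
    coords={"N": np.arange(4), "reps": np.arange(2), "vertical": np.arange(1, 4)},
)

# Stacking two dimensions creates a new dimension with a MultiIndex.
stacked = ds.stack(reps_vertical=["reps", "vertical"])

print(has_multiindex(stacked, "reps_vertical"))  # the stacked dimension
print(has_multiindex(ds, "N"))                   # an ordinary dimension
```

Calling the same helper on the result of groupby(...).map(...) inside process_stacked_groupby, just before the unstack, would show whether the combine step is where the index is dropped.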
Versions
0.15.1
Sorry for the late reply.
I have been using this function in my projects, so the example above is the most minimal functional version I have.
However, I will try to come up with a simpler example that replicates the issue.
Lastly, perhaps you have a better idea for grouping over multiple dimensions without stacking them?
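One possible direction that avoids stacking altogether, sketched here under a strong assumption: the function being applied can work on a plain 1-D NumPy array along the kept dimension, rather than on an xarray object as fn does above. In that case xr.apply_ufunc with vectorize=True loops over all other dimensions itself. The names apply_along_dim and demean are hypothetical and only for illustration.

```python
import numpy as np
import xarray as xr


def apply_along_dim(obj, dim, func):
    """Hypothetical helper: apply `func` to every 1-D slice along `dim`.

    Assumes `func` takes a 1-D NumPy array and returns one of the same length.
    """
    return xr.apply_ufunc(
        func,
        obj,
        input_core_dims=[[dim]],   # func consumes the `dim` axis
        output_core_dims=[[dim]],  # and returns an axis of the same name
        vectorize=True,            # loop over all remaining dimensions
    )


def demean(values):
    # Example per-slice operation: subtract the slice mean.
    return values - values.mean()


x = xr.DataArray(
    np.arange(24.0).reshape(4, 2, 3),
    dims=["N", "reps", "vertical"],
    name="x",
)

# apply_ufunc moves the core dimension last, so restore the original order.
result = apply_along_dim(x, "N", demean).transpose(*x.dims)
expected = x - x.mean("N")
```

This does not reproduce the groupby-based design, where each group is handed over as a Dataset with its coordinates, so it may only fit the simpler cases.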