merge drops attributes #3865

Closed
johnomotani opened this issue Mar 17, 2020 · 1 comment · Fixed by #3877

@johnomotani
Contributor

xarray.merge() drops the attrs of the Datasets being merged. They should be kept, at least when they are compatible.

MCVE Code Sample

import xarray as xr

ds1 = xr.Dataset()
ds1.attrs['a'] = 42
ds2 = xr.Dataset()
ds2.attrs['a'] = 42

merged = xr.merge([ds1, ds2])

print(merged)

The result is:

<xarray.Dataset>
Dimensions:  ()
Data variables:
    *empty*

Expected Output

<xarray.Dataset>
Dimensions:  ()
Data variables:
    *empty*
Attributes:
    a:        42

Problem Description

The current behaviour means I have to check and copy attrs to the result of merge by hand, even if the attrs of the inputs were identical or not conflicting.

I'm happy to attempt a PR to fix this.
Proposal (following pattern of compat arguments):

  • add a combine_attrs argument to xarray.merge
  • combine_attrs = 'drop': do not copy attrs (current behaviour)
  • combine_attrs = 'identical': if the attrs of all inputs are identical (using dict_equiv), copy them to the result; otherwise raise an exception
  • combine_attrs = 'no_conflicts': merge the attrs of all inputs, as long as any key shared by more than one input has the same value in each (if not, raise an exception) [I propose this as the default behaviour]
  • combine_attrs = 'override': copy the attrs from the first input to the result

This proposal should also allow combine_by_coords, etc. to preserve attributes. These should probably also take a combine_attrs argument, which would be passed through to merge.
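
To make the four proposed modes concrete, here is a minimal pure-Python sketch of how the attrs dicts could be combined. The helper name `merge_attrs_sketch` is hypothetical and is not part of xarray. It compares values with plain `==`, which is an assumption that only holds for scalar attrs; array-valued attrs would need a comparison along the lines of dict_equiv.

```python
from typing import Any, Dict, List


def merge_attrs_sketch(
    attrs_list: List[Dict[str, Any]], combine_attrs: str = "no_conflicts"
) -> Dict[str, Any]:
    """Hypothetical helper illustrating the proposed combine_attrs modes."""
    if not attrs_list or combine_attrs == "drop":
        # 'drop': the result carries no attrs (the behaviour reported above).
        return {}

    if combine_attrs == "override":
        # 'override': copy the attrs of the first input, ignore the rest.
        return dict(attrs_list[0])

    if combine_attrs == "identical":
        # 'identical': every input must have exactly the same attrs.
        first = attrs_list[0]
        for attrs in attrs_list[1:]:
            # Assumption: plain == stands in for dict_equiv (scalar values only).
            if attrs != first:
                raise ValueError("combine_attrs='identical', but attrs differ")
        return dict(first)

    if combine_attrs == "no_conflicts":
        # 'no_conflicts': take the union of keys; shared keys must agree.
        result: Dict[str, Any] = {}
        for attrs in attrs_list:
            for key, value in attrs.items():
                if key in result and result[key] != value:
                    raise ValueError(
                        f"combine_attrs='no_conflicts', but {key!r} has conflicting values"
                    )
                result[key] = value
        return result

    raise ValueError(f"unrecognised combine_attrs: {combine_attrs!r}")
```

With the two Datasets from the example above, `merge_attrs_sketch([ds1.attrs, ds2.attrs])` would return `{'a': 42}` in every mode except 'drop'.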

Versions

Current master of pydata/xarray on 17/3/2020

Output of `xr.show_versions()`

INSTALLED VERSIONS
------------------
commit: None
python: 3.6.9 (default, Nov 7 2019, 10:44:02) [GCC 8.3.0]
python-bits: 64
OS: Linux
OS-release: 5.3.0-40-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_GB.UTF-8
LOCALE: en_GB.UTF-8
libhdf5: 1.10.2
libnetcdf: 4.6.3

xarray: 0.15.0
pandas: 1.0.2
numpy: 1.18.1
scipy: 1.3.0
netCDF4: 1.5.1.2
pydap: None
h5netcdf: None
h5py: 2.9.0
Nio: None
zarr: None
cftime: 1.0.3.4
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2.12.0
distributed: None
matplotlib: 3.1.1
cartopy: None
seaborn: None
numbagg: None
setuptools: 45.2.0
pip: 9.0.1
conda: None
pytest: 4.4.1
IPython: 7.8.0
sphinx: 1.8.3

@max-sixty
Collaborator

I think that proposal sounds pretty good!

Any other thoughts @pydata/xarray ?
