
to_netcdf() fails to write if input is read from netcdf3_classic #2822

Closed
hellkite500 opened this issue Mar 19, 2019 · 6 comments
Labels
plan to close May be closeable, needs more eyeballs

Comments

@hellkite500

Code Sample

```python
import xarray as xr

with xr.open_dataset('some_netcdf_classic_file') as ds:
    data = ds
    # do something with data

# Neither of these works; see the exception below
data.to_netcdf('some_new_file', 'w')
data.to_netcdf('some_new_file', 'a')

# Writing back to the same file, or to a new file in NETCDF3_CLASSIC format, works
data.to_netcdf('some_netcdf_classic_file', 'a')
data.to_netcdf('some_new_file', 'w', 'NETCDF3_CLASSIC')
```

Problem description

```pytb
  File "/home/nels.frazier/.local/lib/python2.7/site-packages/xarray/core/dataset.py", line 1232, in to_netcdf
    compute=compute)
  File "/home/nels.frazier/.local/lib/python2.7/site-packages/xarray/backends/api.py", line 747, in to_netcdf
    unlimited_dims=unlimited_dims)
  File "/home/nels.frazier/.local/lib/python2.7/site-packages/xarray/backends/api.py", line 790, in dump_to_store
    unlimited_dims=unlimited_dims)
  File "/home/nels.frazier/.local/lib/python2.7/site-packages/xarray/backends/common.py", line 263, in store
    self.set_attributes(attributes)
  File "/home/nels.frazier/.local/lib/python2.7/site-packages/xarray/backends/common.py", line 279, in set_attributes
    self.set_attribute(k, v)
  File "/home/nels.frazier/.local/lib/python2.7/site-packages/xarray/backends/netCDF4_.py", line 421, in set_attribute
    _set_nc_attribute(self.ds, key, value)
  File "/home/nels.frazier/.local/lib/python2.7/site-packages/xarray/backends/netCDF4_.py", line 297, in _set_nc_attribute
    obj.setncattr(key, value)
  File "netCDF4/_netCDF4.pyx", line 2619, in netCDF4._netCDF4.Dataset.setncattr
  File "netCDF4/_netCDF4.pyx", line 1479, in netCDF4._netCDF4._set_att
  File "netCDF4/_netCDF4.pyx", line 1745, in netCDF4._netCDF4._ensure_nc_success
AttributeError: NetCDF: String match to name in use
```

The generated `AttributeError` is not clear at all: the file actually gets created, but the write then fails partway through, and nothing in the message points to a netCDF format incompatibility.

Expected Output

A new file created and written in the (default) NETCDF4 format with the contents of the Dataset, or at the very least an exception that clearly indicates the format incompatibility.

Output of xr.show_versions()

INSTALLED VERSIONS

commit: None
python: 2.7.5 (default, Sep 12 2018, 05:31:16)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-36)]
python-bits: 64
OS: Linux
OS-release: 3.10.0-957.5.1.el7.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: None.None
libhdf5: 1.8.18
libnetcdf: 4.4.1.1

xarray: 0.11.3
pandas: 0.22.0
numpy: 1.14.2
scipy: 1.0.0
netCDF4: 1.4.1
pydap: None
h5netcdf: None
h5py: 2.7.1
Nio: None
zarr: None
cftime: 1.0.1
PseudonetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
cyordereddict: None
dask: None
distributed: None
matplotlib: 1.2.0
cartopy: None
seaborn: None
setuptools: 40.4.3
pip: 18.1
conda: None
pytest: None
IPython: None
sphinx: None

@shoyer
Member

shoyer commented Mar 20, 2019

When you leave the with context, the original file is automatically closed.

But something like either of these should work:

```python
with xr.open_dataset('some_netcdf_classic_file') as ds:
    data = ds
    # do something with data
    data.to_netcdf('some_new_file', 'w')
```

or

```python
with xr.open_dataset('some_netcdf_classic_file') as ds:
    data = ds.load()
# do something with data
data.to_netcdf('some_new_file', 'w')
```

@hellkite500
Author

I just tried both of these in a minimal example and I still get the same exception.

@shoyer
Member

shoyer commented Mar 20, 2019

OK, in that case please provide a full example that I can run.

@hellkite500
Author

minimal_example.tar.gz
Here is a netCDF file and the minimal code to reproduce `AttributeError: NetCDF: String match to name in use`.

@shoyer
Member

shoyer commented Mar 22, 2019

I can reproduce this with libnetcdf 4.6.2.

It looks like this problem is associated with having a _NCProperties attribute on a netCDF3 file:

```pycon
>>> netCDF4.Dataset('some_netcdf_classic_file_simple.nc')
<class 'netCDF4._netCDF4.Dataset'>
root group (NETCDF3_CLASSIC data model, file format NETCDF3):
    _NCProperties: version=1|netcdflibversion=4.6.1|hdf5libversion=1.8.20
    Conventions: CF-1.5
    featureType: timeSeries
    NCO: netCDF Operators version 4.7.5 (Homepage = http://nco.sf.net, Code = http://github.com/nco/nco)
    nco_openmp_thread_number: 1
    dimensions(sizes): feature_id(1)
    variables(dimensions): >i4 Q_TYPE(feature_id), >f4 lat(feature_id), >f4 lon(feature_id)
    groups:
```

This appears to be fixed in libnetcdf 4.6.3 (Unidata/netcdf-c#803), if you can upgrade to that version.

In the meantime, you can manually delete the `_NCProperties` attribute, e.g., `del data.attrs['_NCProperties']`.
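A minimal sketch of that workaround; the dataset contents and attribute value here are illustrative stand-ins for what would be read from the real netCDF3 file:

```python
import xarray as xr

# Stand-in for a dataset read from a netCDF3 file whose header carries
# a stale _NCProperties attribute (the value below is illustrative).
ds = xr.Dataset({'lat': ('feature_id', [45.0])})
ds.attrs['_NCProperties'] = 'version=1|netcdflibversion=4.6.1|hdf5libversion=1.8.20'

# _NCProperties is reserved for the netCDF library itself, so drop it
# before writing. pop() with a default is a no-op if the attribute is
# absent, which makes this safe to run on any dataset.
ds.attrs.pop('_NCProperties', None)
ds.to_netcdf('some_new_file.nc', mode='w')
```

Using `attrs.pop(..., None)` instead of `del` avoids a `KeyError` for files that never had the attribute.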

Out of curiosity, where did this netCDF file come from?

@hellkite500
Author

Interesting! I wonder why the netCDF4 Python interface handles this scenario fine. It must not be using the nccopy code path.

I will see if I can test a newer netcdf library. This is definitely an edge case, but thanks for providing a workaround!

The netCDF file is an excerpt from a hydrologic water-modeling input file. There are definitely some legacy workflows and files involved.
