What happened?

The data files can be found here: https://noaadata.apps.nsidc.org/NOAA/G02202_V4/north/monthly/. The example code below crashes randomly: the file being processed when the crash occurs differs between runs. This happens only when threads_per_worker is > 1 in the Client() call; n_workers does not matter, at least I could not make it crash by varying it. The traceback points to HDF5. The sketch below contrasts the two scheduler configurations.
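To make the trigger explicit, here is a small sketch of the two configurations (the exact worker counts are illustrative, not additional tested cases):

from dask.distributed import Client

# Crashes randomly somewhere inside HDF5 (see traceback below):
client = Client(n_workers=1, threads_per_worker=4)

# Could not make it crash when each worker has a single thread,
# regardless of the number of workers:
client = Client(n_workers=4, threads_per_worker=1)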
What did you expect to happen?
No response
Minimal Complete Verifiable Example
from pathlib import Path

import pandas as pd
import xarray as xr
from dask.distributed import Client

client = Client(n_workers=1, threads_per_worker=4)

DATADIR = Path("/mnt/sdc1/icec/NSIDC")

year = 2020
times = pd.date_range(f"{year}-01-01", f"{year}-12-01", freq="MS", name="time")
paths = [
    DATADIR / "monthly" / f"seaice_conc_monthly_nh_{t.strftime('%Y%m')}_f17_v04r00.nc"
    for t in times
]

for n in range(10):
    ds = xr.open_mfdataset(
        paths,
        combine="nested",
        concat_dim="tdim",
        parallel=True,
        engine="netcdf4",
    )
    del ds

HDF5-DIAG: Error detected in HDF5 (1.14.0) thread 0:
  #000: H5G.c line 442 in H5Gopen2(): unable to synchronously open group
    major: Symbol table
    minor: Unable to create file
  #001: H5G.c line 399 in H5G__open_api_common(): can't set object access arguments
    major: Symbol table
    minor: Can't set value
  #002: H5VLint.c line 2669 in H5VL_setup_acc_args(): invalid location identifier
    major: Invalid arguments to routine
    minor: Inappropriate type
  #003: H5VLint.c line 1787 in H5VL_vol_object(): invalid identifier type to function
    major: Invalid arguments to routine
    minor: Inappropriate type
HDF5-DIAG: Error detected in HDF5 (1.14.0) thread 0:
  #000: H5G.c line 887 in H5Gclose(): not a group ID
    major: Invalid arguments to routine
    minor: Inappropriate type
2023-07-16 00:35:47,833 - distributed.worker - WARNING - Compute Failed
Key:       open_dataset-09a155bb-5079-406a-83c4-737933c409c7
Function:  execute_task
args:      ((<function apply at 0x7f0001edf520>, <function open_dataset at 0x7effe3e35c60>, ['/mnt/sdc1/icec/NSIDC/monthly/seaice_conc_monthly_nh_202001_f17_v04r00.nc'], (<class 'dict'>, [['engine', 'netcdf4'], ['chunks', (<class 'dict'>, [])]])))
kwargs:    {}
Exception: "OSError(-101, 'NetCDF: HDF error')"

2023-07-16 00:35:47,834 - distributed.worker - WARNING - Compute Failed
Key:       open_dataset-14e239f4-7e16-4891-a350-b55979d4a754
Function:  execute_task
args:      ((<function apply at 0x7f0001edf520>, <function open_dataset at 0x7effe3e35c60>, ['/mnt/sdc1/icec/NSIDC/monthly/seaice_conc_monthly_nh_202011_f17_v04r00.nc'], (<class 'dict'>, [['engine', 'netcdf4'], ['chunks', (<class 'dict'>, [])]])))
kwargs:    {}
Exception: "OSError(-101, 'NetCDF: HDF error')"

---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
Cell In[1], line 19
     14 paths = [
     15     DATADIR / "monthly" / f"seaice_conc_monthly_nh_{t.strftime('%Y%m')}_f17_v04r00.nc"
     16     for t in times
     17 ]
     18 for n in range(10):
---> 19     ds = xr.open_mfdataset(
     20         paths,
     21         combine="nested",
     22         concat_dim="tdim",
     23         parallel=True,
     24         engine="netcdf4",
     25     )
     26     del ds

File ~/mambaforge/envs/icec/lib/python3.10/site-packages/xarray/backends/api.py:1050, in open_mfdataset(paths, chunks, concat_dim, compat, preprocess, engine, data_vars, coords, combine, parallel, join, attrs_file, combine_attrs, **kwargs)
   1045     datasets = [preprocess(ds) for ds in datasets]
   1047 if parallel:
   1048     # calling compute here will return the datasets/file_objs lists,
   1049     # the underlying datasets will still be stored as dask arrays
-> 1050     datasets, closers = dask.compute(datasets, closers)
   1052 # Combine all datasets, closing them in case of a ValueError
   1053 try:

File ~/mambaforge/envs/icec/lib/python3.10/site-packages/xarray/backends/api.py:570, in open_dataset()
    558 decoders = _resolve_decoders_kwargs(
    559     decode_cf,
    560     open_backend_dataset_parameters=backend.open_dataset_parameters,
    (...)
    566     decode_coords=decode_coords,
    567 )
    569 overwrite_encoded_chunks = kwargs.pop("overwrite_encoded_chunks", None)
--> 570 backend_ds = backend.open_dataset(
    571     filename_or_obj,
    572     drop_variables=drop_variables,
    573     **decoders,
    574     **kwargs,
    575 )
    576 ds = _dataset_from_backend_dataset(
    577     backend_ds,
    578     filename_or_obj,
    (...)
    588     **kwargs,
    589 )
    590 return ds

File ~/mambaforge/envs/icec/lib/python3.10/site-packages/xarray/backends/netCDF4_.py:590, in open_dataset()
    569 def open_dataset(  # type: ignore[override]  # allow LSP violation, not supporting **kwargs
    570     self,
    571     filename_or_obj: str | os.PathLike[Any] | BufferedIOBase | AbstractDataStore,
    (...)
    587     autoclose=False,
    588 ) -> Dataset:
    589     filename_or_obj = _normalize_path(filename_or_obj)
--> 590     store = NetCDF4DataStore.open(
    591         filename_or_obj,
    592         mode=mode,
    593         format=format,
    594         group=group,
    595         clobber=clobber,
    596         diskless=diskless,
    597         persist=persist,
    598         lock=lock,
    599         autoclose=autoclose,
    600     )
    602     store_entrypoint = StoreBackendEntrypoint()
    603     with close_on_error(store):

File ~/mambaforge/envs/icec/lib/python3.10/site-packages/xarray/backends/netCDF4_.py:391, in open()
    385 kwargs = dict(
    386     clobber=clobber, diskless=diskless, persist=persist, format=format
    387 )
    388 manager = CachingFileManager(
    389     netCDF4.Dataset, filename, mode=mode, kwargs=kwargs
    390 )
--> 391 return cls(manager, group=group, mode=mode, lock=lock, autoclose=autoclose)

File ~/mambaforge/envs/icec/lib/python3.10/site-packages/xarray/backends/netCDF4_.py:338, in __init__()
    336 self._group = group
    337 self._mode = mode
--> 338 self.format = self.ds.data_model
    339 self._filename = self.ds.filepath()
    340 self.is_remote = is_remote_uri(self._filename)

File ~/mambaforge/envs/icec/lib/python3.10/site-packages/xarray/backends/netCDF4_.py:400, in ds()
    398 @property
    399 def ds(self):
--> 400     return self._acquire()

File ~/mambaforge/envs/icec/lib/python3.10/site-packages/xarray/backends/netCDF4_.py:394, in _acquire()
    393 def _acquire(self, needs_lock=True):
--> 394     with self._manager.acquire_context(needs_lock) as root:
    395         ds = _nc4_require_group(root, self._group, self._mode)
    396     return ds

File ~/mambaforge/envs/icec/lib/python3.10/contextlib.py:135, in __enter__()
    133 del self.args, self.kwds, self.func
    134 try:
--> 135     return next(self.gen)
    136 except StopIteration:
    137     raise RuntimeError("generator didn't yield") from None

File ~/mambaforge/envs/icec/lib/python3.10/site-packages/xarray/backends/file_manager.py:199, in acquire_context()
    196 @contextlib.contextmanager
    197 def acquire_context(self, needs_lock=True):
    198     """Context manager for acquiring a file."""
--> 199     file, cached = self._acquire_with_cache_info(needs_lock)
    200 try:
    201     yield file

File ~/mambaforge/envs/icec/lib/python3.10/site-packages/xarray/backends/file_manager.py:217, in _acquire_with_cache_info()
    215     kwargs = kwargs.copy()
    216     kwargs["mode"] = self._mode
--> 217 file = self._opener(*self._args, **kwargs)
    218 if self._mode == "w":
    219     # ensure file doesn't get overridden when opened again
    220     self._mode = "a"

File src/netCDF4/_netCDF4.pyx:2464, in netCDF4._netCDF4.Dataset.__init__()

File src/netCDF4/_netCDF4.pyx:2027, in netCDF4._netCDF4._ensure_nc_success()

OSError: [Errno -101] NetCDF: HDF error: '/mnt/sdc1/icec/NSIDC/monthly/seaice_conc_monthly_nh_202011_f17_v04r00.nc'
MVCE confirmation
Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
Complete example — the example is self-contained, including all data and the text of any traceback.
Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
New issue — a search of GitHub Issues suggests this is not a duplicate.
Relevant log output
No response
Anything else we need to know?
No response
Environment
INSTALLED VERSIONS
commit: None
python: 3.10.12 | packaged by conda-forge | (main, Jun 23 2023, 22:40:32) [GCC 12.3.0]
python-bits: 64
OS: Linux
OS-release: 6.1.38-1-MANJARO
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.14.0
libnetcdf: 4.9.2
xarray: 2023.6.0
pandas: 2.0.3
numpy: 1.24.4
scipy: 1.11.1
netCDF4: 1.6.4
pydap: None
h5netcdf: None
h5py: 3.9.0
Nio: None
zarr: 2.15.0
cftime: 1.6.2
nc_time_axis: None
PseudoNetCDF: None
iris: None
bottleneck: None
dask: 2023.7.0
distributed: 2023.7.0
matplotlib: 3.7.1
cartopy: 0.21.1
seaborn: None
numbagg: None
fsspec: 2023.6.0
cupy: None
pint: None
sparse: 0.14.0
flox: None
numpy_groupies: None
setuptools: 68.0.0
pip: 23.2
conda: None
pytest: None
mypy: None
IPython: 8.14.0
sphinx: None
This seems to be related to #2494 and Unidata/netcdf4-python#844. Unfortunately, the latter is still open. Setting parallel=False works for me. Since this is not an xarray problem, I am closing the issue. A sketch of the workaround follows below.
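For anyone who hits the same error, here is the workaround in code form: a minimal sketch that reuses the paths list built in the MVCE above and simply turns off parallel opening.

import xarray as xr

# Opening the files serially avoids the concurrent netCDF4/HDF5 calls
# that appear to trigger the crash; the result is still a dask-backed
# dataset, only the initial open_dataset calls run in the main thread.
ds = xr.open_mfdataset(
    paths,  # same list as in the MVCE above
    combine="nested",
    concat_dim="tdim",
    parallel=False,
    engine="netcdf4",
)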