
cfgrib loads all chunks into memory when indexing #311

Open
guidocioni opened this issue Sep 2, 2022 · 2 comments


@guidocioni

Related to dask/dask#9451 (and probably to fsspec/kerchunk#198).

When indexing (with either sel or isel) over (lat, lon) on GRIB files loaded with open_mfdataset (and thus containing chunked data), cfgrib attempts to load all chunks into memory. This causes excessive RAM consumption and slow performance.

From the discussion we had, the hypothesis is that cfgrib needs to scan the entire file even when subsetting along only a few dimensions.
Still, it should be possible to avoid loading the entire dataset into memory when performing the operation.
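For reference, a minimal sketch of the access pattern described above. Since a runnable reproduction would need actual GRIB files, this uses a small synthetic, dask-chunked dataset as a stand-in; the variable name `t2m` and all sizes are illustrative. With chunked data, the expectation is that `sel` stays lazy and only the touched chunks are materialised, which is what fails for cfgrib-backed datasets in this issue:

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for a dataset opened with
# xr.open_mfdataset("*.grib", engine="cfgrib") -- sizes are arbitrary.
ds = xr.Dataset(
    {"t2m": (("time", "lat", "lon"), np.zeros((4, 10, 20)))},
    coords={
        "time": np.arange(4),
        "lat": np.linspace(-90, 90, 10),
        "lon": np.linspace(0, 360, 20, endpoint=False),
    },
).chunk({"time": 1})  # chunked along time, as open_mfdataset would produce

# Label-based subsetting over (lat, lon); with a dask-backed dataset this
# should remain lazy, deferring I/O until .compute() / .values is called.
subset = ds["t2m"].sel(lat=slice(0, 45), lon=slice(0, 90))

print(subset.shape)   # still a lazy dask array at this point
print(subset.chunks)  # chunk layout is preserved through the selection
```

The reported behaviour is that, with the cfgrib engine, evaluating such a selection pulls every chunk of the underlying files into memory rather than only the chunks overlapping the requested region.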

@matteodefelice

I'm interested in this too. I am trying to extract a small subset from an ERA5-Land file, but, independently of the chunk size, xarray/dask tries to read the entire file into memory.

@iainrussell
Member

If I understand the problem correctly, this issue is partly because ecCodes can only read the whole message (field) from disk, even if you only want some meta-data. We have plans to improve that situation, but there is no firm time-frame for it yet. When we do, cfgrib should benefit enormously from it.
