MTG-FCI-FDHSI reader is slow, apparently not actually dask-aware #972
Just a guess, but does this line compute the solar irradiance? (`satpy/satpy/readers/fci_l1c_fdhsi.py`, line 307 at commit 1fbfda5)
Edit: by "compute" I mean compute the dask array.
That does seem suspicious; I will look into that. The line doesn't show up as expensive in the profiler, though: 79% of the time is spent in …. Having said that, the workaround for the xarray<0.13 bug should be removed if we can agree on a requirement for xarray>=0.13.
I am ok with that requirement. Do you see a major difference in performance if you remove that workaround?
Oh, just realized that it would have to wait for Python 2 to be dropped, since xarray 0.13 is Python 3 only, I think.
I can just make the workaround conditional upon the xarray version.
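A version-conditional guard might look like this (a sketch only; the body of the workaround itself is not shown in this thread):

```python
from distutils.version import LooseVersion

import xarray as xr

if LooseVersion(xr.__version__) < LooseVersion("0.13"):
    # Only apply the xarray<0.13 workaround on old versions;
    # newer versions take the direct code path.
    ...
```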
It does appear to be dask-aware; at least …. When reading just 3 files it takes 85 seconds rather than 30 minutes. Removing the workaround seems to actually make things slower. One idea: although xarray no longer segfaults on many open files, it still keeps too many files open, which may slow things down even further. See also the conversation in the pytroll Slack satpy channel, 2019-11-22 from 13:13 UTC onwards: https://pytroll.slack.com/archives/C0LNH7LMB/p1574432037452600
It also uses a lot more RAM than I would expect. Output of …:
If you use dask's ProgressBar with just a few files, do you get a lot of progress bars shown? That would suggest things are getting computed when they shouldn't be.
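A quick way to check this (a sketch; the path and channel are placeholders):

```python
import glob

from dask.diagnostics import ProgressBar
from satpy import Scene

fnames = glob.glob("/path/to/data/*")  # placeholder path

with ProgressBar():
    scn = Scene(filenames=fnames)
    scn.load([10.8])
# Any progress bars printed here, before an explicit save or compute,
# indicate that data is being computed during file handling.
```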
Need to look into dask diagnostics tools.
Here's an example for dask profiling:

```python
import glob
import datetime as dt

from dask.diagnostics import ResourceProfiler, Profiler, CacheProfiler, visualize
from satpy import Scene

fnames = glob.glob('/path/to/data/*')

with Profiler() as prof, ResourceProfiler() as rprof, CacheProfiler() as cprof:
    glbl = Scene(filenames=fnames)
    glbl.load([10.8])
    lcl = glbl.resample('euro4', resampler="nearest")
    lcl.save_datasets(base_dir='/tmp')

visualize([prof, rprof, cprof],
          file_path=dt.datetime.utcnow().strftime('/tmp/fci_profile_%Y%m%d_%H%M%S.html'),
          show=True)
```
It is actually dask-aware. The diagnostics that @pnuu suggested, applied to the earlier MWE, show that no workers are actually working, which is as expected. They also show high CPU usage and linearly increasing memory. So does a fully pytroll-free example along the following lines.
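A sketch of such a loop, assuming the group name and chunk size from the test script later in this thread (the path is a placeholder):

```python
import glob
import random

import xarray as xr

filenames = glob.glob("/path/to/fci/*BODY*.nc")  # placeholder path

for _ in range(1000):
    # Open a file with xarray's default (netCDF4) backend and close it
    # again without reading any data; the cost is in open/close alone.
    with xr.open_dataset(random.choice(filenames),
                         group="/data/vis_08/measured",
                         chunks=4096) as ds:
        pass
```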
For …, …. For a full disk image, just reading the attributes causes …. It looks like either xarray or the netCDF library overhead for opening and closing the file leads to large performance losses and memory leaks when using the … backend. If we replace the … with …: for …, …. If we use a plain …: for …, ….
With …, …. For …, …. It looks like we need to use the h5netcdf engine.
@gerritholl If you pass `engine="h5netcdf"`, does performance improve? If so, that may be the easiest solution. If not, then it may be best to file a bug with the netcdf4-python people to get this conversation started. Maybe we'll find out there is some caching we can turn off, or something similar. Edit: You'll have to have the h5netcdf package installed.
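For reference, the suggested switch looks like this (the file name is a placeholder; the group path is the one used in the test script below):

```python
import xarray as xr

# Requires the h5netcdf package to be installed.
ds = xr.open_dataset("some_fci_body_file.nc",
                     engine="h5netcdf",
                     group="/data/vis_08/measured")
```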
I should hope so. Unfortunately, the reader doesn't work at all with the h5netcdf engine:
Looks like xarray and the h5netcdf backend are not friends with the test data files: …
Resulting in: …
Will report this to xarray...
Script used for testing, running with …:

```python
import netCDF4
import h5py
import xarray
import glob
import random
import functools

srcdir = ("/media/nas/x21308/2019_09_MTG_Testdata/"
          "/FCI/FDHSI/uncompressed/20170410_RC70")
filenames = glob.glob(srcdir + "/*BODY*.nc")

# Number of open/close iterations; the last assignment wins.
n = 200
n = 1000
n = 10000

# Alternative openers, toggled by (un)commenting:
#opener = xarray.open_dataset
#opener = functools.partial(netCDF4.Dataset, mode="r")
#opener = functools.partial(open, mode="r")
#opener = functools.partial(h5py.File, mode="r")
opener = functools.partial(xarray.open_dataset,
                           engine="h5netcdf",
                           group="/data/vis_08/measured",
                           chunks=4096)

for _ in range(n):
    # Open a random file and close it again immediately; no data is read.
    with opener(random.choice(filenames)) as ds:
        pass
```
It seems h5netcdf fails to open the root group but can open the others. The previous test with …: for …, …. With …: for …, …. When using the h5netcdf backend and working around the other aforementioned problem, the full disk "reading but not reading" takes 17 minutes with a peak memory use of 459 MB, significantly down from 40 minutes and 10 GB, but still slower than I would like.
The FCI reader now uses the h5netcdf backend. See the comments on issue 972 for the rationale. The netcdf4 backend has a memory leak and even worse performance than the h5netcdf backend when opening and closing files many times, which the file handler does. This addresses pytroll#972, although I'm not yet fully satisfied.
Repeating the h5netcdf attempt but reading locally: for …, …. For reading all channels from a full disk, we have ….
If I use …, …. Maybe the ….
When reading attributes, do not access the data variable. Accessing the data variable triggers netcdf_utils to actually open (and close) the netCDF file using xarray, causing unnecessary overhead. Instead, use the fact that the netCDF reader has cached this information. More caching may be needed. This commit is related to pytroll#972.
With d7c5c84, the "full disk do nothing" case is now down to 10 minutes (originally 40 minutes, then 17 minutes after swapping the backend). The commit removes some unnecessary accesses to data variables that were causing …. But the files also store a lot of scalar or other very low-dimensionality information in data variables, and for those it might be better to cache this information (and perhaps bypass xarray completely when reading it), rather than going through the overhead of creating an xarray or xarray-dask object, such as here: …
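A minimal sketch of what such caching could look like (the class and method names are illustrative, not satpy's actual implementation):

```python
import xarray as xr

class CachingGroupHandler:
    """Cache attributes and scalar variables when the file is opened.

    Later lookups are served from the cache, so the underlying netCDF
    file does not need to be reopened for every small read.
    """

    def __init__(self, filename, group):
        with xr.open_dataset(filename, engine="h5netcdf", group=group) as ds:
            self.attrs = dict(ds.attrs)
            # Eagerly read 0-d variables; they are tiny, and caching them
            # avoids building an xarray/dask object on every later access.
            self.scalars = {name: var.values.item()
                            for name, var in ds.variables.items()
                            if var.ndim == 0}

    def __getitem__(self, name):
        return self.scalars[name]  # served from the cache, no file I/O
```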
For two channels, one file: ….
In the FCI reader, use the data variable caching implemented in the previous commit. This should address pytroll#972.
With #981 in progress, reading all channels from a full disk is now down to 4 minutes 32 seconds, with 432 MB peak RAM and 78% CPU.
Down to less than 2 minutes now, with 48% CPU, although RAM use has increased again to 2.2 GB due to increased caching.
Describe the bug
The MTG-FCI-FDHSI reader is slow and accesses files where it shouldn't. Normally, delayed execution with dask should mean that the code below doesn't really do anything, if I understand correctly.
To Reproduce
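A sketch of the kind of minimal example in question (the path is a placeholder; the reader name matches the module referenced above):

```python
import glob

from satpy import Scene

fnames = glob.glob("/path/to/RC70/*BODY*.nc")  # placeholder path

scn = Scene(filenames=fnames, reader="fci_l1c_fdhsi")
scn.load([10.8])
# No resampling, saving, or explicit compute: with lazy dask
# scheduling this should do (almost) no real work.
```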
Expected behavior
I expect that lots of work gets scheduled with dask, but nothing actually gets done, because the data are never accessed anywhere.
Actual results
When running this minimal working example, it takes 35 minutes real time and 30 minutes user time, uses around 90% CPU, and takes several gigabytes of RAM. I don't understand exactly what it's doing, but there would appear to be room for optimisation.
Excerpt from log output:
Environment Info:
Additional context
The reader was meant to be made dask-aware in #755, but this appears to have been unsuccessful, and dask-awareness is apparently not covered by the unit tests (if that is even possible).