Loading of DNB data from VIIRS compact SDR is slow #940

Closed
pnuu opened this issue Oct 16, 2019 · 3 comments · Fixed by #941
Comments


pnuu commented Oct 16, 2019

Describe the bug
Loading DNB data with the viirs_compact reader is slow. Loading one granule (768 scanlines of 4064 measurements) takes a bit over 7 seconds. As the snippet below demonstrates, everything is done lazily and no dask compute() calls are made.

To Reproduce

import glob
import time
import dask

from satpy import Scene
from pyresample.test.utils import CustomScheduler

fnames = sorted(glob.glob('/home/lahtinep/data/satellite/new/SVDNBC*b41281*'))
glbl = Scene(reader='viirs_compact', filenames=fnames[0:1])
tic = time.time()
# This will raise an exception if any `compute()` calls are made
with dask.config.set(scheduler=CustomScheduler(max_computes=0)):
    glbl.load(['DNB'])
print(time.time() - tic)

Expected behavior
The .load() call should be almost instantaneous.

Actual results
The "lazy" .load() call takes 7 seconds per granule. For hncc_dnb composite this is even worse, 23 s per granule. I ran some profiling and found that satpy.readers.viirs_compact.expand_array() is the one taking the time. This function is called by both VIIRSCompactFileHandler.navigate() and, in addition twice VIIRSCompactFileHandler.angles() for hncc_dnb .

Profiling

Total time: 14.7971 s
File: /home/lahtinep/Software/pytroll/packages/satpy/satpy/readers/viirs_compact.py
Function: expand_array at line 395
Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
   395                                           @profile
   396                                           def expand_array(data,
   397                                                            scans,
   398                                                            c_align,
   399                                                            c_exp,
   400                                                            scan_size=16,
   401                                                            tpz_size=16,
   402                                                            nties=200,
   403                                                            track_offset=0.5,
   404                                                            scan_offset=0.5):
   405                                               """Expand *data* according to alignment and expansion."""
   406       192       6178.0     32.2      0.0      nties = np.asscalar(nties)
   407       192       2688.0     14.0      0.0      tpz_size = np.asscalar(tpz_size)
   408       192       2164.0     11.3      0.0      s_scan, s_track = da.meshgrid(np.arange(nties * tpz_size),
   409       192     651925.0   3395.4      4.4                                    np.arange(scans * scan_size))
   410       192     546368.0   2845.7      3.7      s_track = (s_track.reshape(scans, scan_size, nties, tpz_size) % scan_size
   411       192    1130793.0   5889.5      7.6                 + track_offset) / scan_size
   412       192     548327.0   2855.9      3.7      s_scan = (s_scan.reshape(scans, scan_size, nties, tpz_size) % tpz_size
   413       192    1081750.0   5634.1      7.3                + scan_offset) / tpz_size
   414                                           
   415       192    2057581.0  10716.6     13.9      a_scan = s_scan + s_scan * (1 - s_scan) * c_exp + s_track * (
   416       192    1958127.0  10198.6     13.2          1 - s_track) * c_align
   417       192        566.0      2.9      0.0      a_track = s_track
   418                                           
   419       192     195067.0   1016.0      1.3      data_a = data[:scans * 2:2, np.newaxis, :-1, np.newaxis]
   420       192     189617.0    987.6      1.3      data_b = data[:scans * 2:2, np.newaxis, 1:, np.newaxis]
   421       192     192634.0   1003.3      1.3      data_c = data[1:scans * 2:2, np.newaxis, 1:, np.newaxis]
   422       192     186189.0    969.7      1.3      data_d = data[1:scans * 2:2, np.newaxis, :-1, np.newaxis]
   423                                           
   424       192     596579.0   3107.2      4.0      fdata = ((1 - a_track)
   425       192    2447401.0  12746.9     16.5               * ((1 - a_scan) * data_a + a_scan * data_b)
   426       192        567.0      3.0      0.0               + a_track
   427       192    2915920.0  15187.1     19.7               * ((1 - a_scan) * data_d + a_scan * data_c))
   428       192      86701.0    451.6      0.6      return fdata.reshape(scans * scan_size, nties * tpz_size)
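
Since no compute() is triggered (max_computes=0 above), all of this time is graph-construction overhead: each weight and bilinear-combination line builds another layer of small dask operations. For illustration only, and not necessarily what the linked PR #941 does, one way to cut that overhead is to run the same arithmetic in plain NumPy and defer it as a single dask task; expand_array_numpy below just restates the profiled math on NumPy arrays, and expand_array_lazy is a hypothetical wrapper (the dtype is an assumption of the sketch):

import dask
import dask.array as da
import numpy as np


def expand_array_numpy(data, scans, c_align, c_exp, scan_size=16, tpz_size=16,
                       nties=200, track_offset=0.5, scan_offset=0.5):
    """Same tie-point expansion as the profiled code, on plain NumPy arrays."""
    s_scan, s_track = np.meshgrid(np.arange(nties * tpz_size),
                                  np.arange(scans * scan_size))
    s_track = (s_track.reshape(scans, scan_size, nties, tpz_size) % scan_size
               + track_offset) / scan_size
    s_scan = (s_scan.reshape(scans, scan_size, nties, tpz_size) % tpz_size
              + scan_offset) / tpz_size
    a_scan = (s_scan + s_scan * (1 - s_scan) * c_exp
              + s_track * (1 - s_track) * c_align)
    a_track = s_track
    data_a = data[:scans * 2:2, np.newaxis, :-1, np.newaxis]
    data_b = data[:scans * 2:2, np.newaxis, 1:, np.newaxis]
    data_c = data[1:scans * 2:2, np.newaxis, 1:, np.newaxis]
    data_d = data[1:scans * 2:2, np.newaxis, :-1, np.newaxis]
    fdata = ((1 - a_track) * ((1 - a_scan) * data_a + a_scan * data_b)
             + a_track * ((1 - a_scan) * data_d + a_scan * data_c))
    return fdata.reshape(scans * scan_size, nties * tpz_size)


def expand_array_lazy(data, scans, c_align, c_exp, scan_size=16, tpz_size=16,
                      nties=200, track_offset=0.5, scan_offset=0.5):
    """Hypothetical wrapper: defer the whole expansion as one dask task."""
    delayed = dask.delayed(expand_array_numpy)(
        data, scans, c_align, c_exp, scan_size, tpz_size, nties,
        track_offset, scan_offset)
    # dtype assumed float64 for this sketch; the real data may differ
    return da.from_delayed(delayed,
                           shape=(scans * scan_size, nties * tpz_size),
                           dtype=np.float64)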

Environment Info:

  • OS: Ubuntu 18.04 Linux
  • Satpy Version: 0.16.2.dev149+g4313bb4f (current master branch)
  • Dask version: 2.5.2
  • Xarray version: 0.14.0

Additional context
I still have old versions of satpy/dask/xarray in operations, and they seem to do this fast (less than a second for 10 DNB granules). The versions there are:

  • Satpy version: 0.11.2+11.g7047b8d
  • Dask version: 1.1.0
  • Xarray version: 0.11.3

pnuu commented Oct 16, 2019

I'll try downgrading Dask and Xarray a few steps to see if that helps.

Ping @djhoese @mraspaud


pnuu commented Oct 16, 2019

On my new server I have satpy 0.17.2 from conda-forge, dask 2.5.2, and xarray 0.13.0, and there the loading is practically instantaneous. Now I have the same versions locally, but it's still slow. Weird.


pnuu commented Oct 16, 2019

I read the logs too hastily. The loading of DNB granules in this format has always been rather slow. The logs just weren't clear on when the calls are actually made and when cached values are used. But anyway, @mraspaud sped things up quite nicely in the linked PR, so this wasn't all for nothing 😂
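
For illustration, the caching referred to above amounts to computing the expensive expansion once per file handler and reusing the result on later calls; a minimal sketch, where NavigationCacheSketch, _cached_lonlats and _compute_lonlats are hypothetical names rather than the reader's actual attributes:

class NavigationCacheSketch:
    """Hypothetical illustration of caching navigation data per file handler."""

    def __init__(self):
        self._cached_lonlats = None

    def navigate(self):
        # Run the expensive expand_array()-based expansion only once
        if self._cached_lonlats is None:
            self._cached_lonlats = self._compute_lonlats()
        return self._cached_lonlats

    def _compute_lonlats(self):
        # Stand-in for the real lon/lat expansion code
        raise NotImplementedError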
