redundant copying of get_close_caching data in filter_assim #364

Closed
hkershaw-brown opened this issue Jun 23, 2022 · 1 comment · Fixed by #368

@hkershaw-brown (Member)

There is a set of arrays in filter_assim:
close_X_dist
last_close_X_dist

real(r8), allocatable :: close_obs_dist(:)
real(r8), allocatable :: close_state_dist(:)
real(r8), allocatable :: last_close_obs_dist(:)
real(r8), allocatable :: last_close_state_dist(:)

The last_close_X arrays cache the indices and distances returned by get_close_{obs/state}.

similarly for indices:

integer, allocatable :: close_obs_ind(:)
integer, allocatable :: close_state_ind(:)
integer, allocatable :: last_close_obs_ind(:)
integer, allocatable :: last_close_state_ind(:)

For large states, this copying of the dist arrays can be very expensive. Here are Ed's (@fnrliu) results from profiling a high-res MPAS test.
[Screenshot: profiling results, Screen Shot 2022-06-23 at 7.59.49 AM]

Ed has some code to adaptively turn off caching during run time:
main...fnrliu:adaptive_get_close_cached

However, rather than adaptively disabling the cache, the copying itself can be eliminated.

Proposal (from Jeff):

  • Make the close_X arrays intent(inout).
  • Eliminate the explicit caching and copying (last_close_X_dist = close_X_dist) of the dist and ind arrays. The logic can be: if we are caching and the current obs location is the same as the previous one (which would still need to be saved), do nothing; if the location is not the same, call get_close to redo the computation. See the sketch below.
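
A minimal standalone sketch of that logic, using an integer as a stand-in for a location and a hypothetical dummy_get_close routine in place of get_close_{obs/state}; the names and interfaces here are illustrative, not the actual DART ones:

program no_copy_cache_sketch
   implicit none
   integer, parameter :: r8 = selected_real_kind(12)

   integer               :: locations(4)
   integer               :: last_loc, num_close, i
   integer,  allocatable :: close_ind(:)
   real(r8), allocatable :: close_dist(:)
   logical               :: have_cache

   allocate(close_ind(10), close_dist(10))
   have_cache = .false.
   last_loc   = -1
   num_close  = 0

   ! e.g. T, U, V obs at the same location, then an obs somewhere else
   locations = [7, 7, 7, 9]

   do i = 1, size(locations)
      if (have_cache .and. locations(i) == last_loc) then
         ! same location as the previous obs: do nothing, close_ind and
         ! close_dist already hold the right answer (no last_close_X copy)
      else
         call dummy_get_close(locations(i), num_close, close_ind, close_dist)
         last_loc   = locations(i)
         have_cache = .true.
      end if
      print *, 'obs', i, 'num_close =', num_close
   end do

contains

   ! stand-in for get_close_{obs,state}: fills the close arrays in place
   subroutine dummy_get_close(loc, num_close, ind, dist)
      integer,  intent(in)    :: loc
      integer,  intent(out)   :: num_close
      integer,  intent(inout) :: ind(:)
      real(r8), intent(inout) :: dist(:)
      integer :: k
      num_close = 3
      do k = 1, num_close
         ind(k)  = loc + k
         dist(k) = real(k, r8) * 0.5_r8
      end do
   end subroutine dummy_get_close

end program no_copy_cache_sketch

With the close_X arrays passed as intent(inout), a repeated location costs nothing: the arrays already hold the previous answer and no last_close_X copy is ever made.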

@nancycollins (Collaborator)

There are several related issues here. None of them make it impossible to speed this up, but you do have to be careful to address all of them. They include:

  • We allow the localization distance to depend on the observation type, so even if successive locations are the same, if the type isn't the same it's possible the close list won't be. I believe if the user specifies any per-type distances then we turn caching off. (See the sketch after this list.)
  • It is allowed (and encouraged) that model_mods which support more than one vertical coordinate (e.g. pressure, height, scale height, level, etc.) do any vertical conversion to the localization coordinate in get_close and update the "input" list of locations. For example, if the user selects to localize in scale height, state variables on model levels or observations in pressure can be converted to scale height the first time they are encountered in the input list, so subsequent calls to get_close don't have to convert the vertical again. Whatever is done with the caching, if a model_mod updates the input list those changes need to be kept.
  • The benefit of caching the last get_close lists depends on how expensive it is to do get_close. For something like CAM with a regular grid, get_close might run fairly fast; for something like MPAS with its irregular mesh, it can be an expensive search.
  • The place we often saw a benefit was with 1) a single localization distance and 2) NCEP or other obs which were often T, U, V, Q at the same location right in a row, where the cache saved 3 get_close searches. If the input file has no obs with repeated locations then the caching is wasted space, and it's not clear how you'd know that until it's too late to make the choice.
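
On the first point, any cache test would have to compare the observation type as well as the location, and skip reuse entirely when per-type localization distances are in use. A small standalone sketch of such a check, with integer stand-ins and illustrative names rather than the real DART location and type handling:

program cache_reuse_check_sketch
   implicit none

   print *, can_reuse(7, 7, 1, 1, .false.)   ! same loc and type          -> T
   print *, can_reuse(7, 7, 1, 2, .false.)   ! same loc, different type   -> F
   print *, can_reuse(7, 7, 1, 1, .true.)    ! per-type cutoffs in use    -> F

contains

   ! decide whether the cached close list from the previous obs can be reused
   logical function can_reuse(loc, last_loc, obs_type, last_type, per_type_cutoffs)
      integer, intent(in) :: loc, last_loc, obs_type, last_type
      logical, intent(in) :: per_type_cutoffs
      ! per-type localization distances mean identical locations can still
      ! produce different close lists, so never reuse the cache in that case
      can_reuse = (.not. per_type_cutoffs) .and. &
                  loc == last_loc .and. obs_type == last_type
   end function can_reuse

end program cache_reuse_check_sketch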
