Skip to content

Commit

Permalink
Reworking how errmsg is treated in device code to remove some preproc…
Browse files Browse the repository at this point in the history
…essor variable

substitutions through the use of new local variables.

The changes in this commit affect 3 main areas of module_sf_mynn.F90:
1.) Subroutine SFCLAY_mynn
2.) Subroutine SFCLAY1D_mynn
3.) Subroutine GFS_zt_wat
Each of these areas are described in more detail below.

1.) SFCLAY_mynn

In the SFCLAY_mynn subroutine, it was possible to remove all #ifdef
substitutions of errmsg(len=*) for errmsg(len=200) because errmsg is not used in
any code regions of this subroutine that may be run on an accelerator device.
Since this is the case, errmsg(len=*) is perfectly acceptable, and can be left
alone. The OpenACC data statements within the subroutine were also updated to
remove references to errmsg as well since, again, it was not necessary to have
errmsg on the device for this subroutine.

2.) SFCLAY1D_mynn

- Creation of device_errmsg and device_errflg and proper syncing with errmsg
  and errflg

In the SFCLAY1D_mynn subroutine, it was also possible to remove all #ifdef
substitutions by instead creating a new local variable called device_errmsg
that is a copy of errmsg but with a fixed size of 512 such that it is acceptable
for use on the device. This is necessary because at certain points in the
subroutine, loops that are good to be offloaded to the device set errmsg under
certain conditions. Since these areas cannot be isolated from the parent loop
without a major rework of the loop, we must preserve a way for errmsg to be set
on the device. Since device_errmsg is a fixed size, we can do that. However,
this complicates the code a bit for error handling purposes as we now have
errmsg and device_errmsg which must be synced properly to ensure error messages
are returned to CCPP as expected. Therefore, we must keep track of when
device_errmsg is set so we can know to sync device_errmsg with errmsg. This is
done by making a new local variable called device_errflg to be device_errmsg's
complement on the device as errflg is errmsg's complement on the host. When
device_errflg is set to a nonzero integer, we then know that device_errmsg must
be synced with errmsg. This is simple to do at the end of the subroutine after
the device_errmsg on the device is copyout-ed by OpenACC, and a new IF-block
has been added for this general case.

- Special case of mid-loop return (line 1417), and the creation of
  device_special_errflg and device_special_errmsg

However, there is a special case we must handle a bit differently. In the
mid-loop return statement near line 1417, we also must perform this sync to
ensure the proper errmsg is returned in the event this return is needed.
Therefore, a similar IF-block has been created within the corresponding #ifdef
near line 2027 to ensure errmsg has the proper value before the subroutine
returns. However, since this block is in the middle of the entire code and
only executed on the host, we must first perform an OpenACC sync operation
to make sure the device_errmsg and the device_errflg on the host matches the
device_errmsg and device_errflg on the host, otherwise the incorrect values
could lead to the return statement not executing as expected.

This special case seems simple, but an extra trap lay exposed. If
device_errmsg and device_errflg is set on the device at any point now before
this IF-block, then the return statement we moved out of the loop will now
be executed for *ANY* error message, whether that was the intended course or
not. Therefore, we need to ensure this special case is only triggered for
this specific case. Unfortunately, there appears no other way than to create
two additional variables (device_special_errmsg and device_special_errflg)
to isolate this case from all other error cases. With these installed in
place of just device_errmsg and device_errflg, this special return case is
now properly handled.

- Complete Ifdef/Ifndef removal not possible

Overall, due to the nature of this special case, we have no choice but to
leave the #ifdef and #ifndef preprocessor statements in place as they are
the only things capable of moving this return statement out of the loop
without additional invasive changes to how the code operates.

3.) GFS_zt_wat

In the GFS_zt_wat subroutine, since this subroutine is called on the
device from within the main I-loop of SFCLAY1D_mynn, we have no choice but
to change all errmsg and errflg usage to device_errmsg or device_errflg,
otherwise this subroutine and the entire parent loop could not be run on
the device. Therefore, all errmsg and errflg lines have been commented out
and new, comparable lines using device_errmsg and device_errflg added in
their place. Additionally, the subroutine call variable list was updated.
  • Loading branch information
timsliwinski-noaa committed Aug 24, 2023
1 parent 9c342a5 commit 95e9ff2
Showing 1 changed file with 67 additions and 32 deletions.
99 changes: 67 additions & 32 deletions physics/module_sf_mynn.F90
Original file line number Diff line number Diff line change
Expand Up @@ -345,12 +345,7 @@ SUBROUTINE SFCLAY_mynn( &
& qsfc_wat, qsfc_lnd, qsfc_ice

! CCPP error handling
#ifndef _OPENACC
character(len=*), intent(inout) :: errmsg
#else
!Necessary since OpenACC does not support assumed-size arrays
character(len=200), intent(inout) :: errmsg
#endif
integer, intent(inout) :: errflg

!ADDITIONAL OUTPUT
Expand Down Expand Up @@ -378,7 +373,7 @@ SUBROUTINE SFCLAY_mynn( &
errmsg = ''

!$acc enter data copyin( dz8w,U3D,V3D,QV3D,QC3D,P3D,T3D, &
!$acc pattern_spp_sfc, errmsg)
!$acc pattern_spp_sfc)

!$acc enter data copyin( UST_WAT(:), UST_LND(:), UST_ICE(:), &
!$acc MOL(:), QFLX(:), HFLX(:), &
Expand Down Expand Up @@ -504,7 +499,7 @@ SUBROUTINE SFCLAY_mynn( &
!$acc exit data copyout( UST_WAT(:), UST_LND(:), UST_ICE(:), &
!$acc MOL(:), QFLX(:), HFLX(:), &
!$acc QSFC(:), QSFC_WAT(:), QSFC_LND(:), &
!$acc QSFC_ICE(:), errmsg)
!$acc QSFC_ICE(:))

!$acc exit data delete( dz8w1d(:), dz2w1d(:), U1D(:), &
!$acc V1D(:), U1D2(:), V1D2(:), &
Expand Down Expand Up @@ -666,14 +661,25 @@ SUBROUTINE SFCLAY1D_mynn(flag_iter, &
!JOE-end

! CCPP error handling
#ifndef _OPENACC
character(len=*), intent(inout) :: errmsg
#else
! Necessary since OpenACC does not support assumed-size arrays
character(len=200), intent(inout) :: errmsg
#endif
integer, intent(inout) :: errflg

! Local fixed-size errmsg character array for error messages on accelerator
! devices distinct from the host (e.g. GPUs). Necessary since OpenACC does
! not support assumed-size (len=*) arrays like errmsg. Additional
! device_errflg integer to denote when device_errmsg needs to be synced
! with errmsg.
character(len=512) :: device_errmsg
integer :: device_errflg

! Special versions of the fixed-size errmsg character array for error messages
! on the device and it's errflag counterpart. These are necessary to ensure
! the return statements at lines 1417 and 2030 are executed only for this
! special case, and not any and all error messages set on the device.
character(len=512) :: device_special_errmsg
integer :: device_special_errflg


!----------------------------------------------------------------
! LOCAL VARS
!----------------------------------------------------------------
Expand Down Expand Up @@ -723,6 +729,10 @@ SUBROUTINE SFCLAY1D_mynn(flag_iter, &
! Initialize error-handling
errflg = 0
errmsg = ''
device_errflg = errflg
device_errmsg = errmsg
device_special_errflg = errflg
device_special_errmsg = errmsg
!-------------------------------------------------------------------
!$acc update device(psim_stab, psim_unstab, psih_stab, psih_unstab)

Expand Down Expand Up @@ -770,7 +780,9 @@ SUBROUTINE SFCLAY1D_mynn(flag_iter, &
!$acc PSIT_lnd, PSIT_wat, PSIT_ice, &
!$acc ch_lnd, ch_wat, ch_ice, &
!$acc cm_lnd, cm_wat, cm_ice, &
!$acc snowh_lnd, errmsg)
!$acc snowh_lnd, &
!$acc device_errmsg, device_errflg, &
!$acc device_special_errmsg, device_special_errflg)

!$acc parallel loop present(PSFCPA, PSFC, QSFC, T1D, flag_iter, &
!$acc QSFC_wat, QSFCMR_wat, wet, TSK_wat, tskin_wat, &
Expand Down Expand Up @@ -1198,7 +1210,9 @@ SUBROUTINE SFCLAY1D_mynn(flag_iter, &
!--------------------------------------------------------------------
!--------------------------------------------------------------------

!$acc parallel loop present(flag_iter, errmsg, &
!$acc parallel loop present(flag_iter, &
!$acc device_errmsg, device_errflg, &
!$acc device_special_errmsg, device_special_errflg, &
!$acc wet, dry, icy, &
!$acc ZT_wat, ZT_lnd, ZT_ice, &
!$acc ZNT_wat, ZNT_lnd, ZNT_ice, &
Expand Down Expand Up @@ -1330,7 +1344,7 @@ SUBROUTINE SFCLAY1D_mynn(flag_iter, &
ENDIF
ELSEIF ( ISFTCFLX .EQ. 4 ) THEN
!GFS zt formulation
CALL GFS_zt_wat(ZT_wat(i),ZNTstoch_wat(i),restar,WSPD(i),ZA(i),sfc_z0_type,errmsg,errflg)
CALL GFS_zt_wat(ZT_wat(i),ZNTstoch_wat(i),restar,WSPD(i),ZA(i),sfc_z0_type,device_errmsg,device_errflg)
ZQ_wat(i)=ZT_wat(i)
ENDIF
ELSE
Expand Down Expand Up @@ -1392,10 +1406,14 @@ SUBROUTINE SFCLAY1D_mynn(flag_iter, &
ELSEIF ( IZ0TLND .EQ. 2 ) THEN
! DH note - at this point, qstar is either not initialized
! or initialized to zero, but certainly not set correctly
errmsg = 'Logic error: qstar is not set correctly when calling Yang_2008'
errflg = 1
device_special_errmsg = 'Logic error: qstar is not set correctly when calling Yang_2008'
device_special_errflg = 1
#ifndef _OPENACC
! Necessary since OpenACC does not support branching in parallel code
! Must sync errmsg and errflg with device_errmsg and device_errflg, respectively
! so that proper error message and error flag codes are returned.
errmsg = device_special_errmsg
errflg = device_special_errflg
return
#endif
CALL Yang_2008(ZNTSTOCH_lnd(i),ZT_lnd(i),ZQ_lnd(i),UST_lnd(i),MOL(I),&
Expand Down Expand Up @@ -2001,8 +2019,14 @@ SUBROUTINE SFCLAY1D_mynn(flag_iter, &
ENDDO ! end i-loop

#ifdef _OPENACC
! Necessary since OpenACC does not support branching in parallel code
IF (errflg == 1) THEN
! Necessary since OpenACC does not support branching in parallel code.
! Must sync host errflg, errmsg to determine if return must be triggered
! and correct error message and error flag code returned.
! This code is being executed on the HOST side only, pulling data from DEVICE.
!$acc exit data copyout(device_special_errflg, device_special_errmsg)
IF (device_special_errflg /= 0) THEN
errflg = device_special_errflg
errmsg = device_special_errmsg
return
ENDIF
#endif
Expand Down Expand Up @@ -2506,7 +2530,13 @@ SUBROUTINE SFCLAY1D_mynn(flag_iter, &
!$acc PSIT_lnd, PSIT_wat, PSIT_ice, &
!$acc ch_lnd, ch_wat, ch_ice, &
!$acc cm_lnd, cm_wat, cm_ice, &
!$acc errmsg)
!$acc device_errmsg, device_errflg)

! Final sync of device and host error flags and messages
IF (device_errflg /= 0) THEN
errflg = device_errflg
errmsg = device_errmsg
ENDIF

!$acc exit data delete( flag_iter, dry, wet, icy, dx, &
!$acc MAVAIL, PBLH, PSFCPA, z0pert, ztpert, &
Expand Down Expand Up @@ -3036,24 +3066,27 @@ SUBROUTINE GFS_z0_wat(z0rl_wat,ustar_wat,WSPD,z1,sfc_z0_type,redrag)
END SUBROUTINE GFS_z0_wat
!--------------------------------------------------------------------
!>\ingroup mynn_sfc
SUBROUTINE GFS_zt_wat(ztmax,z0rl_wat,restar,WSPD,z1,sfc_z0_type,errmsg,errflg)
SUBROUTINE GFS_zt_wat(ztmax,z0rl_wat,restar,WSPD,z1,sfc_z0_type,device_errmsg,device_errflg)
!$acc routine seq
real(kind_phys), INTENT(OUT) :: ztmax
real(kind_phys), INTENT(IN) :: wspd,z1,z0rl_wat,restar
INTEGER, INTENT(IN) :: sfc_z0_type
#ifndef _OPENACC
character(len=*), intent(out) :: errmsg
#else
! Necessary since OpenACC does not support assumed-size arrays
character(len=200), intent(out) :: errmsg
#endif
integer, intent(out) :: errflg

! Using device_errmsg and device_errflg rather than the CCPP errmsg and errflg
! so that this subroutine can be run on an accelerator device with OpenACC.
! character(len=*), intent(out) :: errmsg
! integer, intent(out) :: errflg
character(len=512), intent(out) :: device_errmsg
integer, intent(out) :: device_errflg

real(kind_phys) :: z0,z0max,wind10m,rat,ustar_wat
real(kind_phys), PARAMETER :: charnock = 0.014, z0s_max=.317e-2

! Initialize error-handling
errflg = 0
errmsg = ''
! errflg = 0
! errmsg = ''
device_errflg = 0
device_errmsg = ''

! z0 = 0.01 * z0rl_wat
!Already converted to meters in the wrapper
Expand Down Expand Up @@ -3084,8 +3117,10 @@ SUBROUTINE GFS_zt_wat(ztmax,z0rl_wat,restar,WSPD,z1,sfc_z0_type,errmsg,errflg)
call znot_t_v7(wind10m, ztmax) ! 10-m wind,m/s, ztmax(m)
else if (sfc_z0_type > 0) then
write(0,*)'no option for sfc_z0_type=',sfc_z0_type
errflg = 1
errmsg = 'ERROR(GFS_zt_wat): sfc_z0_type not valid.'
! errflg = 1
! errmsg = 'ERROR(GFS_zt_wat): sfc_z0_type not valid.'
device_errflg = 1
device_errmsg = 'ERROR(GFS_zt_wat): sfc_z0_type not valid.'
return

endif
Expand Down

0 comments on commit 95e9ff2

Please sign in to comment.