Write-component hangs for lambert_conformal regional grids in the southern hemisphere #838

Closed
gsketefian opened this issue Sep 28, 2021 · 6 comments · Fixed by NOAA-EMC/fv3atm#497 or #1087
Labels
bug Something isn't working

Comments

@gsketefian

Description

Specifying a regional lambert_conformal write-component grid in the southern hemisphere causes the forecast model to hang.

To Reproduce

Run a custom domain/grid in the UFS SRW App that is in the southern hemisphere and uses a lambert_conformal write-component grid. Here is an example of a custom grid in the SRW App's config.sh file that fails (it is over Peru):

#
# Define custom grid.
#
GRID_GEN_METHOD="ESGgrid"

ESGgrid_LON_CTR="-75.0"
ESGgrid_LAT_CTR="-12.5"
ESGgrid_DELX="3000.0"
ESGgrid_DELY="3000.0"
ESGgrid_NX="940"
ESGgrid_NY="940"
ESGgrid_WIDE_HALO_WIDTH="6"

DT_ATMOS="45"
LAYOUT_X="20"
LAYOUT_Y="20"
BLOCKSIZE="32"

QUILTING="TRUE"
WRTCMP_write_groups="1"
WRTCMP_write_tasks_per_group=$(( 1*LAYOUT_Y ))
WRTCMP_output_grid="lambert_conformal"
WRTCMP_cen_lon="${ESGgrid_LON_CTR}"
WRTCMP_cen_lat="${ESGgrid_LAT_CTR}"
WRTCMP_stdlat1="${ESGgrid_LAT_CTR}"
WRTCMP_stdlat2="${ESGgrid_LAT_CTR}"
WRTCMP_nx="900"
WRTCMP_ny="900"
WRTCMP_lon_lwr_left="-88.0"
WRTCMP_lat_lwr_left="-24.0"
WRTCMP_dx="${ESGgrid_DELX}"
WRTCMP_dy="${ESGgrid_DELY}"

Running with this grid causes the ufs_model executable to hang until the run_fcst task in the SRW App runs out of wallclock time.
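
One property of this configuration is worth spelling out: with both standard latitudes set to -12.5, the Lambert cone constant is negative. The sketch below assumes the textbook spherical Lambert conformal conic formulas, not the code in the write component. The function cone_constants and its variable names are hypothetical and used only for illustration; the 38.5 comparison value is an arbitrary northern-hemisphere latitude, not taken from any of the test cases.

```python
import math

def cone_constants(stdlat1_deg, stdlat2_deg):
    """Return (n, F) for a spherical Lambert conformal conic projection.

    Textbook formulas, assumed here for illustration; standard latitudes
    on the equator (n == 0) are not handled.
    """
    p1 = math.radians(stdlat1_deg)
    p2 = math.radians(stdlat2_deg)
    if math.isclose(p1, p2):
        # Tangent cone: a single standard latitude.
        n = math.sin(p1)
    else:
        # Secant cone: two distinct standard latitudes.
        n = (math.log(math.cos(p1) / math.cos(p2))
             / math.log(math.tan(math.pi / 4 + p2 / 2)
                        / math.tan(math.pi / 4 + p1 / 2)))
    f = math.cos(p1) * math.tan(math.pi / 4 + p1 / 2) ** n / n
    return n, f

# Standard latitudes from the Peru grid above (WRTCMP_stdlat1 = WRTCMP_stdlat2 = -12.5).
n_peru, f_peru = cone_constants(-12.5, -12.5)
# An arbitrary northern-hemisphere value for comparison.
n_us, f_us = cone_constants(38.5, 38.5)

print("Peru: n =", n_peru, "F =", f_peru)   # both negative
print("NH:   n =", n_us, "F =", f_us)       # both positive
```

Under these formulas both n and F change sign when the standard latitudes move south of the equator, so any step in the inverse transformation that assumes a positive radius or a positive n is a natural place to look.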

Tests Conducted

I tried running short forecasts (6 hours; using FV3GFS data for ICs/LBCs; all on Hera) that use lambert_conformal write-component grids on the following domains/regions:

  1. US Southwest
  2. Central Asia
  3. Northern Pacific
  4. Peru
  5. Indian Ocean
  6. New Zealand

Cases 1-3 (in the northern hemisphere) worked with the original code, but cases 4-6 (in the southern hemisphere) failed. These tests can be found on Hera in the directories

/scratch2/BMC/det/Gerard.Ketefian/UFS_CAM/TEST_develop_custom_grids/expt_dirs/orig_code/${EXPT_SUBDIR}

where EXPT_SUBDIR can have one of the following six values:

custom_grid_ESG_03km_WRTCMP_LambertConf_CentralAsia
custom_grid_ESG_03km_WRTCMP_LambertConf_IndianOcean
custom_grid_ESG_03km_WRTCMP_LambertConf_NewZealand
custom_grid_ESG_03km_WRTCMP_LambertConf_NorthernPacific
custom_grid_ESG_03km_WRTCMP_LambertConf_Peru
custom_grid_ESG_03km_WRTCMP_LambertConf_USSouthWest

The log file for the run_fcst task (which calls the ufs_model executable) is at log/run_fcst_2019061500.log under each experiment directory. The 3 failed cases just hang during the forecast.

Possible Solution

I tried changing the Lambert conformal transformation formulas used in the write-component (in module_wrt_grid_comp.F90) to ones I use in other codes I have (obtained here). I then retried the 6 cases above. All six were successful, which suggests that the original formulas have a bug (at least for the southern hemisphere). The six successful runs with the modified formulas can be found on Hera in the directories:

/scratch2/BMC/det/Gerard.Ketefian/UFS_CAM/TEST_develop_custom_grids/expt_dirs/new_code/${EXPT_SUBDIR}

The specific modifications I made to the formulas are as follows (where module_wrt_grid_comp.F90.orig is the original file and module_wrt_grid_comp.F90.new is the file with my changes to the Lambert conformal formulas):

$ diff module_wrt_grid_comp.F90.orig module_wrt_grid_comp.F90.new
3638c3638
<             dlon=modulo(glon-c_lon+180+3600,360.)-180.D0
---
>             dlon=glon-c_lon
3644,3645c3644,3645
<             rho = sqrt(x*x+y*y)
<             theta=atan2(x,y)
---
>             rho = sign(1.0,en)*sqrt(x*x+y*y)
>             theta=atan(x/y)
3647,3649c3647
<             glon=modulo(glon+180+3600,360.)-180.D0
< !            glat=(2.0*atan((a*f/rho)**(1.0/en))-0.5*pi)*rtod
<             glat=(0.5*pi-2.0*atan((rho/(a*f))**(1.0/en)))*rtod
---
>             glat=(2.0*atan((a*f/rho)**(1.0/en))-0.5*pi)*rtod
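
To illustrate what the sign(1.0,en) factor in the diff above does, here is a minimal round-trip sketch. It assumes the textbook spherical Lambert conformal conic formulas with a single standard latitude, and it is not the code from module_wrt_grid_comp.F90. The sphere radius R and the helper names (cone_constants, radius, forward, inverse) are hypothetical. The signed_rho flag switches between a signed and an unsigned polar radius.

```python
import math

R = 6371200.0  # sphere radius in metres; assumed value, for illustration only

def cone_constants(stdlat_deg):
    """(n, F) for a tangent cone (stdlat1 == stdlat2) on a sphere."""
    p1 = math.radians(stdlat_deg)
    n = math.sin(p1)
    f = math.cos(p1) * math.tan(math.pi / 4 + p1 / 2) ** n / n
    return n, f

def radius(lat_deg, n, f):
    """Signed polar radius rho at a given latitude (negative when n < 0)."""
    return R * f / math.tan(math.pi / 4 + math.radians(lat_deg) / 2) ** n

def forward(lon, lat, cen_lon, cen_lat, n, f):
    """(lon, lat) in degrees -> projected (x, y) in metres."""
    rho = radius(lat, n, f)
    theta = n * math.radians(lon - cen_lon)
    return rho * math.sin(theta), radius(cen_lat, n, f) - rho * math.cos(theta)

def inverse(x, y, cen_lon, cen_lat, n, f, signed_rho=True):
    """Projected (x, y) in metres -> (lon, lat) in degrees."""
    dy = radius(cen_lat, n, f) - y
    rho = math.hypot(x, dy)
    if signed_rho:
        # Counterpart of sign(1.0,en)*sqrt(x*x+y*y) in the diff.
        rho = math.copysign(rho, n)
    theta = math.atan(x / dy)  # valid while |theta| < 90 degrees
    lon = cen_lon + math.degrees(theta / n)
    # math.pow raises ValueError for a negative base with a fractional exponent.
    lat = math.degrees(2.0 * math.atan(math.pow(R * f / rho, 1.0 / n)) - math.pi / 2)
    return lon, lat

# Lower-left corner and centre of the Peru write-component grid from this issue.
n, f = cone_constants(-12.5)
x, y = forward(-88.0, -24.0, -75.0, -12.5, n, f)
print(inverse(x, y, -75.0, -12.5, n, f))  # recovers (-88, -24) up to roundoff

try:
    inverse(x, y, -75.0, -12.5, n, f, signed_rho=False)
except ValueError as err:
    print("unsigned rho fails:", err)
```

In this sketch the unsigned radius makes the base of the power negative whenever n < 0, which has no real-valued result for a fractional exponent. Whether that is exactly what happens inside the write component depends on how f and y are defined there, which this sketch does not reproduce.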

@gsketefian added the bug label on Sep 28, 2021

@DusanJovic-NOAA
Collaborator

@gsketefian Please take a look at changes in this branch https://github.com/DusanJovic-NOAA/fv3atm/tree/lambert_sh

@gsketefian
Author

@DusanJovic-NOAA I left a couple of comments and a question in that branch.

Also, just wondering if you've run any tests yet on regional grids. I was starting work to create a PR for this as well since there's an SRW release coming up soon, but I ran into issues running the rt.sh tests and was trying to debug those. I'm happy to have you take over from here.

@DusanJovic-NOAA
Collaborator

I ran the full rt.sh on Hera. All tests pass except 4 tests that create history outputs on a Lambert grid, but the differences are expected (roundoff errors). I also ran one regional run with the domain over the southern hemisphere; see this run directory:
/scratch2/NCEPDEV/fv3-cam/Dusan.Jovic/sufs/simple-ufs/run/model_run

@DusanJovic-NOAA
Collaborator

> @DusanJovic-NOAA I left a couple of comments and a question in that branch.
>
> Also, just wondering if you've run any tests yet on regional grids. I was starting work to create a PR for this as well since there's an SRW release coming up soon, but I ran into issues running the rt.sh tests and was trying to debug those. I'm happy to have you take over from here.

I fixed the formula used to compute rho.

@gsketefian
Author

@DusanJovic-NOAA Ok, thanks. When you think the code is finalized, please let me know. I'd like to run with your branch the 6 cases in the SRW App that I tried when I first created this issue.

@DusanJovic-NOAA
Collaborator

> @DusanJovic-NOAA Ok, thanks. When you think the code is finalized, please let me know. I'd like to run with your branch the 6 cases in the SRW App that I tried when I first created this issue.

I think you can run the SRW tests.
