SCM Model Failing with MAPL develop #2601

Closed
mathomp4 opened this issue Feb 14, 2024 · 5 comments · Fixed by #2602
Labels: 🪲 Bugfix (This fixes a bug!), ❄️ Stale (This issue has been marked stale)

Comments

@mathomp4
Member

mathomp4 commented Feb 14, 2024

My nightly tests are throwing SCM run failures with MAPL develop (and release/MAPL-v3). The first failure was:

 Character Resource Parameter: GWD_INTERNAL_RESTART_FILE:gwd_internal_rst
pe=00000 FAIL at line=00648    FileIOShared.F90                         <num readers must be less than NY>
pe=00000 FAIL at line=10988    MAPL_Generic.F90                         <status=1>
pe=00000 FAIL at line=06159    MAPL_Generic.F90                         <status=1>
 ESMF_StatePrint: (pet 0):
  State name: GWD_INTERNAL
...

I consulted @atrayano (as this line was from #2592) and changed:

       _ASSERT(num_readers < ny,'num readers must be less than NY')

to:

       _ASSERT(num_readers <= ny,'num readers must be less than or equal to NY')
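To illustrate why the strict comparison can trip on a single-column run, here is a minimal Python sketch. It assumes the SCM grid has `ny = 1` and a single reader. Both values are assumptions for illustration, since the log above does not print them, and the function names are hypothetical, not MAPL routines.

```python
def readers_ok_strict(num_readers: int, ny: int) -> bool:
    """Mirror of the original check: num_readers < ny."""
    return num_readers < ny


def readers_ok_inclusive(num_readers: int, ny: int) -> bool:
    """Mirror of the relaxed check: num_readers <= ny."""
    return num_readers <= ny


# Assumed single-column values (not taken from the log).
num_readers, ny = 1, 1

print(readers_ok_strict(num_readers, ny))     # False: the original assert fires
print(readers_ok_inclusive(num_readers, ny))  # True: the relaxed assert passes
```

With one reader and one row, the only way to satisfy the strict form would be zero readers, which is why the inclusive form is the one that admits this case.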

This gets past the first error, but now:

 Character Resource Parameter: VEGDYN_INTERNAL_RESTART_FILE:vegdyn.data
pe=00000 FAIL at line=05991    MAPL_Generic.F90                         <if input file is split must supply num_files>
 ESMF_StatePrint: (pet 0):
  State name: VEGDYN_INTERNAL
     status: Unspecified intent direction, object count: 3
     reconcile needed: T
 Base name    = VEGDYN_INTERNAL
 Status: Base = Ready,  object = Ready
 Proxy        = no
 Root Info (Attributes):
{
  "ESMF": {
    "General": {
      "MAPL_GridTypeBits": 2,
      "MAPL_RestartRequired": 0,
      "MAPL_StateItemOrderList": [
        "ITY",
        "Z2CH",
        "ASCATZ0"
      ],
      "POSITIVE": "down"
    }
  }
}     object: 1,name: ASCATZ0
            type: Field
     object: 2,name: ITY
            type: Field
     object: 3,name: Z2CH
            type: Field
pe=00000 FAIL at line=01600    MAPL_Generic.F90                         <unknown error>
pe=00000 FAIL at line=01106    MAPL_Generic.F90                         <status=-1>
pe=00000 FAIL at line=01829    MAPL_Generic.F90                         <status=-1>
pe=00000 FAIL at line=01255    MAPL_Generic.F90                         <status=-1>
pe=00000 FAIL at line=01099    MAPL_Generic.F90                         <status=-1>
LANDInitialize                                1573

This is new code from #2394 by @bena-nasa. I've consulted him and he said he can take a look.

I've committed the first fix on bugfix/mathomp4/scm-fixes and have a draft PR open at #2602.

@mathomp4 mathomp4 added the 🪲 Bugfix This fixes a bug! label Feb 14, 2024
@mathomp4
Member Author

As for how to run the SCM: build GEOSgcm, then go to install/bin and run scm_setup like:

./scm_setup --expdir /path/to/directory/arm_97jul --case arm_97jul --account s1873

Obviously, change the account to whatever you like. Then go to /path/to/directory/arm_97jul and in there you can just run:

./scm_run.j |& tee log.run

By default arm_97jul runs for 30 days, but you can change CAP.rc to be 1 week or whatever. Even a 30-day run is quick.

@mathomp4 mathomp4 linked a pull request Feb 14, 2024 that will close this issue
@bena-nasa
Collaborator

The issue is that somehow, when that binary file vegdyn is loaded as an ESMF HConfig, ESMF does not fail; it reports that it could load the file as a config, which is weird. Rather than assuming the file is split whenever HConfigCreate succeeds, I can change the check to treat it as split only if the YAML file has the key. But I don't like that HConfig loaded that file in the first place without failing. However, this exposed what is probably another error that has been there forever, since @mathomp4's nightly testing uses the release build and I was using a debug build:

Mem report            after rad       GEOS_PhysicsGridComp.F90    2701   1.8% :   9.9% Mem Comm:Used
forrtl: severe (408): fort: (33): Shape mismatch: The extent of dimension 3 of array PTR3 is 72 and the corresponding extent of array PTR30 is 73

Image              PC                Routine            Line        Source
GEOSgcm.x          00000000106EA47F  Unknown               Unknown  Unknown
libMAPL.generic.s  000014C7592B0EED  mapl_genericcplco        1122  GenericCplComp.F90
libMAPL.generic.s  000014C7592708F9  mapl_genericcplco         646  GenericCplComp.F90
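Going back to the split-file check described before the traceback, the difference between "the parse succeeded" and "the parsed document has the key" can be sketched in a few lines of Python. This is only an illustration, not the ESMF or MAPL API: the parsed result is modeled as a plain Python object, the function names are hypothetical, and the key name `num_files` is an assumption borrowed from the error message in the first comment.

```python
def looks_split_by_parse(parsed) -> bool:
    """Old-style test: any successful parse is taken to mean 'split file'."""
    return parsed is not None


def looks_split_by_key(parsed, key: str = "num_files") -> bool:
    """Stricter test: only a mapping that actually carries the key counts."""
    return isinstance(parsed, dict) and key in parsed


# A lenient parser may hand back a bare scalar for content that is not a config.
parsed_from_binary = "\x00\x01 not really yaml"
# A genuine split-file description would be a mapping with the key present.
parsed_from_split = {"num_files": 6}

print(looks_split_by_parse(parsed_from_binary))  # True: misclassified as split
print(looks_split_by_key(parsed_from_binary))    # False: correctly rejected
print(looks_split_by_key(parsed_from_split))     # True
```

The stricter test does not explain why the binary file loaded at all, but it stops a spurious successful parse from being treated as a split-file description.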

@mathomp4
Member Author

I mean, if I had to guess, it's probably a bug somewhere in Datmodyn. Maybe something that should be 73 in size is only 72? Or vice versa?
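For anyone chasing this, a small Python sketch of the kind of check that pinpoints which dimension disagrees before an array copy. The array names `PTR3` and `PTR30` and the extents 72 and 73 come from the runtime error above; the horizontal extents of 1 are an assumption for a single-column case, and the helper itself is hypothetical, not part of MAPL.

```python
def find_shape_mismatches(src_shape, dst_shape):
    """Return (dimension, src_extent, dst_extent) for every differing dimension.

    Dimensions are numbered from 1 to match the Fortran runtime message.
    """
    if len(src_shape) != len(dst_shape):
        raise ValueError(
            f"rank mismatch: {len(src_shape)} vs {len(dst_shape)}"
        )
    return [
        (dim, s, d)
        for dim, (s, d) in enumerate(zip(src_shape, dst_shape), start=1)
        if s != d
    ]


# Extents of dimension 3 are from the error message; the leading 1s are assumed.
ptr3_shape = (1, 1, 72)
ptr30_shape = (1, 1, 73)

print(find_shape_mismatches(ptr3_shape, ptr30_shape))  # [(3, 72, 73)]
```

A release build skips this sort of bounds checking, which is consistent with the mismatch only showing up in the debug build.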

Couplers are beyond me at times. Might need @atrayano's help on this. I mean, I don't even know what this code really does in MAPL!


This issue has been automatically marked as stale because it has not had activity in the last 60 days. If there are no updates within 7 days, it will be closed. You can add the ":hourglass: Long Term" label to prevent the stale action from closing this issue.

@github-actions github-actions bot added the ❄️ Stale This issue has been marked stale label Sep 21, 2024
@mathomp4
Member Author

Closing this as, well, SCM works now.
