Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve missing file error handling for untemplated filenames #1377

Conversation

lizziel
Copy link
Contributor

@lizziel lizziel commented Feb 16, 2022

Description

This update adds an additional case for handling files not found in ExtData. If a file is not found and its filename does not include the template token % then an error will be thrown without additional testing.

Related Issue

geoschem/GCHP#176

Motivation and Context

This update solves the following problem reported by Sebastian Eastham (@sdeastham):

Usually, if ExtData comes across a file which it needs but cannot find it fails with an intuitive error message. However, it appears that there is an edge case which still produces a confusing error. The TransportTracers simulation requires the file EDGAR_v42_SF6_IPCC_2.generic.01x01.nc, which contains data for several years (full path: HcoDir/SF6/v2019-01/EDGAR_v42_SF6_IPCC_2.generic.01x01.nc). If the file is missing, ExtData gets very confused because it follows the logic branch for an entry covering multiple years, and which crucially assumes that there will be at least one file per year. This results in it trying to fill out the file template with successively earlier years, and testing to see if a file is present. For reasons which are not 100% clear to me, this results in a failure in ESMF rather than specifically in MAPL, and hence the confusing error.

How Has This Been Tested?

Not tested.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Trivial change (affects only documentation or cleanup)

Checklist:

  • I have tested this change with a run of GEOSgcm (if non-trivial)
  • I have added one of the required labels (0 diff, 0 diff trivial, 0 diff structural, non 0-diff)
  • I have updated the CHANGELOG.md accordingly following the style of Keep a Changelog

@lizziel lizziel added the 0 Diff The changes in this pull request have verified to be zero-diff with the target branch. label Feb 16, 2022
@lizziel lizziel requested a review from a team as a code owner February 16, 2022 20:44
@bena-nasa
Copy link
Collaborator

bena-nasa commented Feb 17, 2022

@lizziel I would do this check here in the block of code at line 2084:

 if (item%frequency == zero) then
           
           file = item%file
           Inquire(file=trim(file),EXIST=found)

If the file template has no token it by definition has zero frequency, so I would just assert here after the inquire that the file was found rather than where you did it.

@lizziel
Copy link
Contributor Author

lizziel commented Feb 17, 2022

@bena-nasa, if item%frequency means no file tokens then that is definitely a better place to put the handling. I assumed the frequency had to do with the refresh template frequency rather than the filename. The case this update is addressing is when there are multiple times in a single file so that frequency would be non-zero.

@bena-nasa
Copy link
Collaborator

bena-nasa commented Feb 17, 2022

@lizziel

@bena-nasa, if item%frequency means no file tokens then that is definitely a better place to put the handling. I assumed the frequency had to do with the refresh template frequency rather than the filename. The case this update is addressing is when there are multiple times in a single file so that frequency would be non-zero.

Actually Matt and I were looking at this and that is still the wrong place, if you look up above, this only protects if the field is 3d, 2d fields are not caught here. It actually should be here after line 1935 as if there are no tokens either the file exists or it doesn't and you can check here. Matt is testing this.

so quoting line 1929 on

              case("n2")
                 call ESMF_TimeSet(item%reff_time,yy=iyy,mm=imm,dd=idd,h=ihh,m=imn,s=0,__RC__)
                 call ESMF_TimeIntervalSet(item%frequency,startTime=start_time,m=1,__RC__)
              end select
           else
              ! couldn't find any tokens so all the data must be on one file
              call ESMF_TimeIntervalSet(item%frequency,__RC__)
             # you can check here, this is what I would add
             inquire(trim(item%file),exists=found)
             _ASSERT(found,"your message here!")

@bena-nasa
Copy link
Collaborator

@lizziel yes, matt just confirmed my previous suggestion worked. So that is where the missing file check should go in createTimeInterval, after line 1935:

              ! check if non-token file exists
              Inquire(file=trim(item%file),EXIST=found)
              _ASSERT(found,'File ' // trim(item%file) // ' not found'

@mathomp4
Copy link
Member

Actually Matt and I were looking at this and that is still the wrong place, if you look up above, this only protects if the field is 3d, 2d fields are not caught here. It actually should be here after line 1935 as if there are no tokens either the file exists or it doesn't and you can check here. Matt is testing this.

Yeah. I was going insane until Ben figured it out. I mean, I met all the logic (I thought) and yet the _ASSERT was never triggered. He finally realized only 3d files would ever get there. 😄

@lizziel
Copy link
Contributor Author

lizziel commented Feb 25, 2022

Great, thanks for looking at this so closely. I'll re-submit tomorrow.

@mathomp4
Copy link
Member

@lizziel Note that MAPL develop is currently undergoing some "maintenance". It was decided to revert some commits in develop and put them into the MAPL v3 work branch instead. So it is possible you might just have to "re-fork" or "re-baseline" your MAPL fork for this PR.

My hope is maybe you can do something simple on your side and all is well. You luckily are editing a file that I think is unaffected by our current work. (Well, the changelog will probably go conflict-nuts, but changelogs always do that.)

@lizziel lizziel force-pushed the feature/improve_missing_file_error_handling branch from 4a7eb24 to 563169f Compare February 25, 2022 16:12
@lizziel
Copy link
Contributor Author

lizziel commented Feb 25, 2022

@mathomp4 Thanks for the heads up on that. I force-pushed the accepted solution, but it is still on develop as of 2/16. Let me know when things settle and I'll rebase as needed.

@mathomp4
Copy link
Member

@mathomp4 Thanks for the heads up on that. I force-pushed the accepted solution, but it is still on develop as of 2/16. Let me know when things settle and I'll rebase as needed.

@lizziel Okay. All the develop and MAPL3 stuff is done. Try updating branch or whatever your github fork says to do (not sure myself).

@lizziel lizziel force-pushed the feature/improve_missing_file_error_handling branch from 563169f to 50ea94c Compare February 25, 2022 23:10
@lizziel
Copy link
Contributor Author

lizziel commented Feb 25, 2022

All set. Not sure if how I do it is the best way, but I just locally saved the old branch with a different name, moved this branch's HEAD way back, pulled in latest develop, cherry-picked the update, then force-pushed. I'm generally wary of force pushing but for short-lived branches used only for PRs I make an exception.

@lizziel
Copy link
Contributor Author

lizziel commented Feb 25, 2022

It's failing the CI during build. I'll take care of it.

@lizziel lizziel force-pushed the feature/improve_missing_file_error_handling branch from 50ea94c to 5adc011 Compare February 25, 2022 23:23
@lizziel
Copy link
Contributor Author

lizziel commented Feb 25, 2022

Should be all set. Could you trigger the CI?

@mathomp4
Copy link
Member

Should be all set. Could you trigger the CI?

@lizziel It seems to be working all fine right now. Most likely will pass!

@mathomp4
Copy link
Member

@lizziel Your PR found some bugs in GEOS and @bena-nasa might have a fix for your fix so that things are done more logically. We are working on a fix now.

While looking at GEOS-ESM#1377 by @lizziel, it was found that her check was a bit too powerful for GEOS in that we have some bugs in our ExtData files. But, @bena-nasa looked at the code and found that we were doing `CreateTimeInterval` at the wrong time. Previously, we were doing it during the phase of ExtData when the *entire* `ExtData.rc` file is processed. Instead, we move the call to the loop where ExtData works on the actual Imports it will be handling.
@mathomp4
Copy link
Member

@lizziel Okay. I have a PR into your branch here:

geoschem#18

Please test it and if it works for you, then you pull into your branch and then we get it here!

@tclune tclune self-requested a review February 28, 2022 18:23
tclune
tclune previously approved these changes Feb 28, 2022
Move call to CreateTimeInterval
@mathomp4
Copy link
Member

mathomp4 commented Mar 1, 2022

Thanks to @lizziel pulling in my PR, I think this is now safe. I'll do one more GEOS run and if so, we can pull it in.

@mathomp4 mathomp4 self-requested a review March 1, 2022 18:48
Copy link
Member

@mathomp4 mathomp4 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This passes all my tests I think

@mathomp4 mathomp4 merged commit 20a9b3d into GEOS-ESM:develop Mar 1, 2022
@lizziel lizziel deleted the feature/improve_missing_file_error_handling branch March 1, 2022 22:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
0 Diff The changes in this pull request have verified to be zero-diff with the target branch.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants