Add Initial Condition Handling to EAMxx DAG Generator #6860
ef142bc to c15b1e5
Looks like I can't add reviewers, so tagging those from the original PR: @bartgol @tcclevenger @jgfouca @mahf708 @AaronDonahue
Note that the |
m_ad_status |= s_fields_created;

// If the user requested it, we can save a dictionary of the FM fields to file
auto& driver_options_pl = m_atm_params.sublist("driver_options");
// auto& driver_options_pl = m_atm_params.sublist("driver_options");
This line can be deleted
@@ -9,6 +9,7 @@ CreateADUnitTest(${TEST_BASE_NAME}
  LABELS physics mam4_srf_online_emiss mam4_constituent_fluxes
  MPI_RANKS ${TEST_RANK_START} ${TEST_RANK_END}
  FIXTURES_SETUP_INDIVIDUAL ${FIXTURES_BASE_NAME}
  PROPERTIES RESOURCE_LOCK ${FIXTURES_BASE_NAME}
Why are you serializing these tests? Is this b/c of the generated dag? Imho, we can just let them run in ||, even if they overwrite each other's dag...
This one was unrelated to the dag, though kept crashing while I was debugging. Could be idiosyncratic to whichever machine I was on, this appeared to fix it. I'm also ok with reverting and keeping an eye out
Was it only this test that failed? You could modify its input.yaml file to set the dag verbosity to 0 (or whatever turns it completely off). That is, assuming the dag was the issue...
@bartgol --getting back to this after the break, and it appears that the failures I was seeing were due to some transient computational weirdness. Tests seem to be passing without the lock, but I'll run a few more tests on my end to confirm
Looks good. I have one question on the test modification.
4f0a134 to e37b47b
@mjs271 I think you may have rebased onto the eagle-project's version of master (or onto your local version of master), rather than onto e3sm-project's master.
oof! I'll push a fix shortly
e37b47b to 8e6b5c5
…ating as such
fixed minor bug--still an issue with 2 fields in p3 for pg2 cases
print formatting fix for dag
remove 2 intermediate DAGs
add descriptor box for IC fields
8e6b5c5 to 651927e
I think this may be noted above, but to refresh:
Is this ready?
Yes. But I want to rerun tests now that we merged the mam4xx fix, to verify we are no longer non-deterministic in those tests.
The fails are unrelated. I'm merging.
Finishes PR #3101, originally submitted to the scream repo, updating the atmospheric DAG generator (atmosphere_process_dag.xpp) and atmosphere driver to get things working again. Namely:
There are now 2 checkpoints during AtmosphereDriver::initialize() at which we build the most complete DAG available and write it to file.
This is intended to serve as a diagnostic tool in case a build crashes, giving the user information about the created/initialized processes or fields.
These checkpoints occur at the end of the following functions, and the associated DAG is written out to a .dot file in the run directory:
I've added some code that enables the DAG to properly handle initial conditions and determine whether they are missing at the time when the DAG is generated. If so, this is indicated in the corresponding node (see images in original PR for examples).
In the initial DAG (within initialize_atm_procs()), an extra box is added to the DAG with a disclaimer that any fields indicated as missing may be provided by the forthcoming initial condition file.
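To illustrate the missing-field check described above, here is a minimal, self-contained sketch. It is not the code from atmosphere_process_dag.xpp: `ProcInfo` and `find_missing_fields` are hypothetical names, and the sketch assumes processes run in the order given and that a field counts as missing when neither the initial condition nor an earlier process supplies it.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for an atmosphere process:
// the fields it needs as input and the fields it computes.
struct ProcInfo {
  std::string name;
  std::vector<std::string> required;
  std::vector<std::string> computed;
};

// Return every required field that is not provided by the initial condition
// and has not been computed by an earlier process in the list.
std::set<std::string> find_missing_fields(const std::vector<ProcInfo>& procs,
                                          const std::set<std::string>& ic_fields) {
  std::set<std::string> available = ic_fields;  // fields known so far
  std::set<std::string> missing;
  for (const auto& p : procs) {
    for (const auto& f : p.required) {
      if (available.count(f) == 0) {
        missing.insert(f);  // nobody has provided this field yet
      }
    }
    for (const auto& f : p.computed) {
      available.insert(f);  // later processes may rely on these
    }
  }
  return missing;
}
```

Calling it with an empty `ic_fields` set mimics the first checkpoint, where the initial condition file has not been read yet, so fields reported as missing may still be supplied later.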
A node for the Initial Condition is now displayed on the DAG, with edges leading to the "Begin of Atm Time Step" node and to any node for a process that uses and does not change a field from the initial condition.
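The edge rule above can be sketched as a small Graphviz DOT generator. This is an illustration under assumptions, not the PR's implementation: `ProcNode`, `uses_ic_field_unchanged`, and `ic_edges_to_dot` are hypothetical names, and "uses and does not change" is interpreted here as "requires an IC field that the same process does not also compute".

```cpp
#include <algorithm>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical, simplified process description.
struct ProcNode {
  std::string name;
  std::vector<std::string> required;
  std::vector<std::string> computed;
};

// True if the process reads at least one IC field that it does not also compute.
bool uses_ic_field_unchanged(const ProcNode& p,
                             const std::set<std::string>& ic_fields) {
  for (const auto& f : p.required) {
    const bool changed =
        std::find(p.computed.begin(), p.computed.end(), f) != p.computed.end();
    if (ic_fields.count(f) > 0 && !changed) {
      return true;
    }
  }
  return false;
}

// Emit only the edges that leave the "Initial Condition" node, as DOT text.
std::string ic_edges_to_dot(const std::vector<ProcNode>& procs,
                            const std::set<std::string>& ic_fields) {
  std::ostringstream dot;
  dot << "digraph atm_dag {\n";
  dot << "  \"Initial Condition\" -> \"Begin of Atm Time Step\";\n";
  for (const auto& p : procs) {
    if (uses_ic_field_unchanged(p, ic_fields)) {
      dot << "  \"Initial Condition\" -> \"" << p.name << "\";\n";
    }
  }
  dot << "}\n";
  return dot.str();
}
```

The returned string can be written to a .dot file and rendered with Graphviz (for example `dot -Tpng`).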
I've made some minor formatting changes for readability and to more clearly indicate the different types of objects represented on the DAG--e.g., the Begin/End of Atm. Timestep nodes are colored differently, and fields are printed in green when provided directly by the initial condition at initialization.
I've also fenced off the write_dag() calls so that the DAG is written only by the root MPI rank, since I noticed occasional formatting errors, presumably due to a race condition.
Additionally, when running as an eamxx standalone test, the filenames of the DAGs are tagged with .np<N> to avoid conflicts writing to file.
Lastly, I added a RESOURCE_LOCK to the standalone test mam4_srf_online_emiss_mam4_constituent_fluxes that appeared to be failing due to incorrect order of running the tests being compared.
Note: Closes E3SM/scream Issue #1869.
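As a rough illustration of the .np<N> filename tag mentioned above, the sketch below builds a tagged name. `tagged_dag_filename` and the base name used in the usage note are hypothetical, and the sketch assumes that <N> is the number of MPI ranks and that the tag sits before the .dot extension; the PR does not state either detail.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: append ".np<N>" (assumed here to be the MPI rank count)
// to a base name so that tests run with different rank counts write to
// different .dot files.
std::string tagged_dag_filename(const std::string& base, int num_ranks) {
  assert(num_ranks > 0);  // a run always has at least one rank
  return base + ".np" + std::to_string(num_ranks) + ".dot";
}
```

For example, `tagged_dag_filename("atm_dag", 4)` yields `atm_dag.np4.dot`.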