Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

EAMxx: add p3 process rates as outputs #6938

Closed
wants to merge 1 commit into from

Conversation

mahf708
Copy link
Contributor

@mahf708 mahf708 commented Jan 24, 2025

adds p3 process rates as outputs controlled by a runtime flag.

fixes #6936

this impl is kind of hacky because
we don't yet have the facility to
request output fields adhoc
and these fields are buried deep
in the p3 impl; in the future,
this could be edited to be done
more smartly, but for now, this
may be just enough...
@mahf708 mahf708 added the EAMxx PRs focused on capabilities for EAMxx label Jan 24, 2025
@mahf708 mahf708 requested review from bartgol and jgfouca January 24, 2025 15:32
@mahf708 mahf708 changed the title EAMxx: add some extra p3 diags EAMxx: add p3 process rates as outputs Jan 24, 2025
Copy link

PR Preview Action v1.6.0

🚀 View preview at
https://E3SM-Project.github.io/E3SM/pr-preview/pr-6938/

Built to branch gh-pages at 2025-01-24 15:34 UTC.
Preview will be ready when the GitHub Pages deployment is complete.

@mahf708
Copy link
Contributor Author

mahf708 commented Jan 24, 2025

@hassanbeydoun + @brhillman, after we get a review from Luca and/or Jim (or other reviewers), you can test this with dpxx or short runs. In the meanwhile, let me know if I missed something that we might as well add while at it?

I expect Luca and Jim to request some edits, because this was a rushed impl. I don't have much time to make this super fancy or complicated, because the p3 source code is pretty convoluted. Actually, dealing with the hardcoded unit tests was probably the most challenging and wasted most of my time. I anticipate we will have to rewrite much of p3 and shoc in the future, because I don't think they're optimal (I'd argue both performance and code readability are much below what I'd think we should have)

I know Cornard is working on a way to make this type of work less painful (i.e., by making adhoc diagnostic requests through the field manager). Once that's ready, we can rewrite this impl. But I wanted to get this out because I know people are interested in scientific analysis/validation around these rates asap.

Comment on lines +372 to +379
// if not, let's just use the buffer for the unused? field
// TODO: check if this is actually okay and doesn't have uintended consequences
// if we are not outputing these fields, we really don't care about their values
// but would this have side effects or memory issues? idk
// TODO: maybe just use more buffer and assign stuff?
history_only.P3_qr2qv_evap = m_buffer.unused;
history_only.P3_qi2qv_sublim = m_buffer.unused;
history_only.P3_qc2qr_accret = m_buffer.unused;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this may be stupid as hell, so we might need to add more to the buffer or figure out an alternative

Copy link
Member

@jgfouca jgfouca left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fine. If p3_extra_diags is off, we could just leave the views uninitialized. Then, in p3_main_part2, you could check if P3_qr2qv_evap is a "real" view (initialized and allocated) before doing the sets.

Copy link
Contributor

@bartgol bartgol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Would you mind opening an issue regarding p3_extra_diags, saying what should be changed (e.g., avoid calc if diags are not requested)? It will likely fall into oblivion, but at least we'd have tried :)

@@ -227,6 +227,7 @@ be lost if SCREAM_HACK_XML is not enabled.
<set_cld_frac_l_to_one type="logical" doc="set P3 input liquid cloud fraction to 1 everywhere">false</set_cld_frac_l_to_one>
<set_cld_frac_r_to_one type="logical" doc="set P3 input rain cloud fraction to 1 everywhere" >false</set_cld_frac_r_to_one>
<set_cld_frac_i_to_one type="logical" doc="set P3 input ice cloud fraction to 1 everywhere" >false</set_cld_frac_i_to_one>
<p3_extra_diags type="logical" doc="Extra P3 diagnostics">false</p3_extra_diags>
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bartgol is this enough for documentation or would you like something else addressed?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since you are offering... I suppose you could be more verbose. Here, the description seems just a spell out of the short code name. Maybe something like "Enable calculation of additional diagnostic output from P3"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Would you mind opening an issue regarding p3_extra_diags, saying what should be changed (e.g., avoid calc if diags are not requested)? It will likely fall into oblivion, but at least we'd have tried :)

Ah, sorry, I misunderstood your comment. Well, the larger issue is the convoluted code, but if you and Jim are happy with how I did it, it is time to merge it and let the p3 warriors (not me) do their testing

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's just merge then. I am marking this as resolved. I think the "P3 diagnostics" will be something only select few will be using.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, we have to fix a few fails first.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ooops!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  REQUIRE( isds_baseline[i].T_atm[k] == isds_cxx[i].T_atm[k] )
with expansion:
  267.12479f == 265.32697f

☠️

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I ran the basic p3_tests and p3_sk_tests, they passed fine on pm-gpu (sp and dbg). I obviously didn't try the p3_tests_omp1 etc.

d.nevapr, d.qr_evap_tend, d.vap_liq_exchange, d.vap_ice_exchange, d.liq_ice_exchange, d.pratot,
d.nevapr, d.qr_evap_tend, d.vap_liq_exchange, d.vap_ice_exchange, d.liq_ice_exchange,
d.P3_qr2qv_evap, d.P3_qi2qv_sublim, d.P3_qc2qr_accret, d.P3_qc2qr_autoconv, d.P3_qv2qi_vapdep, d.P3_qc2qi_berg, d.P3_qc2qr_ice_shed, d.P3_qc2qi_collect, d.P3_qr2qi_collect, d.P3_qc2qi_hetero_freeze, d.P3_qr2qi_immers_freeze, d.P3_qi2qr_melt,
d.pratot,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error(s) happen(s) in the comparison a few lines below...

/home/runner/_work/E3SM/E3SM/components/eamxx/src/physics/p3/tests/p3_main_unit_tests.cpp:233: FAILED:
  REQUIRE( isds_baseline[i].T_atm[k] == isds_cxx[i].T_atm[k] )
with expansion:
  279.9509309491 == 283.0331820996

I am at a loss what might be causing this...

@@ -708,7 +710,7 @@ struct P3MainPart2Data : public PhysicsTestData

struct P3MainPart3Data : public PhysicsTestData
{
static constexpr size_t NUM_ARRAYS = 34;
static constexpr size_t NUM_ARRAYS = 47;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
static constexpr size_t NUM_ARRAYS = 47;
static constexpr size_t NUM_ARRAYS = 34;

@mahf708
Copy link
Contributor Author

mahf708 commented Jan 28, 2025

@bartgol @jgfouca, I am not sure what might be causing the baseline DIFFs here. (I confirmed; all the DIFFs are baseline diffs; the MMF2 test is a different cookie, so forget it for now.)

Do you spot anything that can change answers somewhere? I am not sure. Could it be an indexing issue in the tests?

Note: the issue seems confined to the unit tests; i.e., the actual model runtime is unaffected. I will probably be inclined to re-bless/re-generate these unit tests instead of dwelling too much on it. I will bring it up in the dev call on tues

@mahf708
Copy link
Contributor Author

mahf708 commented Jan 28, 2025

After looking a little more at the problems this morning, I concluded the following:

  • the fixes will likely take quite a bit of my time
  • the fixes are outside the main eamxx runtime (in the unit tests and in the pam driver; both of which are outside my interest and scope of willingness to spend much time chasing issues)

I will therefore close this PR. People interested in these tendencies/diagnostics can simply run with this PR as a patch and the results should be trustworthy. Integrating this into the mainline code is a bit of a hassle ...

@mahf708 mahf708 closed this Jan 28, 2025
@mahf708 mahf708 deleted the mahf708/eamxx/p3-extra-diags branch January 28, 2025 15:59
@whannah1
Copy link
Contributor

  • the fixes are outside the main eamxx runtime (in the unit tests and in the pam driver; both of which are outside my interest and scope of willingness to spend much time chasing issues)

If it helps I can take ownership of the changes in PAM so you wouldn't need to worry about those tests.

@mahf708
Copy link
Contributor Author

mahf708 commented Jan 28, 2025

  • the fixes are outside the main eamxx runtime (in the unit tests and in the pam driver; both of which are outside my interest and scope of willingness to spend much time chasing issues)

If it helps I can take ownership of the changes in PAM so you wouldn't need to worry about those tests.

Too late; the MMF2 test is now part of eamxx-v1 testing (https://github.com/E3SM-Project/E3SM/actions/runs/12952656866).

In general, I am not sure if EAMxx developers should deal with the burden of maintaining the fragile PAM/MMF2 setup, but that's above my pay grade (if I can deal with it quickly, I will; if not, I call it quits). I just don't really have much time on my hands for these types of PRs. I gave it a go thinking it would take a few hours of work, but it ended up being a rabbit hole. I am calling it quits and leaving it up to people who actually want these things output. As I said, the issues are in the fragile (and hardcoded) unit tests and pam driver, so regular EAMxx cases/tests should work just fine with no additional work.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
EAMxx PRs focused on capabilities for EAMxx
Projects
None yet
Development

Successfully merging this pull request may close these issues.

EAMxx: add p3 rates diags
4 participants