Need for a new compset naming convention #4125

Closed
jedwards4b opened this issue Nov 10, 2021 · 16 comments · Fixed by #4148

@jedwards4b
Contributor

I would like to add a new component to cesm and thus to cime for the cmeps driver. This component would be optional and only initially needed for a few compsets. I do not want to repeat the experience of adding the IAC and ESP components which have created a bit of spaghetti code in this repository and in cmeps. See also ESCOMP/CMEPS#254

@rljacob
Member

rljacob commented Nov 12, 2021

Instead of strict positional notation, we could add another key letter and let things be in any order.
so for example this long name:
1850_EAM%CMIP6_ELM%SPBC_CICE%PRES_DOCN%DOM_SROF_SGLC_SWAV
would be:
I%1850,L%ELM%SPBC,S%CICE%PRES,O%DOCN%DOM,A%EAM%CMIP6

using commas instead of underscore. The order doesn't matter. Anything not included but expected is assumed to be stub.

Still seems like there's a need for a minimum number of things to put on that line. At least an initial condition "I".
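As a rough illustration of the keyed, order-free notation proposed above, here is a minimal Python sketch. It is not CIME code. The key letters I, L, S, O and A are taken from the example; the letters R, G, W and the stub names for the atmosphere, land, ice and ocean slots are assumptions made only so that the "anything not included is stub" rule has something to fill in.

```python
# Sketch only: parse a comma-separated, keyed compset name.
# Keys I, L, S, O, A come from the example in this comment.
# Keys R, G, W and the SATM/SLND/SICE/SOCN stub names are hypothetical.
STUBS = {
    "A": "SATM", "L": "SLND", "S": "SICE", "O": "SOCN",
    "R": "SROF", "G": "SGLC", "W": "SWAV",
}


def parse_keyed(longname):
    """Return {key: value}; components that are not listed become stubs."""
    fields = {}
    for item in longname.split(","):
        # Only the first '%' separates the key; later ones belong to the value.
        key, sep, value = item.partition("%")
        if not sep or not value:
            raise ValueError(f"malformed entry: {item!r}")
        if key != "I" and key not in STUBS:
            raise ValueError(f"unknown key: {key!r}")
        if key in fields:
            raise ValueError(f"duplicate key: {key!r}")
        fields[key] = value
    if "I" not in fields:
        raise ValueError("an initial-condition 'I' entry is required")
    for key, stub in STUBS.items():
        fields.setdefault(key, stub)
    return fields
```

Because the result is a dictionary, `parse_keyed("A%EAM%CMIP6,I%1850")` and `parse_keyed("I%1850,A%EAM%CMIP6")` compare equal, which is the order independence being proposed.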

Not sure what to do with things like "_BGC" which was not a separate model but a modifier for other models (right?). If you want to turn on BGC, you instead have to modify each component setting: CAM%FOOBGC, MPASO%BARBGC.

@jedwards4b
Contributor Author

This is pretty much what we are considering, but I don't like overloading the '%' delimiter

@rljacob
Member

rljacob commented Nov 12, 2021

Sure some other character can be the delimiter.

@jedwards4b
Contributor Author

Here is what I am considering (using the compset @rljacob chose):

1850_A-EAM%CMIP6_L-ELM%SPBC_I-CICE%PRES_O-DOCN%DOM

I don't see any reason for a modifier on the date field, but it must always be first; the others can be in any order.

@mvertens
Contributor

mvertens commented Jan 4, 2022

@jedwards4b - thanks. I would suggest replacing A- with a:. Dashes are used in the test name, but colons never are.
I would also suggest that every entry have a prefix, so 1850 would be t: for time. If there are BGC entries across all components, the compset should have a driver modifier.

t:1850_a:EAM%CMIP6_l:ELM%SPBC_i:CICE%PRES_o:DOCN%DOM_d:BGC

Alternatively, we could have a 3 letter modifier for clarity (which I am starting to think might be helpful)

tim:1850_atm:EAM%CMIP6_lnd:ELM%SPBC_ice:CICE%PRES_ocn:DOCN%DOM_drv:BGC

Thoughts?
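To make the mapping concrete, here is a hedged Python sketch that rewrites a positional long name into the three-letter prefixed form. It is an illustration, not CIME's implementation. The slot order (time, atm, lnd, ice, ocn, rof, glc, wav) is read off the positional example earlier in this thread; the `glc` and `wav` prefixes, the set of stub names, and the choice to drop stubs from the output are assumptions. It also assumes no field contains an underscore and ignores case-wide modifiers such as BGC.

```python
# Sketch only: positional long name -> prefixed long name.
# Slot order follows the positional example in this thread; the "glc" and
# "wav" prefixes and the stub-name set are assumed for illustration.
POSITIONAL_PREFIXES = ("tim", "atm", "lnd", "ice", "ocn", "rof", "glc", "wav")
STUB_NAMES = {"SATM", "SLND", "SICE", "SOCN", "SROF", "SGLC", "SWAV"}


def to_prefixed(positional):
    """Prefix each positional field with its component class, omitting stubs."""
    parts = positional.split("_")
    if len(parts) > len(POSITIONAL_PREFIXES):
        raise ValueError("more fields than known component classes")
    entries = []
    for prefix, value in zip(POSITIONAL_PREFIXES, parts):
        if value in STUB_NAMES:
            continue  # assumption: an omitted component means a stub
        entries.append(f"{prefix}:{value}")
    return "_".join(entries)


print(to_prefixed("1850_EAM%CMIP6_ELM%SPBC_CICE%PRES_DOCN%DOM_SROF_SGLC_SWAV"))
```

A driver-level entry such as `drv:BGC` has no positional slot in this sketch, so it would have to be appended separately.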

@rljacob
Member

rljacob commented Jan 4, 2022

I'd still like to get rid of special case-wide modifiers like "BGC". If you want to turn on bgc in several components, you need to modify what is after the "%" in each component. That makes it more clear which ones actually have BGC.

@mvertens
Contributor

mvertens commented Jan 4, 2022

@rljacob - I see your point. But what if it's a driver-level setting that needs to be recognized by each component, and there would be problems if components do not set it consistently?

@mvertens
Contributor

mvertens commented Jan 4, 2022

So the BGC setting currently dictates a whole set of fields and co2 settings that are transferred between components. I don't see how that could be done by each component specifying this separately.

@rljacob
Member

rljacob commented Jan 4, 2022

Oh I didn't notice the "d:" which does help. But the pattern is broken. Maybe it should be "d:CPL7%BGC" or "d:CMEPS%BGC"?

@mvertens
Contributor

mvertens commented Jan 4, 2022

Yes - that sounds good. @jedwards4b and I are leaning towards a 3 letter prefix for clarity. As we get more components 1 letter will start being ambiguous.

@rljacob
Member

rljacob commented Jan 4, 2022

I agree we don't want to limit ourselves to 26 components.

@billsacks
Member

I like the direction this is going.

In terms of a specific separator character: before settling on one, would it be worth building and running a short test case with the proposed character in the case name? We have had problems with some characters in the past, e.g., in the build process. (This is relevant because people now sometimes create tests with compset long names, so the compset long name appears in the case name.)
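A cheap pre-screen before the real build test might look like the Python sketch below. It only asks whether a POSIX shell would need to quote the name, using the standard library's `shlex.quote`; it is not a substitute for actually building and running a case. For example, it says nothing about make, where ':' separates targets from prerequisites. The helper name `shell_safe` is made up for this example.

```python
import shlex


def shell_safe(name):
    """True when shlex.quote leaves `name` unchanged, i.e. no quoting is needed."""
    return shlex.quote(name) == name


for candidate in ("tim:1850_atm:EAM%CMIP6", "tim|1850_atm|EAM%CMIP6"):
    print(candidate, shell_safe(candidate))
```

Both ':' and '%' are in the set of characters that `shlex.quote` treats as safe, while '|' is not, so the first candidate passes this check and the second does not.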

@jedwards4b
Contributor Author

I will test using both alias and long name. I am referring to this page for allowed characters.

@ekluzek
Contributor

ekluzek commented Jan 4, 2022

Thanks so much for this, @jedwards4b and @mvertens. I really like where this is going with the three-letter lowercase component prefixes. This will make our matching on compset names so much more robust.

I added a suggestion that "tim" be thought of as "scenario" rather than "time" because it also covers things like SSP-RCP scenarios, Paleo, and aquaplanet.

See

#4148 (comment)

@gold2718

gold2718 commented Jan 4, 2022

This seems like a good place to ask about something I have never understood. If a compset defines a set of components, why is the time / scenario required? This requirement then requires duplicated compsets in config_compsets.xml. CAM has a bunch of those.
So why is that part of the compset name instead of a separate configuration requirement (e.g., like the --res argument)?

@jedwards4b
Copy link
Contributor Author

I think that it's mostly historical. I think that #4148 is a step toward removing it from the compset name and making it an independent argument.

Repository owner moved this from Todo ~ weeks to Done in CESM: infrastructure / cross-component SE priorities Jan 7, 2022
jedwards4b added a commit that referenced this issue Jan 7, 2022
…ming

Introduce a new compset naming convention which maintains backward compatibility and allows position independence.
This convention uses a three character lower case prefix to indicate the component class (tim, atm, lnd, ice, ...) followed by a : . Some examples are:

    tim:2000_atm:DATM%NYF_ocn:DOCN%SOMAQP
    tim:1850_lnd:DLND%SCPL
    atm:DATM%NYF_ice:DICE%SSMI_ocn:DOCN%DOM_rof:DROF%NYF_tim:2000
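For readers who want to see what position independence means in practice, here is a minimal Python sketch of a parser for the prefixed form. It is not the code from the referenced commit, and it does not attempt the backward-compatible handling of old positional names. It assumes entries are separated by underscores, that each entry is a three-letter lowercase prefix followed by ':', and that no value contains an underscore.

```python
import re

# Sketch only: three lowercase letters, a colon, then a non-empty value.
_ENTRY = re.compile(r"^([a-z]{3}):(.+)$")


def parse_prefixed(longname):
    """Return {prefix: value} for a prefixed compset long name."""
    fields = {}
    for item in longname.split("_"):
        match = _ENTRY.match(item)
        if match is None:
            raise ValueError(f"entry lacks a 'xxx:' prefix: {item!r}")
        prefix, value = match.groups()
        if prefix in fields:
            raise ValueError(f"duplicate component class: {prefix!r}")
        fields[prefix] = value
    return fields


first = parse_prefixed("tim:2000_atm:DATM%NYF_ocn:DOCN%SOMAQP")
reordered = parse_prefixed("ocn:DOCN%SOMAQP_tim:2000_atm:DATM%NYF")
print(first == reordered)
```

Since the parsed result is keyed by component class, any ordering of the same entries yields the same dictionary.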

Test suite: scripts_regression_tests, cesm prealpha tests
Test baseline:
Test namelist changes:
Test status: bit for bit

Fixes #4125

User interface changes?:

6 participants