-
Notifications
You must be signed in to change notification settings - Fork 382
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[MPAS-Framework] Add a new name_in_output
registry attribute to variables
#5120
[MPAS-Framework] Add a new name_in_output
registry attribute to variables
#5120
Conversation
These new attributes will be used to allow different names in ouput than in the code. Currently, the output name must be unique, but this causes trouble in time mean output, particularly for coordinate variables.
This can be used to set a different name in output streams than the name in code or the unique name required for input.
See E3SM-Ocean-Discussion#23 for additional discussion. |
Follow-up work based on this PR will add |
@akturner, @cbegeman, and @mark-petersen, you are free to test further if you want but you may also approve based on your review of E3SM-Ocean-Discussion#23 @matthewhoffman, I'm mostly wanting to make sure this does no harm to MALI. If there are tests you would like me to run, let me know. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approving based on my testing and comments in E3SM-Ocean-Discussion#23.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approving based on testing in E3SM-Ocean-Discussion#23.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@xylar , I looked this over and the changes make sense based on the PR description. I don't feel the need for MALI-specific testing beyond the testing already done for Ocean and Sea Ice, which certainly include fields that don't use the new functionality.
I have, however, a couple questions that you may want to answer in the PR description:
- Presumably
name_in_output
only functions as expected for pure output variables. If you tried to read in any such field (either as input or restart), I'm guessing it would be unrecognized. Is there any way to check/avoid this issues? Or at the least, add an explanation of that in the PR description and possibly somewhere in the code if there is an appropriate place (I'm not sure where that would be, maybe where the new field attributes are added inmpas_field_types.inc
?). - The PR description says "Prior to this merge, the output name must be unique, but this causes trouble in time mean output, particularly for coordinate variables." I'm confused by this statement. Presumably prior to this merge the output name is just the name in the code write? And can you explain why the old behavior caused trouble in time mean output?
- What happens if two different fields are given identical
name_in_output
values?
Thank you @matthewhoffman. I really appreciate your review and your helpful comments.
If you have a good way to detect this, I would be happy to add some code. I don't feel familiar enough with the different types of streams to know if there is an easy way to differentiate pure output streams from i/o streams like restart files. I guess I think a check is largely overkill because time-series-stats seems like the only place this functionality is likely to be used, and that is always output-only.
I will add something to the PR description for sure. I agree, there's probably not a better place in the code than
Coordinate variables are usually variables with the exact same name as their dimension. For example, we plan to add a Similarly,
I don't know what would happen. My guess is that the second variable would overwrite the first. But this almost certainly would not be something someone should try to do on purpose. The idea in giving different variables the same output name is that they would be written out in different contexts. Again, I'm not sure it's worth putting too much effort into protecting users from misusing this functionality. But I can if it's important to you. |
! outputFieldName and outputConstituentNames are alternative names for the field or its constituents for use in | ||
! output files. These should only be given names different from fieldName or constituentNames for pure output | ||
! variables such as diagnostic output and never for variables present in initial conditions or restart files. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@matthewhoffman, are you good with this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, @xylar . This is helpful. I would prefer to also see more info from the discussion above, specifically a reference to the specific use case this was implemented for (I think a specific example helps communicate the functionality), as well as the warning about possible collisions if multiple variables had the same name_in_output
in the same output stream. I agree this situation seems too difficult to try to protect against misuse, but I think it's important to document clearly what constitutes misuse.
One other clarification for me - this new attribute is accessible through Registry, right? But in this PR you are accessing it programmatically, right? I was confused by that initially.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a 3 line comment here is already pushing the standard practice. I hesitate to put a lengthy discussion in a file folks are unlikely to look at in detail. I don't want to set the president that this is the "right" place to document things in detail just because we lack proper documentation. If we expand this practice, this file will get out of hand.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR doesn't directly make use of the new functionality either in code or in the register. It just makes sure neither breaks anything. Future, component specific PRs will actually use the functionality.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If helpful, I might be able to find a little time later this week to have a video chat about this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@xylar , you're right that we don't have a good mechanism for documenting this sort of information. I've always thought framework should have been better documented, and I'm not opposed to breaking standard practice going forward. But I'm also ok with this PR serving as the documentation if you prefer to leave things as you have them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @matthewhoffman. I fully agree that better framework documentation is needed in general and that this PR isn't a great, easy to find alternative to proper documentation. Maybe we need a future E3SM hackathon on documenting the framework?
I looked over the comment here and your request again. I still feel like a longer description here would just get lost. Until there is a better place for it, I think this PR will have to suffice for documenting the purpose of this new functionality. Hopefully, the follow-up PRs to add Time
and Time_bnds
coordinates will help to demonstrate the capability by example more clearly.
Thank you @mark-petersen, @cbegeman, @akturner and @matthewhoffman for taking the time to review this. @jonbob, I believe this is ready for you to integrate when it's a good time. |
… next (PR #5120) MPAS-Framework: Add a new name_in_output registry attribute to variables This change to the MPAS-Framework can be used to set a different name in output streams than the name in code or the unique name required for input. This merge also adds corresponding outputFieldName and outputConstituentNames to each field. These new attributes will be used to allow different names in output than in the code. The name_in_output or the outputFieldName attribute must only be used for pure output variables. Since it doesn't change the name read from input files, renaming variables in an output file that also gets used as an input files (such as a restart file or initial condition) would cause unexpected results and should be avoided. [BFB]
Test merge passes:
merged to next |
merged to master |
This can be used to set a different name in output streams than the name in code or the unique name required for input.
This merge also adds corresponding
outputFieldName
andoutputConstituentNames
to each field. These new attributes will be used to allow different names in output than in the code.The
name_in_output
or theoutputFieldName
attribute must only be used for pure output variables. Since it doesn't change the name read from input files, renaming variables in an output file that also gets used as an input files (such as a restart file or initial condition) would cause unexpected results and should be avoided.Prior to this merge, the output name must be unique, but this causes trouble in time mean output, particularly for coordinate variables. Coordinate variables are usually variables with the exact same name as their dimension. For example, we plan to add a CF-compliant
Time
coordinate to supplementxtime
. TheTime
would becometimeMonthly_avg_Time
in the current version of the time-series-stats monthly average analysis member. But this is no longer a proper (or convenient) coordinate variable. It still needs to be calledTime
in the monthly output files. That is, it needs to have the same name as another variable in the code on purpose.Similarly,
temperature
becomestimeMonthly_avg_activeTracers_temperature
in time-series-stats monthly average. But with these changes, this one could be renamed to justtemperature
in the output, which would be a lot easier to work with. That kind of simplification is not yet planned but could be a nice medium-term project.[BFB]