Addcodecov #196
Conversation
I named the new options alt02, alt03, alt04, but we could come up with a more descriptive set of names. E.g. alt02 is very low resolution within the column (1 layer in the vertical, 1 category); alt03 is higher snow resolution (5 layers) but otherwise much simpler (and summer conditions, so maybe this doesn't really test the snow resolution). We do need to look at all of our test combos to make sure they make sense.
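For context, each of these test options is normally defined by small files under configuration/scripts/options/ that override build-time dimensions and namelist settings. The fragment below is only a sketch of what a very-low-resolution option like alt02 could look like; the file names, variable names, and values are assumptions based on the usual CICE/Icepack script conventions, not the actual contents of this PR.

```csh
# set_env.alt02 -- hypothetical build-time dimensions (csh syntax)
setenv NICECAT 1    # one thickness category
setenv NICELYR 1    # one vertical ice layer

# set_nml.alt02 -- hypothetical namelist overrides
# kitd = 0
# ktherm = 1
```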
These changes look fine. Before approving, a couple of thoughts. First, I agree that something more descriptive than alt01, alt02, etc. might be better. Second, we probably need to come up with some general rules about whether test names must change when the test changes. Let's say we are running alt01 and then we decide to change the forcing or anything else about the alt01 test, and that changes answers. Is it OK to continue to call it alt01, or does alt01 need to be retired so that we create a new test? In some ways we would want to do that for transparency as well as for regression testing: if alt01 shows a regression fail when we expect it to, that's not particularly useful. It might be better to stop testing alt01 and start testing alt01a, and the regression testing will then not show a fail. Finally, it sounds like there is still some tuning to be done with these tests to get the setup "best". Should that occur as part of this PR or as part of continued development?
I can rename the new tests to something more meaningful, but let me check them first. I want to run the output through the plotting script and see whether it looks reasonable. If not, the options might change, which might affect what I call them. And yes, this tuning should be done as part of this PR. Re keeping test names when tests change: I'm ambivalent. It makes sense to change them when the output changes, but we'll need two things: a criterion for when a change is necessary, and a naming convention. In this PR the option files themselves haven't changed, but the results change because of a change in the code. I don't think that should cause us to change the test, but in some cases it could.
If the code changes and answers change, then that's what we're testing for, and test names should not change. I believe that holds even in the case where a test namelist also has to change. I think the only situation where a test name might change is if the test itself changes and the answers change as a result of that.
I agree with that.
I looked at the plots for the new tests, and they all look reasonable. We do have a problem with oscillations in the surface temperature and sensible heat flux, most noticeable in alt03. All of the new tests have ocn_data_type = 'SHEBA' (convergence data to test ridging; see issue #149). That's a problem that won't get fixed in this PR. Suggestions for test names?
I don't know what good names might be. If these are just random-ish mixes of options, then maybe alt01, alt02, ... are fine. If we can define some reasonable name for them, like thermoX or physicsY or something, then maybe we should use a name like that. Should we add these tests to the base_suite?
@apcraig Icepack could benefit from the same testing organization that you are undertaking for CICE. In that case, we might want to abandon this PR in order to avoid the renaming issue, or we could merge it and then change things, or we could take a middle route: rename the tests temporarily here, merge the PR, and then name them alt01, alt02, etc. following the same pattern as in the CICE tests. Lots of options; we just need to keep moving forward. Preference? And yes, these tests should be added to the base_suite (or similar).
I think we should keep this PR moving forward. It's too hard to keep the CICE and Icepack scripts synchronized at this point. If we decide we want to update the Icepack scripts with some of the features of the CICE scripts, we can do that, but let's make it a separate PR and maybe even create an issue first outlining what features are desired. I think the scripts are going to slowly diverge at this point, and it's going to be hard to prevent that. I will again throw out the suggestion that we consider merging Icepack back into CICE, bringing the Icepack driver with it, making the Icepack driver a separate configuration within CICE, and then having a single set of scripts. There is a cost associated with maintaining two separate repositories, both with regard to having two script packages and also the git submodule machinery. Maybe we can revisit this in the future; I think now is too soon. I would like to get some feedback from E3SM on the Icepack separation, new interfaces, Icepack scripts, and so forth before we do anything. If E3SM keeps their own copy of the column physics and has little interest in regularly contributing/merging/testing with the tools/repo provided by the consortium, then maybe the consortium, in a year or two, can think about what works best for the consortium.
Ok, let's merge this one now, then add the new tests to the base suite and check code coverage, and go from there.
Are you able to add these tests to the base_suite before we merge the PR? If you need some help, let me know. You should be able to just add one line per test to the base_suite.ts file. You can even rearrange or remove some other tests if it makes sense with respect to coverage.
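For anyone unfamiliar with the suite files: each line of base_suite.ts names a test type, grid, processor layout, and a comma-separated list of option sets, so adding a test really is a one-line change. The layout below is illustrative only; the column spacing and the exact set names are assumptions following the usual CICE/Icepack suite convention.

```
# Test      Grid   PEs    Sets
smoke       col    1x1    alt02
restart     col    1x1    alt02
```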
I was thinking we'd merge this one without changing the base_suite so that the automated testing would work, then change base_suite separately, which presumably wouldn't pass but would be an isolated change. But I'm not sure what the automated tests are checking at the moment, so maybe this doesn't matter.
The automated testing runs the travis_suite, so we're not changing that. Also, if we do automated testing outside Travis, which we aren't quite doing yet, adding new tests is OK. If we do regression testing, it will just show that there wasn't a baseline.
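As a rough sketch of how that plays out on the command line (the flag spellings follow the CICE script conventions and are assumptions, as are the machine name and baseline tag): a suite run that asks for a baseline comparison against a tag that doesn't exist yet simply reports the missing baseline rather than a code failure.

```sh
# hypothetical invocation; check ./icepack.setup --help for the actual flags
./icepack.setup -m mymachine -e intel --suite base_suite --testid t01 --bcmp previous_tag
```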
I added alt02, alt03, and alt04 to the base_suite, both as smoke and restart tests. The new tests pass except that alt03 and alt04 fail their restart tests, so I've commented those out for now. The purpose of this PR is to get the tests set up; a separate step will be debugging the new restart tests. I also fixed the initial condition, which had a hardwired category-3 value that created problems for 1-category tests.
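The corresponding base_suite.ts entries presumably end up looking something like the excerpt below (again illustrative only, not copied from the PR), with the failing restart cases commented out pending the separate debugging step:

```
restart     col    1x1    alt02
#restart    col    1x1    alt03    # restart failure to be debugged separately
#restart    col    1x1    alt04    # restart failure to be debugged separately
```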
Bug fixes and extended test combinations for better code coverage
Developer(s): E. Hunke
Please suggest code Pull Request reviewers in the column at right.
Are the code changes bit for bit, different at roundoff level, or more substantial?
The bug fix causes tests with kcatbound /= 1 to change answers. The BGC tests are also showing different data, but I think that's the script problem we've been discussing in PR #190.
Is the documentation being updated with this PR? (Y/N) no
If not, does the documentation need to be updated separately at a later time? (Y/N) yes, if we keep these tests
Note: "Documentation" includes information on the wiki and .rst files in doc/source/,
which are used to create the online technical docs at https://readthedocs.org/projects/cice-consortium-icepack/.
Other Relevant Details:
Bug fix 1: the Icepack driver was not reading grid_nml, so kcatbound was never being reset from the code default, which is 1. With the driver now reading grid_nml, I set kcatbound to 1 in icepack_in so that the current tests would not all fail.
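As a sketch of what the relevant piece of icepack_in looks like after this fix (the exact file contents aren't reproduced here, so treat the group layout as an assumption):

```
&grid_nml
  kcatbound = 1    ! category-boundary formula; set to 1 to match the code default
                   ! that the existing tests were effectively using
/
```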
Bug fix 2: as noted in #194, there was a bug in icepack_atmo.F90 ('or' not 'and'). However, the quantity being calculated was never used, so I fixed the bug by removing those lines completely.
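Purely as a schematic of this class of bug, not the actual icepack_atmo.F90 code (the variable names and condition below are made up): a logical test used the wrong operator, but since the guarded quantity was never referenced downstream, deleting the dead computation is answer-identical and simpler than fixing the operator.

```fortran
! schematic only -- not the real code
if (flag_a .and. flag_b) then      ! wrong logical operator ('.or.' vs '.and.')
   unused_quantity = c0            ! value is never used anywhere else
endif
! fix applied in this PR: the dead assignment and its condition were removed
```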
Added several new test combinations, which capture most of the code-coverage gaps identified in #194. These run, but the output needs to be looked at carefully to confirm they are doing what we expect them to do.