Directory structure/naming clean-up #1150

Closed
agsalin opened this issue Feb 16, 2017 · 18 comments

@agsalin
Contributor

agsalin commented Feb 16, 2017

In trying to gain some fluency in CIME, I am finding instances where the directory structure and naming choices could use some clean-up. This is a small step towards decreasing the learning curve for new CIME developers.

Some ideas that feel like improvements to me:

  1. Move driver_cpl, externals, share, and maybe components underneath a src directory.
  2. Flatten the directory structure of share/csm_share/shr, perhaps by removing the csm_share layer. Reconsider whether all externals really are external (I think the definition is: dropped-in source code that others modify and so we don't). genf90 and CMake certainly don't fit here.
  3. Rename cime_config to be config or esm_configuration or model_configuration. cime/cime_config is redundant. Can have acme, cesm, and common sub-directories.
  4. Move the 4 buildlib scripts out of cime_config. I would expect them in the source code, not in configuration with XML lists.
  5. CIME/tests is nearly empty -- move unit tests into the src to remove top-level clutter.
  6. Remove utils from the top level. No users need to see it, and it is easily confused with tools, which they do need. Move the code currently in utils/* into a utils subdirectory beneath wherever it is used. For example, move utils/python/CIME files to scripts/utils and scripts/Tools/utils.
  7. Get rid of capital letters in the names scripts/Testing and scripts/Tools; no other directories are capitalized. scripts/Testing can be combined with utils/python/CIME/SystemTests to form scripts/testutils.
  8. scripts_regression_tests needs to be very visible, such as in scripts directory itself.
  9. We need a file called something like acme_test_suite_definitions or acme_test_lists instead of having this functionality inside utils/python/update_acme_tests.py, and it should live somewhere visible so scientists can easily add tests.
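
To make item 9 concrete, here is a minimal Python sketch of what a standalone test-list file could look like. Everything in it is hypothetical: the file name, the suite names, the test names, and the `get_tests` helper are placeholders for illustration, not the contents of update_acme_tests.py.

```python
# acme_test_lists.py (hypothetical file name, following item 9)
# Each suite names an optional parent suite and the tests it adds.
# All suite and test names below are placeholders for illustration only.
TEST_SUITES = {
    "tiny_suite": {
        "inherits": None,
        "tests": ["TEST_A.grid1.compset1"],
    },
    "developer_suite": {
        "inherits": "tiny_suite",
        "tests": ["TEST_B.grid1.compset1", "TEST_C.grid2.compset2"],
    },
}


def get_tests(suite_name, suites=TEST_SUITES):
    """Return the sorted tests in a suite, including inherited ones."""
    seen = set()
    tests = set()
    current = suite_name
    while current is not None:
        if current in seen:
            raise ValueError(f"inheritance cycle at {current!r}")
        seen.add(current)
        entry = suites[current]
        tests.update(entry["tests"])
        current = entry["inherits"]
    return sorted(tests)
```

With a layout like this, adding a test would mean editing one list in a data-only file rather than touching script logic.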

Q1: Is there any agreement to make progress along these lines?

Q2: If so, is it best done incrementally or all at once in a bigger 6.0 release? Keeping scripts and tools the same preserves the user interface, so it may not be disruptive outside of the CIME team.

Q3: What is seq? (as in seq_comm_mct.F90 )

@mvertens
Contributor

mvertens commented Feb 16, 2017 via email

@gold2718

I like some of these suggestions and don't understand others.
I also find the upper and lower case stuff confusing as current usage seems random.

In general, I would not support a change unless you (or someone) could state a good engineering reason for what a good CIME directory structure should be. We have some experience now but I don't understand why your proposals are 'better'. Some random thoughts:

  1. Your discussion of 'what is an external' is a good start here.
  2. A naming convention of some sort would be helpful if it covers everything we currently know about.
  3. A philosophy of the structure would be a useful guide. For instance, should there be a single, top-level test directory or do 'test' directories exist in some other logical places?
  4. seq stands for sequential which is a legacy term. Consult an MCT historian for more information.

@gold2718

BTW, this may not belong in this issue but another inconsistency that bothers me is xmlquery and xmlchange. I think they should be case.query and case.change to be consistent with all the other case tools.

@rljacob
Member

rljacob commented Feb 16, 2017

I think "externals" just means "exists in its own git repo". genf90 and CMake do but maybe shouldn't. If no one is in charge of syncing the subtree with the main repo, then they should be absorbed into CIME.

I think of "utils/python" as the true source of CIME. So I'd put that under src.

Someone, maybe @billsacks, had mentioned the need to rethink how everything under share is organized. We should do that as well as rename the embarrassing share/csm_share/shr. Unfortunately, that's the most disruptive to the rest of CESM so may have to wait.

@billsacks
Member

@rljacob - you may be referring to #852, which refers to CESM-Development#18 and CESM-Development#96. Sean Santos gets the credit for that.

@gold2718

  • I vote that we move the Split csm_share #852 discussion here as those issues are germane to this one. In particular, @jedwards4b's categories can be a guide to where the code ends up.
  • I like the idea of a top-level src directory (or source, we haven't had a vowel shortage in this country since Hawaii became a state).
  • I like having an externals directory for packages (e.g., MCT, PIO) which are maintained in their own repo and subtree merged into CIME.
  • I am struggling to understand the difference between a util and a tool. Anyone?

@rljacob
Member

rljacob commented Feb 16, 2017

Let's also stick to directory structure. I'm going to open a different issue for a file rename.

@agsalin
Contributor Author

agsalin commented Feb 16, 2017

My thinking is about a new person seeing cime. One thing I would hope to accomplish is to visibly separate the Fortran source, the infrastructure scripts, and the model configuration. When someone enters cime, they will probably know which of those three they are looking for. The current list of directories doesn't make this clear (e.g., it isn't clear that share is source code and utils is infrastructure). The first step would be to move all Fortran source into a common directory (e.g., src or source).

The next priority is to make top-level directories the ones that new people / users look for, and to move developer-only code down a level. So, scripts & tools should remain visible, but utils can be hidden as scripts/utils or scripts/src.

Steve's question on definition of script vs tool vs util: my interpretation based on the contents is this: scripts are for running the model, tools are separate pre/post processing executables, and utils is source code for those.

tests: needs some thought of where to put this.

Rob: I think of "utils/python" as the true source of CIME. So I'd put that under src.

Yes it is, but for the above reasons, I don't think it goes in the same place as the Fortran or as a top-level directory. Maybe best as scripts/src.

@gold2718

I don't think making the directory structure beginner friendly is a good engineering goal; that is what documentation is for. Rather, I would argue for a sustainable structure which allows good maintenance, testing, and code coverage practices to be implemented.
Also, I do not understand why utils makes a good synonym for 'tool source code'. In any case, most of our 'tools' (pre/post processing executables) are moving (or have moved) to python where the tool and the source are the same thing.

@rljacob
Member

rljacob commented Feb 16, 2017

Since we ask users of various levels to navigate the CIME directory structure, it should make some intuitive sense so you don't have to constantly refer to the documentation while moving around. User friendly AND sustainable are not incompatible.

@billsacks
Member

billsacks commented Feb 16, 2017

> I don't think making the directory structure beginner friendly is a good engineering goal

To me, "beginner friendly" roughly equates with intuitive, which helps all of us

EDIT: @rljacob beat me to it

@agsalin
Contributor Author

agsalin commented Feb 16, 2017

Regarding csm_share split-up. For my cmake implementation, I teased out the dependencies so that libraries can be built in order. csm_share/shr files are split into 3 libraries. (1) Stuff that is built first with no dependencies, (2) two files that need to be built after mct and esmf but before driver_cpl/shr, and (3) then the rest that needs to come after driver_cpl/shr.

I test by doing fresh builds with make -j 50 and checking that there are no errors about missing mod files.

I would be happy to work with someone to work on the organization and naming, since my grouping was purely on "use xxx_mod" dependencies, if it is agreed to re-organize by directory dependencies.
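
The dependency-ordered grouping described above can be sketched as a topological sort. This is only an illustration, assuming Python 3.9+ for the standard-library `graphlib` module; the library names are hypothetical stand-ins for the three csm_share groups, and the edges between them are assumptions rather than the real "use xxx_mod" graph.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical library names standing in for the three csm_share groups.
# Each key maps to the set of libraries that must be built before it.
# The edges are assumptions for this sketch, not the real dependency graph.
BUILD_DEPS = {
    "shr_nodeps": set(),                                     # group 1
    "mct": set(),
    "esmf": set(),
    "shr_after_mct_esmf": {"shr_nodeps", "mct", "esmf"},     # group 2
    "driver_shr": {"shr_after_mct_esmf"},                    # driver_cpl/shr
    "shr_rest": {"driver_shr"},                              # group 3
}


def build_order(deps):
    """Return one valid build order; raises graphlib.CycleError on a cycle."""
    return list(TopologicalSorter(deps).static_order())


order = build_order(BUILD_DEPS)
```

A cycle in the dependency table raises an error immediately, which is a stricter check than a parallel build that happens to succeed.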

@gold2718

gold2718 commented Feb 16, 2017

> To me, "beginner friendly" roughly equates with intuitive, which helps all of us

I have yet to see a software system which is intuitive to all users and/or developers. However, I have also yet to meet anyone who thinks the current CIME directory structure is intuitive :)
I think @rljacob's point is important that we should be asking users for a minimal amount of CIME directory navigation (e.g., <root>/scripts is good, <root>/share/csm_share/test/old_unit_testers is not so good).
And why do we have old-doc and old_doc?

@mvertens
Contributor

mvertens commented Feb 16, 2017 via email

@gold2718

@agsalin, while your testing method is not bulletproof, your distinction between classes of files sounds a lot like @jedwards4b's distinction in #852. This suggests the beginning of a new structure:
source/case -- currently utils/CIME (need separate decision on location of test code)
source/utility_code -- stuff with no dependencies outside of this directory (your group 1)
source/library_support -- files that need to be built after mct, esmf, etc. (your group 2)
source/model_support -- files that need to be built after the driver (your group 3)
source/libraries/esmf_wrf_timemgr
source/libraries/shr_RandNum
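
As a rough sketch of how files might be sorted into the first three dependency tiers above: the directory names come from the list, but the classification rule and the dependency labels ("mct", "esmf", "driver") are assumptions made for this example.

```python
# Directory names are taken from the proposed structure above; the rule
# and the dependency labels are assumptions made for this sketch.
def target_directory(depends_on):
    """Pick a proposed source/ subdirectory from what a file depends on.

    depends_on: set of labels such as {"mct"}, {"esmf"}, or {"driver"}.
    """
    if "driver" in depends_on:
        return "source/model_support"      # must build after the driver
    if depends_on & {"mct", "esmf"}:
        return "source/library_support"    # must build after mct, esmf, etc.
    if not depends_on:
        return "source/utility_code"       # no outside dependencies
    raise ValueError(f"unclassified dependencies: {sorted(depends_on)}")
```

Checking the driver first means a file that needs both mct and the driver lands in the latest tier it requires.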

@agsalin
Contributor Author

agsalin commented Feb 17, 2017

@mvertens : Thanks. My timing is based on my attempts to navigate the code, and on realizing that I will (hopefully) become familiar with it and no longer be useful as a fresh set of eyes. To be clear, I am not implying that CIME is particularly bad for a code that has had so much development and usage, just that this type of exercise is useful periodically for any code. I realize that this is bad timing for the CESM2 release but may be good timing for the documentation effort. I certainly agree that we can postpone acting on anything that is disruptive to the CESM release.

One way of going forward would be to create github issues for each independent decision, where we can come to consensus (or not) and decide whether to implement it now or to wait. I'd open ones for (1) grouping 4 fortran dirs under a single src subdir, (2) moving utils/ contents, (3) Is cime/cime_config the right name? (4) re-org of fortran dirs: csm_share, externals (with implementation postponed) -- e.g., @gold2718 comment just above.

@jedwards4b, @jgfouca : do we have your limited buy-in to pursue some clean-up along these lines?

@jedwards4b
Contributor

I'm okay with it as long as it's done in small focused increments with a lot of testing (including full model tests) between each one.

@jgfouca
Contributor

jgfouca commented Feb 17, 2017

I'm OK with it too.

agsalin added a commit that referenced this issue Feb 22, 2017
As discussed in issues #1178 and #1150, the
four model source directories are moved to
a CIME/src directory. This is intended to help
people navigate cime more easily.

The changes are mainly in xml config:
  $CIMEROOT/externals -> $CIMEROOT/src/externals
repeated for driver_cpl, share, components.

Also, many changes in python to add src to the
relative directory path.

scripts_regression_tests all pass
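
The path change described in this commit message can be illustrated with a short Python sketch. The four directory names come from the message; the regex-based rewrite and the `add_src_prefix` helper are assumptions for illustration, not how the actual commit was made.

```python
import re

# The four directories named in the commit message above.
MOVED_DIRS = ("externals", "driver_cpl", "share", "components")

# Match $CIMEROOT/<dir> only when <dir> is a whole path component, so that
# a name like "share_extra" or an already-moved "src/share" is left alone.
_PATTERN = re.compile(
    r"\$CIMEROOT/(" + "|".join(MOVED_DIRS) + r")(?![\w-])"
)


def add_src_prefix(text):
    """Rewrite $CIMEROOT/<dir> to $CIMEROOT/src/<dir> for the moved dirs."""
    return _PATTERN.sub(r"$CIMEROOT/src/\1", text)
```

Because paths that already contain src/ no longer match the pattern, running the helper twice over the same text gives the same result as running it once.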
@rljacob rljacob closed this as completed Mar 23, 2017