
[QUESTION] Problems with multirun option in GCHP 13 #136

Closed
chen-yuy opened this issue Aug 25, 2021 · 10 comments
Labels
category: Question (Further information is requested)

Comments

@chen-yuy

Hello,

I am working on a multirun test in GCHP (v13.0) to work around a problem similar to https://github.com/geoschem/geos-chem/issues/566#issue-759764207. My problem appeared because I turned on the METEOROLOGY switch in the main switches section of HEMCO_Config.rc, and I still don't know why (this is my single-run log file):
gchp.log.201901_met.txt

Besides, inspired by that discussion, I tried the multirun option of GCHP. Here is my new issue: the cap_restart file is not created after the first run, and I don't know why. The following are my multirun log files:
run_2205236.out.txt
run_2205236.err.txt
and my run script, gchp.multirun.run.txt.

Please let me know if you need any other files. Many thanks!

@chen-yuy chen-yuy added the category: Question (Further information is requested) label Aug 25, 2021
@yantosca yantosca changed the title [QUESTION] → [QUESTION] Problems with multirun option in GCHP 13 Aug 25, 2021
@LiamBindle
Contributor

LiamBindle commented Aug 26, 2021

Hi @chen-yuy, thanks for opening this.

1. The cause of your crash

The root error in your simulation with METEOROLOGY is happening here. It looks like the issue is an unresolved CONV_DEPTH import.

IIUC the METEOROLOGY switch is for FlexGrid, which is a GC-Classic-only feature. This switch should be false in GCHP. Does switching it back to false fix the issue? If you want a nested-grid-like capability, GCHP's stretched-grid feature is what you're looking for.


2. How to run multisegment simulations

Regarding the multirun option, that script is being retired. It is fairly easy to implement your own multisegment simulation, though.

The setup you need is:

  1. Set Duration in runConfig.sh to the length of a segment (e.g., Duration="00000007 000000" for 7-day segments)
  2. In runConfig.sh set Periodic_Checkpoint=ON and Checkpoint_Freq equal to your segment duration.
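In runConfig.sh, those two settings would look roughly like this (a sketch for 7-day segments; exact variable formatting may differ between GCHP versions):

```shell
# Sketch of the runConfig.sh settings described above (7-day segments).
# Exact variable names/formatting may differ between GCHP versions.
Duration="00000007 000000"         # segment length: 7 days (YYYYMMDD hhmmss)
Periodic_Checkpoint="ON"           # write periodic checkpoint files
Checkpoint_Freq="00000007 000000"  # checkpoint at the end of each segment
```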

At this point you should run ./runConfig.sh to write the config file updates. Run the first segment. From this point forward you should not execute ./runConfig.sh again as that will overwrite the manual config I'm going to describe next.

For subsequent segments (segment 2 onwards), make the following updates before you start the simulation:

  1. At the start of each segment, cap_restart should be the start time for the segment (e.g., 20190114 000000 for a segment starting on January 14th, 2019)
  2. At the start of each segment GCHPchem_INTERNAL_RESTART_FILE in GCHP.rc should be the appropriate gcchem*.nc4 file (the checkpoint file generated at the end of the last segment)
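The two per-segment updates above could be scripted like this (a sketch; the segment date and checkpoint file name are hypothetical, and a stub GCHP.rc is created here only so the example is self-contained):

```shell
SEG_START="20190114 000000"                                 # hypothetical start time of the new segment
CHECKPOINT="gcchem_internal_checkpoint.20190114_0000z.nc4" # hypothetical checkpoint from the last segment

# Stub GCHP.rc so the sed below has a line to edit (illustration only;
# in a real run directory this file already exists).
printf 'GCHPchem_INTERNAL_RESTART_FILE: +initial_GEOSChem_rst.nc4\n' > GCHP.rc

# 1. Write the segment start time to cap_restart.
echo "${SEG_START}" > cap_restart

# 2. Point GCHPchem_INTERNAL_RESTART_FILE in GCHP.rc at the last checkpoint.
sed -i "s|^GCHPchem_INTERNAL_RESTART_FILE:.*|GCHPchem_INTERNAL_RESTART_FILE: +${CHECKPOINT}|" GCHP.rc
```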

The "Auto-requeuing C360 simulation" is a good example of how to automate this. For segment 1 the CONTINUE_SEM file doesn't exist so runConfig.sh is executed. For subsequent segments the start date is written to cap_restart and the appropriate checkpoint file is written to GCHP.rc. Let me know if you need a hand setting this up. Hopefully we'll have a more automated solution in the future.

@chen-yuy
Author

Hello, @LiamBindle Thanks for your reply.

I found the useful information about the METEOROLOGY switch posted by @yantosca. I highly recommend adding a note about this to the HEMCO_Config.rc file so that users coming from GC-Classic can avoid this bug.

Also, in HEMCO_Config.rc, I realized that the NON-EMISSIONS DATA section contains OLSON_LANDMAP and YUAN_MODIS_LAI switches that cannot be set to true. I didn't realize that until I hit a crash, and there may be more such unsupported switches that I don't know about.
Maybe adding more information to the HEMCO_Config.rc file would make GCHP easier to use.

Now my single (non-multirun) simulation is working, so I will try the multirun later.

Many thanks!

@chen-yuy
Author

Hello @LiamBindle

I have another question about HEMCO_Config.rc.
I found that I couldn't set the RESTART FIELDS entries (GC_RESTART, STATE_PSC, and HEMCO_RESTART) to true for my first run. After creating some checkpoints, I wanted to use a restart file to continue my simulation with those RESTART FIELDS set to true, but that also crashes. I feel quite confused about this. Should I set the RESTART FIELDS switches to true for the continuation run? Or, if I set all RESTART FIELDS to false, will the continuation simulation still read the GC_RESTART and HEMCO_RESTART contents from the previous restart files?

Many thanks,
Yuyang

@lizziel
Contributor

lizziel commented Aug 27, 2021

Hi Yuyang, all GEOS-Chem and HEMCO restart variables are included in the GCHP checkpoint file. The restart section entries in HEMCO_Config.rc are set false by default in GCHP since they are not used.

I agree including information about what is relevant to GCHP versus GC-Classic in the HEMCO_Config.rc comments would be helpful to users. It's a simple update and we will do this for an upcoming version. Even more helpful will be splitting the HEMCO config file in two, with one file for masking/scaling info (used by both GCHP and GC-Classic) and the other for file I/O (GC-Classic only). This is a long-term goal of the support team.

@chen-yuy
Author

Hi @lizziel, thank you for your replies.
Does that mean that whatever I do with the RESTART FIELDS section, my GCHP simulation will always read the checkpoint file, including the GC_RESTART and HEMCO_RESTART contents?

@lizziel
Contributor

lizziel commented Aug 30, 2021

You should keep the RESTART FIELDS section of HEMCO_Config.rc set to false. To read in all restart variables, you only need to specify the restart file to use in runConfig.sh. When that script is executed at run time, it updates a field in the GCHP.rc config file, which MAPL uses within GCHP to find the restart file to read.

If in doubt about whether you are reading the correct restart file, or if you want to check whether any restart variables were missing upon read, you can find this information in the GCHP log file generated at run time. Search for the string Character Resource Parameter: GCHPchem_INTERNAL_RESTART_FILE. The line containing this string shows the restart filename that GCHP is reading. Below it there may be a list of variables next to the string Bootstrapping Variable; those variables were not found in the restart file.
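As a sketch, the check could look like this (the sample log content written below is illustrative so the example is self-contained; point the greps at your actual run-time log instead):

```shell
# Create a tiny sample log so the greps below have input (illustration only;
# in practice, use the log file produced by your GCHP run).
cat > gchp.log <<'EOF'
 Character Resource Parameter: GCHPchem_INTERNAL_RESTART_FILE:+initial_GEOSChem_rst.c24_fullchem.nc
 Bootstrapping Variable: STATE_PSC in initial_GEOSChem_rst.c24_fullchem.nc
 Bootstrapping Variable: T_DAVG in initial_GEOSChem_rst.c24_fullchem.nc
EOF

# Which restart file did GCHP read?
grep "GCHPchem_INTERNAL_RESTART_FILE" gchp.log

# Which variables were missing from the restart file (bootstrapped)?
grep "Bootstrapping Variable" gchp.log
```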

For example, if you download and use one of the restart files we provide for GCHP and do not alter the default runConfig.sh, then you will see something like this in your first GCHP run:

Character Resource Parameter: GCHPchem_INTERNAL_RESTART_FILE:+initial_GEOSChem_rst.c24_fullchem.nc
WARNING: use of '+' or '-' in the restart name 'initial_GEOSChem_rst.c24_fullchem.nc' allows bootstrapping!
Using parallel NetCDF for file: initial_GEOSChem_rst.c24_fullchem.nc
Bootstrapping Variable: ARCHV_DRY_TOTN in initial_GEOSChem_rst.c24_fullchem.nc
Bootstrapping Variable: ARCHV_WET_TOTN in initial_GEOSChem_rst.c24_fullchem.nc
Bootstrapping Variable: AREA in initial_GEOSChem_rst.c24_fullchem.nc
Bootstrapping Variable: DEP_RESERVOIR in initial_GEOSChem_rst.c24_fullchem.nc
Bootstrapping Variable: DRYPERIOD in initial_GEOSChem_rst.c24_fullchem.nc
Bootstrapping Variable: GCCTROPP in initial_GEOSChem_rst.c24_fullchem.nc
Bootstrapping Variable: GWET_PREV in initial_GEOSChem_rst.c24_fullchem.nc
Bootstrapping Variable: IsorropBisulfate in initial_GEOSChem_rst.c24_fullchem.nc
Bootstrapping Variable: IsorropHplusFine in initial_GEOSChem_rst.c24_fullchem.nc
Bootstrapping Variable: IsorropNitrateFine in initial_GEOSChem_rst.c24_fullchem.nc
Bootstrapping Variable: IsorropSulfate in initial_GEOSChem_rst.c24_fullchem.nc
Bootstrapping Variable: LAI_PREVDAY in initial_GEOSChem_rst.c24_fullchem.nc
Bootstrapping Variable: PARDF_DAVG in initial_GEOSChem_rst.c24_fullchem.nc
Bootstrapping Variable: PARDR_DAVG in initial_GEOSChem_rst.c24_fullchem.nc
Bootstrapping Variable: PFACTOR in initial_GEOSChem_rst.c24_fullchem.nc
Bootstrapping Variable: STATE_PSC in initial_GEOSChem_rst.c24_fullchem.nc
Bootstrapping Variable: T_DAVG in initial_GEOSChem_rst.c24_fullchem.nc
Bootstrapping Variable: T_PREVDAY in initial_GEOSChem_rst.c24_fullchem.nc

The variables listed above are the HEMCO restart variables. As with GC-Classic, we do not provide HEMCO restart variables for your first GCHP run. However, your output restart (checkpoint) file will contain these variables. You can verify this by looking for them in your output file, or by using that restart file for a new run and checking the log: the message that these variables were bootstrapped should no longer appear, since they were found in the file upon read.

I hope this clarifies things!

@chen-yuy
Author

Hi Lizzie,
Thanks for the detailed explanation; I fully understand now!
Thank you so much!

@lizziel
Contributor

lizziel commented Aug 31, 2021

You're welcome!

@lizziel lizziel closed this as completed Aug 31, 2021
@mtsivlidou

Hello,
I would like to ask a question about restarting a simulation without using the multirun option (GCHP v13.2).
I ran a simulation for 2 months (Jan 1 to Mar 1) and got 2 monthly time-averaged diagnostics and 2 checkpoints. Is cap_restart really necessary to restart the run?
I was thinking of changing: (i) the start date to March, and (ii) the initial restart file in the run script to the last checkpoint of the previous run. Then I would source runConfig.sh, etc., as in the first run.

(I am confused because the multirun option had a condition checking whether cap_restart exists, for example to update the dates automatically. But in my case I already know the new start date and the restart file I want to use.)
Thank you
Maria

@chen-yuy
Author

@mtsivlidou
In my opinion, if you don't use the multirun option, cap_restart is not needed and should be deleted before the next run.
If you want to use the restart file for the following March simulation, you should change your INITIAL_RESTART path to your March 1 checkpoint file (renaming the checkpoint so its name includes the string 'Restart' and the date), and also update your Start_Time and End_Time.
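A minimal sketch of those steps (all file names here are hypothetical, and a stand-in checkpoint file is created so the example is self-contained):

```shell
# cap_restart is only needed by the multirun script; remove it before the next run.
rm -f cap_restart

# Stand-in for the real March 1 checkpoint file (illustration only).
touch gcchem_internal_checkpoint.nc4

# Rename the checkpoint so its name includes 'Restart' and the date (hypothetical naming).
mv gcchem_internal_checkpoint.nc4 GEOSChem.Restart.20190301_0000z.nc4

# Then point INITIAL_RESTART in the run script at the renamed file and
# update Start_Time / End_Time before sourcing runConfig.sh again.
```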

Hope this will help.

Best,
Yuyang
