[QUESTION] Multi-run simulations #198
Hi @lorenzocostantino, thanks for this question too. The primary objective of multi-segment runs is to break the simulation up into a series of consecutive jobs for your scheduler. Your script loops over the days, so presumably it would be submitted to your scheduler as a single job. In that case, there isn't a reason to run the simulation as a multi-segment run; you would be better off turning periodic checkpointing on and setting the duration equal to the total length of your simulation (i.e., a single segment). Assuming you do actually want to run your simulation as a series of multi-segment runs, here are a few notes for your consideration:
With these two points in mind, you might consider a configuration like this:
and then a run script of
#SCHEDULER_DIRECTIVES
#...
#SCHEDULER_DIRECTIVES
# Print the name of the most recent checkpoint file (names sort by timestamp)
function last_checkpoint() {
    ls -1 gcchem_internal_checkpoint*.nc4 | tail -n 1
}
# Print the YYYYMMDD date embedded in the most recent checkpoint's file name
function last_checkpoint_date() {
    last_checkpoint | sed 's/gcchem_internal_checkpoint.\([12][0-9][0-9][0-9][0-1][0-9][0123][0-9]\).*/\1/'
}
# Configure starting/resuming the simulation
if ! ls -1 gcchem_internal_checkpoint*.nc4 &> /dev/null ; then
    # no checkpoint file exists, therefore, initialize the start of a simulation
    ./runConfig.sh
    RESTART_DATE=${Start_Time_Date}
else
    # a timestamped checkpoint file exists, therefore, resume from the most recent one
    RESTART_FILE=$(last_checkpoint)
    RESTART_DATE=$(last_checkpoint_date)
    echo "$RESTART_DATE 000000" > cap_restart
    sed -i "s/GCHPchem_INTERNAL_RESTART_FILE: .*/GCHPchem_INTERNAL_RESTART_FILE: $RESTART_FILE/g" GCHP.rc
fi
mpirun -np $PAR_TOTAL_CORES --use-hwthread-cpus ./gchp &>> out.${RESTART_DATE}-segment.log
This script will resume your simulation from the most recent checkpoint file. The idea is that you would submit this job to your scheduler multiple times, and use job dependencies to get them to run one after the other (with LSF this is the …).
Here are my responses to the other questions you had
I thought that overwriting
No, you can omit it. IIRC, the …
Here are some extra things for your consideration. For simulations like this, it's often easiest to test your simulation configuration at a low resolution like c24 or c48. Once your simulation is working well, increase the resolution and resources. If you haven't already, it would be a good idea to consider writing a custom collection for …
Hope this is helpful. Let me know if you have any questions or if I've misunderstood anything.
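The job chaining described above (submitting the same run script several times, each job waiting on the previous one) could be sketched roughly as follows. This is only an illustration and assumes a Slurm scheduler, which is not the scheduler named in this thread; `sbatch --parsable` and `--dependency=afterok:<jobid>` are standard Slurm options, while the function name `submit_chain`, the `DRY_RUN` switch, and the script name `gchp.run` are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: chain N submissions of the same run script with Slurm.
# submit_chain, DRY_RUN and gchp.run are hypothetical names for illustration.

submit_chain() {
    local script=$1 n=$2 prev="" jobid i
    for ((i = 1; i <= n; i++)); do
        local cmd=(sbatch --parsable)
        # From the second job on, start only after the previous job exits with code 0
        [[ -n $prev ]] && cmd+=(--dependency="afterok:$prev")
        cmd+=("$script")
        if [[ ${DRY_RUN:-0} == 1 ]]; then
            # Dry run: print the command instead of submitting it
            echo "${cmd[*]}"
            jobid="JOB$i"    # placeholder job ID, not a real Slurm ID
        else
            jobid=$("${cmd[@]}") || return 1
            jobid=${jobid%%;*}    # --parsable may print "jobid;cluster"
        fi
        prev=$jobid
    done
}

# Example (prints the commands without submitting anything):
#   DRY_RUN=1 submit_chain gchp.run 3
```

Because each job depends on the previous one finishing successfully, a failed segment stops the rest of the chain from starting, which avoids running later segments from a stale checkpoint.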
Hi @LiamBindle, thanks for your explanations and insights. Very clear and helpful.
and then GCHP overwrites the previous
otherwise, GC writes it at the beginning of the simulation and the checkpoint file will not include all following simulated hours.
(I didn't add my new collection into the runConfig script and everything seems to work fine, but I would like to be sure.)
Sorry @lorenzocostantino, this fell through the cracks!
Edit: Regarding 4, @yantosca says that the PM10 diagnostic will be available in 13.4.
Oh, I just noticed that #162 wasn't fixed in 13.3. This was a bug that caused the timestamp in the filename of diagnostic files to be incorrect (the timestamp in the filename was the wrong day, but the …
Hi @LiamBindle,
I use GCHP 13.3.4 (at C180) and I want to launch multi-run simulations (ideally 365 daily runs).
I saw that you have already shown how to perform a multi-run simulation (e.g., #136 and others), but answers sometimes change as model versions evolve. I also checked the c360_requeuing.sh script. Still, it is not completely clear to me how to launch multiple chained runs with GCHP 13.3.4, since I would like to do something somewhat different from what c360_requeuing.sh does.
Let's say I have
I see that this model version automatically updates the cap_restart file at the end of each segment.
If Periodic_Checkpoint=OFF, at the end of each segment only the "gcchem_internal_checkpoint" file is written, overwriting the "gcchem_internal_checkpoint" file of the previous run.
If I am not wrong, we can use this file as GCHPchem_INTERNAL_RESTART_FILE for segment 2 and onward, updating GCHP.rc (and then re-launch gchp.sh) at line 70
with
GCHPchem_INTERNAL_RESTART_FILE: gcchem_internal_checkpoint
PS: should I use the "+" before the file name?
It is not completely clear to me what "+" does, apart from producing the following message in the standard output:
WARNING: use of '+' or '-' in the restart name 'initial_GEOSChem_rst.c24_fullchem.nc' allows bootstrapping!
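For reference, the GCHP.rc update described above (pointing `GCHPchem_INTERNAL_RESTART_FILE` at the checkpoint from the previous segment) might be scripted along these lines. This is a minimal sketch using the same kind of `sed` substitution as the run script earlier on this page, written with a temporary file instead of `sed -i` so it does not depend on GNU sed. The function name `set_restart_file` and the `OTHER_SETTING` line in the demo file are hypothetical, and the sketch does not address whether a "+" prefix is needed.

```shell
#!/usr/bin/env bash
# Sketch only: rewrite the restart-file entry in a GCHP.rc-style file.
# set_restart_file is a hypothetical helper name.

set_restart_file() {
    local rc_file=$1 restart_file=$2 tmp
    tmp=$(mktemp) || return 1
    # Replace everything after the key with the new file name.
    # "|" is used as the sed delimiter so that paths containing "/" also work.
    sed "s|GCHPchem_INTERNAL_RESTART_FILE: .*|GCHPchem_INTERNAL_RESTART_FILE: ${restart_file}|" \
        "$rc_file" > "$tmp" && mv "$tmp" "$rc_file"
}

# Demonstration on a throwaway file rather than a real run directory.
# OTHER_SETTING is a placeholder line, not a real GCHP.rc key.
demo_rc=$(mktemp)
printf '%s\n' \
    'OTHER_SETTING: 1' \
    'GCHPchem_INTERNAL_RESTART_FILE: initial_GEOSChem_rst.c24_fullchem.nc' > "$demo_rc"

set_restart_file "$demo_rc" gcchem_internal_checkpoint
grep GCHPchem_INTERNAL_RESTART_FILE "$demo_rc"
```

Only the line holding the key is changed; every other line in the file is copied through untouched.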
Within a script, I would do something like:
Is that correct?
For consistency with other model outputs, I would also like to output daily files with hourly statistics (one file per day, with 24 time steps).
To do that, is it correct to set
?
Thank you in advance for your help.