Add Preprocessing Scripts and a README to the sponges tools #96

Merged

67 changes: 67 additions & 0 deletions tools/sponge/README.md
@@ -0,0 +1,67 @@
## Sponges

This folder contains tools for creating temperature and salinity sponges for use in MOM6 based on the GLORYS reanalysis.

## preproc_scripts

Scripts to subset, average, and fill the GLORYS data on the uda to preprocess it for the python scripts.

# Regional Subsetting

Regional subsets of the GLORYS reanalysis are created using `ncks` in `get_so_monthly.csh` and `get_thetao_monthly.csh`.
You should adjust the regional subset in the script to match the region of interest.

For example:
```
ncks -d latitude,40.,90. -d longitude,0.,360. filein.nc fileout.nc
```
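
As a rough sketch of how the subset bounds could be parameterized, the snippet below prints an `ncks` command for a given latitude/longitude box. The helper names `as_coord` and `subset_cmd` are hypothetical and not part of this PR. NCO interprets hyperslab bounds with a decimal point as coordinate values and bare integers as dimension indices, so the sketch appends a trailing `.` to integer input.

```sh
#!/bin/sh
# Hypothetical helpers (not part of the PR): build the ncks subsetting command.

# Append a trailing "." to integer bounds so NCO reads them as coordinate
# values rather than dimension indices.
as_coord() {
    case $1 in
        *.*) echo "$1" ;;
        *)   echo "$1." ;;
    esac
}

# Print (do not run) the ncks command for a latitude/longitude box.
subset_cmd() {
    lat_min=$(as_coord "$1"); lat_max=$(as_coord "$2")
    lon_min=$(as_coord "$3"); lon_max=$(as_coord "$4")
    echo "ncks -d latitude,${lat_min},${lat_max} -d longitude,${lon_min},${lon_max} $5 $6"
}

subset_cmd 40 90 0 360 filein.nc fileout.nc
```

Printing the command rather than running it makes it easy to review the bounds before pasting them into the scripts.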

# Scripts

As written, these preprocessing scripts must be run in three stages.
1. First, subset the temperature and salinity and create monthly averages
```
sbatch get_thetao_monthly.csh <YEAR> <MONTH>
sbatch get_so_monthly.csh <YEAR> <MONTH>
```

2. Next, fill the data
```
sbatch fill_glorys_nn_monthly.csh <YEAR> <MONTH>
```

3. Finally, once the filled data for every month in a given yeas have been created, the merge script can be used.
Collaborator

change to "every month in a given year has been created"

Contributor

@theresa-morrison, I think this PR is almost ready, except for a small comment from @uwagura that hasn’t been addressed yet.

Contributor Author

yeah, I was waiting to see if more typos would be found before submitting a commit. I will take care of this!

```
sbatch merge_so_thetao_year.csh <YEAR>
```

This should produce data that is compatible with `write_nudging_data.py`.
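
The three stages can be listed for a whole year with a small dry-run helper. This is only a sketch: `print_stage` is a hypothetical name, the script names are taken from the `.csh` files in this PR, and nothing is submitted. Each stage has to finish before the next one is submitted. Months are passed unpadded because the scripts pad them with `printf "%02d"`, and a value such as `08` would likely be rejected there as an invalid octal number.

```sh
#!/bin/sh
# Dry-run sketch (hypothetical helper, not part of the PR): print the sbatch
# commands for one stage of one year without submitting anything.
print_stage() {
    stage=$1; year=$2
    case $stage in
        1)  # subset and monthly-average temperature and salinity
            for month in 1 2 3 4 5 6 7 8 9 10 11 12; do
                echo "sbatch get_thetao_monthly.csh $year $month"
                echo "sbatch get_so_monthly.csh $year $month"
            done ;;
        2)  # fill missing data in each monthly file
            for month in 1 2 3 4 5 6 7 8 9 10 11 12; do
                echo "sbatch fill_glorys_nn_monthly.csh $year $month"
            done ;;
        3)  # merge the twelve filled months into one yearly file
            echo "sbatch merge_so_thetao_year.csh $year" ;;
        *)  echo "unknown stage: $stage" >&2
            return 1 ;;
    esac
}

# Example: list the stage 2 jobs for 1993.
print_stage 2 1993
```

Piping the output of one stage to `sh` would submit it; waiting for those jobs to complete before moving to the next stage is left to the user.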

## Using these files in MOM6

To use the sponges generated by these scripts in MOM6 we reccomend the following settings:
Contributor

OOPS, It's my bad but I found another one: "recommend"......

```
#override SPONGE = True
#override SPONGE_UV = False
#override SPONGE_DAMPING_FILE = "damping_full_t_90d.nc"
#override SPONGE_IDAMP_VAR = "Idamp"
#override SPONGE_STATE_FILE = "glorys_sponge_monthly_bnd_${fyear}.nc"
#override SPONGE_PTEMP_VAR = "thetao"
#override SPONGE_SALT_VAR = "so"
#override SPONGE_ETA_VAR = "depth"
#override INTERPOLATE_SPONGE_TIME_SPACE = True
#override SPONGE_DATA_ONGRID = True
```
These should be added to `MOM_override` in the experiment of the xml.

In the xml, add the paths to the files needed for sponges so that they are included in the `INPUT` directory.
```
<!-- Two new files for the nudging: -->
<dataFile label="input" target="INPUT/" chksum="" size="" timestamp="">
<dataSource site="ncrc">$(YOUR_PATH)/damping_full_t_90d.nc</dataSource>
</dataFile>
<dataFile label="input" target="INPUT/" chksum="" size="" timestamp="">
<dataSource site="ncrc">$(YOUR_PATH)/glorys_sponge_monthly_${fyear}.nc</dataSource>
</dataFile>
```
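
Before a run, it may be worth confirming that the staged file names match the ones given in `MOM_override`. Note that the two snippets above spell the state file differently (`glorys_sponge_monthly_bnd_${fyear}.nc` versus `glorys_sponge_monthly_${fyear}.nc`). Whichever is intended, the names presumably need to agree. The sketch below uses a hypothetical helper, `missing_inputs`, which reports any named file that is absent from a directory. The year 1993 and the temporary directory are placeholders.

```sh
#!/bin/sh
# Hypothetical helper (not part of the PR): report which of the named files
# are missing from a directory. Returns non-zero if any are missing.
missing_inputs() {
    dir=$1; shift
    rc=0
    for name in "$@"; do
        if [ ! -f "$dir/$name" ]; then
            echo "missing: $dir/$name"
            rc=1
        fi
    done
    return $rc
}

# Example with a throwaway directory standing in for INPUT/.
demo=$(mktemp -d)
touch "$demo/damping_full_t_90d.nc"
missing_inputs "$demo" damping_full_t_90d.nc glorys_sponge_monthly_bnd_1993.nc || true
rm -rf "$demo"
```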

39 changes: 39 additions & 0 deletions tools/sponge/preproc_scripts/fill_glorys_nn_monthly.csh
@@ -0,0 +1,39 @@
#!/bin/tcsh
#SBATCH --ntasks=1
#SBATCH --job-name=fill_glorys_arctic
#SBATCH --time=2880
#SBATCH --partition=batch

# Usage: sbatch fill_glorys_nn_monthly.csh <YEAR> <MONTH>
# Original Author: Andrew Ross, modified by Theresa Morrison

module load cdo
module load nco/5.0.1
module load gcp

set year=$1
set month=`printf "%02d" $2`

set apath='/archive/Theresa.Morrison/datasets/glorys/GLOBAL_MULTIYEAR_PHY_001_030/monthly'

# Regionally-slice and convert daily to monthly GLORYS reanalysis on archive beforehand.

# dmget all of the files for this month from archive.
dmget ${apath}/so/GLORYS_so_arctic_${year}_${month}.nc ${apath}/thetao/GLORYS_thetao_arctic_${year}_${month}.nc
Contributor Author

@andrew-c-ross I think this combines the dmget. Nothing is being dmget in my testing, so I'm not sure if it is working.

Contributor Author

update: now working, it is writing over my files as expected.


# copy from archive to vftmp for speed?
#gcp /archive/acr/datasets/glorys/GLOBAL_MULTIYEAR_PHY_001_030/daily/GLORYS_REANALYSIS_${year}-${month}-*.nc $TMPDIR
gcp /archive/tnm/datasets/glorys/GLOBAL_MULTIYEAR_PHY_001_030/monthly/so/GLORYS_so_arctic_${year}_${month}.nc $TMPDIR
gcp /archive/tnm/datasets/glorys/GLOBAL_MULTIYEAR_PHY_001_030/monthly/thetao/GLORYS_thetao_arctic_${year}_${month}.nc $TMPDIR

# create a directory to store the filled files.
mkdir $TMPDIR/filled

# find the monthly file for each variable,
# use cdo setmisstonn to fill the missing data with the nearest neighbor,
# and then ncks to compress the resulting file.
find ${TMPDIR}/GLORYS_so_arctic_${year}_${month}.nc -type f -exec sh -c 'file="$1"; filename="${file##*/}"; cdo setmisstonn "$1" "${TMPDIR}/filled/${filename}"; ncks -4 -L 5 "${TMPDIR}/filled/${filename}" -O "${TMPDIR}/filled/${filename}"' find-sh {} \;
find ${TMPDIR}/GLORYS_thetao_arctic_${year}_${month}.nc -type f -exec sh -c 'file="$1"; filename="${file##*/}"; cdo setmisstonn "$1" "${TMPDIR}/filled/${filename}"; ncks -4 -L 5 "${TMPDIR}/filled/${filename}" -O "${TMPDIR}/filled/${filename}"' find-sh {} \;

# copy the filled data for this month to /work.
gcp $TMPDIR/filled/*.nc /work/tnm/datasets/glorys/GLOBAL_MULTIYEAR_PHY_001_030/monthly/filled
29 changes: 29 additions & 0 deletions tools/sponge/preproc_scripts/get_so_monthly.csh
@@ -0,0 +1,29 @@
#!/bin/tcsh
#SBATCH --ntasks=1
#SBATCH --job-name=fill_glorys_arctic
#SBATCH --time=2880
#SBATCH --partition=batch

# Usage: sbatch get_so_monthly.csh <YEAR> <MONTH>

module load cdo
module load nco/5.0.1
module load gcp

set year=$1
set month=`printf "%02d" $2`
set var='so'
set apath='/archive/Theresa.Morrison/datasets/glorys/GLOBAL_MULTIYEAR_PHY_001_030/monthly/'${var}
# create a directory to store the daily subset files.
mkdir $TMPDIR/${var}_${year}_${month}

set day=1
foreach filename (/uda/Global_Ocean_Physics_Reanalysis/global/daily/${var}/${year}/${var}_mercatorglorys12v1_gl12_mean_${year}${month}*.nc)
echo $filename
set short_name=${var}'_arctic_'$day
ncks -d latitude,39.,91. --mk_rec_dmn time $filename $TMPDIR/${var}_${year}_${month}/${short_name}'_bd.nc'
cdo -setreftime,1993-01-01,00:00:00,1day $TMPDIR/${var}_${year}_${month}/${short_name}'_bd.nc' $TMPDIR/${var}_${year}_${month}/${short_name}'.nc'
set day = `expr $day + 1`
echo $day
end
ncra -O $TMPDIR/${var}_${year}_${month}/${var}_arctic_*.nc ${apath}/GLORYS_${var}_arctic_${year}_${month}.nc
29 changes: 29 additions & 0 deletions tools/sponge/preproc_scripts/get_thetao_monthly.csh
@@ -0,0 +1,29 @@
#!/bin/tcsh
#SBATCH --ntasks=1
#SBATCH --job-name=fill_glorys_arctic
#SBATCH --time=2880
#SBATCH --partition=batch

# Usage: sbatch get_thetao_monthly.csh <YEAR> <MONTH>

module load cdo
module load nco/5.0.1
module load gcp

set year=$1
set month=`printf "%02d" $2`
set var='thetao'
set apath='/archive/Theresa.Morrison/datasets/glorys/GLOBAL_MULTIYEAR_PHY_001_030/monthly/'${var}
# create a directory to store the daily subset files.
mkdir $TMPDIR/${var}_${year}_${month}

set day=1
foreach filename (/uda/Global_Ocean_Physics_Reanalysis/global/daily/${var}/${year}/${var}_mercatorglorys12v1_gl12_mean_${year}${month}*.nc)
echo $filename
set short_name=${var}'_arctic_'$day
ncks -d latitude,39.,91. --mk_rec_dmn time $filename $TMPDIR/${var}_${year}_${month}/${short_name}'_bd.nc'
cdo -setreftime,1993-01-01,00:00:00,1day $TMPDIR/${var}_${year}_${month}/${short_name}'_bd.nc' $TMPDIR/${var}_${year}_${month}/${short_name}'.nc'
set day = `expr $day + 1`
echo $day
end
ncra -O $TMPDIR/${var}_${year}_${month}/${var}_arctic_*.nc ${apath}/GLORYS_${var}_arctic_${year}_${month}.nc
26 changes: 26 additions & 0 deletions tools/sponge/preproc_scripts/merge_so_thetao_year.csh
@@ -0,0 +1,26 @@
#!/bin/tcsh
#SBATCH --ntasks=1
#SBATCH --job-name=fill_glorys_arctic
#SBATCH --time=2880
#SBATCH --partition=batch

# Usage: sbatch merge_so_thetao_year.csh <YEAR>

module load cdo
module load nco/5.0.1
module load gcp

set year=$1

set wpath='/work/Theresa.Morrison/datasets/glorys/GLOBAL_MULTIYEAR_PHY_001_030/monthly/filled'

# Concatenate monthly averages into single file
ncrcat -O ${wpath}/GLORYS_thetao_arctic_${year}_*.nc ${wpath}/GLORYS_thetao_arctic_${year}.nc
ncrcat -O ${wpath}/GLORYS_so_arctic_${year}_*.nc ${wpath}/GLORYS_so_arctic_${year}.nc

# Copy salt file to name for final file
cp -f ${wpath}/GLORYS_so_arctic_${year}.nc ${wpath}/GLORYS_arctic_${year}.nc

# Append temperature data to renamed salinity data
ncks -A ${wpath}/GLORYS_thetao_arctic_${year}.nc ${wpath}/GLORYS_arctic_${year}.nc
Contributor

@theresa-morrison, sorry for being picky, but do you think it’s a good idea to reduce redundant file copies and minimize repetitive processing? We could try something like the following:

#!/bin/tcsh
#SBATCH --ntasks=1
#SBATCH --job-name=fill_glorys_arctic
#SBATCH --time=2880
#SBATCH --partition=batch

# Usage: sbatch merge_so_thetao_year.csh <YEAR>

module load cdo
module load nco/5.0.1
module load gcp

set year = $1
set wpath = '/work/Theresa.Morrison/datasets/glorys/GLOBAL_MULTIYEAR_PHY_001_030/monthly/filled'

# Define the file variables for salinity and temperature
set so_file = "${wpath}/GLORYS_so_arctic_${year}.nc"
set thetao_file = "${wpath}/GLORYS_thetao_arctic_${year}.nc"
set final_file = "${wpath}/GLORYS_arctic_${year}.nc"

# Concatenate monthly averages into single files for salinity and temperature
foreach var (so thetao)
    ncrcat -O ${wpath}/GLORYS_${var}_arctic_${year}_*.nc ${wpath}/GLORYS_${var}_arctic_${year}.nc
end

# Append temperature data to salinity file directly without copying
ncks -A ${thetao_file} ${so_file}

# Rename the combined file to final name
mv -f ${so_file} ${final_file}

Contributor Author

I don't mind, I appreciate the suggestions!
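
Both versions of the merge script glob `GLORYS_${var}_arctic_${year}_*.nc`, so a month that was never filled would simply be left out of the concatenation. A pre-merge check along the following lines could catch that. It is a sketch only: `missing_months` is a hypothetical helper, and the path and year in the example are placeholders.

```sh
#!/bin/sh
# Hypothetical pre-merge check (not part of the PR): print the two-digit
# months for which no filled file exists for a given variable and year.
missing_months() {
    wpath=$1; var=$2; year=$3
    for month in 01 02 03 04 05 06 07 08 09 10 11 12; do
        [ -f "$wpath/GLORYS_${var}_arctic_${year}_${month}.nc" ] || echo "$month"
    done
}

# Example: report gaps for both variables (placeholder path and year).
filled_dir=/path/to/monthly/filled
for v in so thetao; do
    gaps=$(missing_months "$filled_dir" "$v" 1993 | tr '\n' ' ')
    [ -z "$gaps" ] || echo "$v 1993 is missing months: $gaps"
done
```

If nothing is printed for either variable, all twelve monthly files are present and the merge can be submitted.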