Landuse data tool #1107

Closed
wants to merge 63 commits into from
d93b785
outline landuse-pft data tool
glemieux Aug 3, 2023
bf5799e
Add test directory and first test file
glemieux Aug 3, 2023
3070aca
Move test notes over to test file
glemieux Aug 3, 2023
7536d76
Merge tag 'sci.1.67.2_api.27.0.0' into tools/landuse_pft-static
glemieux Aug 22, 2023
648bf9f
initial pytest and landuse tool package setup
glemieux Sep 29, 2023
8551229
positive path unit test for static file
glemieux Sep 30, 2023
bc3d9f0
add readme to resources directory
glemieux Oct 1, 2023
9315b84
add landusepft file import test
glemieux Oct 1, 2023
ff66df8
adding more functionality and accompanying unit tests to the landuse …
glemieux Oct 3, 2023
e4a3178
update static luh2 file import and add negative case test
glemieux Oct 3, 2023
15e31bf
add TypeError and tests to both import functions
glemieux Oct 3, 2023
ac1ba2a
Add additional landuse tool tests
glemieux Oct 4, 2023
16f4ec3
update the land use code to case the data to doubel precision
glemieux Oct 4, 2023
0e66b77
added tolerance to the normalization check
glemieux Oct 5, 2023
2946c50
add mock mask input to renorm test and added comments
glemieux Oct 5, 2023
9c4697a
minor comment update
glemieux Oct 5, 2023
6b71dec
Rename landusepft to landusepftmod
glemieux Oct 5, 2023
d5ce4f5
add main landuse script
glemieux Oct 7, 2023
239db4a
Updating main landusepft script
glemieux Oct 10, 2023
6f3dee5
Updating main landuse x pft script
glemieux Oct 11, 2023
3a13c7d
rename pct_ to frac_
glemieux Oct 12, 2023
6a1a71d
add rename of the dimensions for fates landuse pft tool
glemieux Oct 16, 2023
3091854
update renorm landuse function to not use mask
glemieux Oct 26, 2023
f2c0b63
Merge tag 'sci.1.69.0_api.31.0.0' into tools/landuse_pft-static
glemieux Nov 22, 2023
8686e14
update to remove need for mask in renormalization
glemieux Nov 29, 2023
596dd50
flip naming convention for tests to simplify tab completion
glemieux Jan 5, 2024
fe0053d
rename landusepft to landusedata
glemieux Jan 5, 2024
93d8029
move shell scripts to example directory
glemieux Jan 5, 2024
d15165c
make sure tests are using the new package name
glemieux Jan 5, 2024
a25a127
move luh2 file import test into luh2 test module
glemieux Jan 5, 2024
b959e53
start moving all landuse code into a one main CLI
glemieux Jan 5, 2024
9e5e097
start adding tests for the main CLI
glemieux Jan 5, 2024
ce42be5
added negative case for cli choice input
glemieux Jan 5, 2024
dab228d
setting up the logic to call luh2 or lupft
glemieux Jan 6, 2024
8aebc2a
switch main program from choice option to use subcommands
glemieux Jan 9, 2024
e4d350a
transfer the luh2 argument options to landusedata tool main
glemieux Jan 9, 2024
fb162ad
convert some luh2 arguments from optional to positional
glemieux Jan 9, 2024
5f63f51
Pass arguments from landusedata main to luh2 main
glemieux Jan 9, 2024
e2b29a8
pass landusedata main arguments to landusepft main
glemieux Jan 9, 2024
222135d
remove temporary print test function
glemieux Jan 9, 2024
7b732da
remove shebang and out of date script usage comments
glemieux Jan 9, 2024
a6453b8
start refactor of import methods
glemieux Jan 9, 2024
a6a8e5d
refactor landuse x pft regridding loop
glemieux Jan 9, 2024
4e11e9c
minor comment updates
glemieux Jan 9, 2024
827ee7d
add utility module and first utils test
glemieux Jan 9, 2024
8104040
add lat/lon coordinate add to target file prep function
glemieux Jan 9, 2024
a774d83
update both luh2 and lupft modules to both use common regrid target prep
glemieux Jan 10, 2024
4688269
simplify import of target regrid file in luh2
glemieux Jan 10, 2024
5a696b6
combine import and prep of regrid target into one function
glemieux Jan 10, 2024
bdc707a
move regrid functions into its own module
glemieux Jan 10, 2024
b07cb63
update luh2 subcommand to create full luh2 dataset in one command
glemieux Jan 11, 2024
24c1d52
remove old juypter notebook checkpoints
glemieux Jan 11, 2024
630db1a
correct indent alignment for luh2 regrid loop
glemieux Jan 12, 2024
58c7f6c
add missing time truncation arguments
glemieux Jan 13, 2024
0fee760
update landusepft masking calculation
glemieux Jan 17, 2024
0724112
move static luh2 mask function into common utils module
glemieux Jan 18, 2024
89dd5fe
move regrid target masking function to utils module
glemieux Jan 18, 2024
ac85fc1
minor comment removal
glemieux Jan 18, 2024
da022e1
make sure both subcommands use the same static luh2 mask
glemieux Jan 18, 2024
e4940cb
move luh2 static file import function to common utils module
glemieux Jan 18, 2024
70837f9
update mask application for lupft subcommand
glemieux Jan 18, 2024
3471b0d
remove outdated examples
glemieux Feb 7, 2024
26ba074
move readme up to top of project directory
glemieux Feb 7, 2024
55 changes: 55 additions & 0 deletions tools/landusedata/README.md
@@ -0,0 +1,55 @@
# FATES LUH2 data tool README

## Purpose

This tool takes the raw Land Use Harmonization (https://luh.umd.edu/), or LUH2, data files as
input and prepares them for use with FATES. The tool concatenates the various raw data sets into
a single file and provides the ability to regrid the source data resolution to a target
resolution that the user designates. The output data is then usable by FATES, mediated through
a host land model (currently either CTSM or E3SM).

For more information on how FATES utilizes this information see https://github.com/NGEET/fates/pull/1040.

## Installation

This tool requires conda with python3. See https://docs.conda.io/en/latest/miniconda.html#installing
for information on installing conda on your system. To create the conda environment needed to run the tool,
execute the following command:

    conda env create -f conda-luh2.yml

This will create a conda environment named "luh2". To activate this environment run:

    conda activate luh2

For more information on creating conda environments see
https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#creating-an-environment-from-an-environment-yml-file

Note that a subset of host land models (HLMs) and HLM-supported machines are planned to incorporate this tool into the surface dataset workflow.
As such, if you are working on one of these machines, the output from this tool may already be precomputed and available for the grid resolution of interest.

## Usage

After activating the "luh2" environment the tool can be run from the command line with the following minimum required inputs:

    python luh2.py -l <raw-luh2-datafile> -s <luh2-static-datafile> -r <regrid-targetfile> -w <regridder-output> -o <outputfile>

The description of the minimum required input arguments is as follows:
- raw-luh2-datafile: this is one of three raw luh2 datafiles, either states, transitions, or management. This is the data to be regridded and used by FATES.
- luh2-static-datafile: supplementary 0.25 deg resolution static data used in the construction of the raw luh2 datafiles. This is utilized to help set the gridcell mask for the output file.
- regrid-targetfile: host land model surface data file intended to be used in conjunction with the FATES run at a specific grid resolution. This is used as the regridder target resolution.
- regridder-output: the path and filename to write out the regridding weights file or to use an existing regridding weights file.
- outputfile: the path and filename to which the output is written

The tool is intended to be run three times, sequentially, to concatenate the raw states, transitions, and management data into a single file. After the first run of
the tool, a merge option should also be included in the argument list, pointing to the most recent output file. This ensures that the output of the previous regridding run
is merged into the current run and that the previously written regridding weights file is reused (to reduce duplicate computation).
The luh2.sh file in this directory is an example shell script showing how to use the python tool in this sequential manner. Additional help is available
by passing the `--help` option to the command line call.

## Description of directory contents

- luh2.py: main luh2 python script
- luh2mod.py: python module source file for the functions called in luh2.py
- luh2.sh: example bash shell script file demonstrating how to call luh2.py
- conda-luh2.yml: conda environment yaml file which defines the minimum set of package dependencies for luh2.py
20 changes: 20 additions & 0 deletions tools/landusedata/pyproject.toml
@@ -0,0 +1,20 @@
[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"

[project]
name = "landusedata"
version = "0.0.0"

[tool.setuptools]
package-dir = {"" = "src"}

[tool.setuptools.packages.find]
where = ["src"]

# [tool.pytest.ini_options]
# addopts = "-ra -q"
# testpaths = [
# "tests",
# "integration",
# ]
4 changes: 4 additions & 0 deletions tools/landusedata/pytest.ini
@@ -0,0 +1,4 @@
[pytest]
testpaths = tests
addopts = --verbose --durations=10 --color=yes
# addopts = --pep8 --flakes --verbose --durations=10 --color=yes
Empty file.
5 changes: 5 additions & 0 deletions tools/landusedata/src/landusedata/__main__.py
@@ -0,0 +1,5 @@
from landusedata._main import main

# Guard against import time side effects
if __name__ == '__main__':
    raise SystemExit(main())
80 changes: 80 additions & 0 deletions tools/landusedata/src/landusedata/_main.py
@@ -0,0 +1,80 @@
import argparse

from landusedata.luh2 import main as luh2main
from landusedata.landusepft import main as lupftmain

def main(argv=None):

    # Define top level parser
    parser = argparse.ArgumentParser(description="FATES landuse data tool")

    # Target regrid file - is there a nice way to share this between subparsers?
    # parser.add_argument('regrid_target_file', help='target surface data file with desired grid resolution')

    # Define subparser option for luh2 or landuse x pft data tool subcommands
    subparsers = parser.add_subparsers(required=True, title='subcommands',
                                       help='landuse data tool subcommand options')
    luh2_parser = subparsers.add_parser('luh2', prog='luh2',
                                        help='generate landuse harmonization timeseries data output')
    lupft_parser = subparsers.add_parser('lupft', prog='lupft',
                                         help='generate landuse x pft static data map output')

    # Set the default called function for the subparser command
    luh2_parser.set_defaults(func=luh2main)
    lupft_parser.set_defaults(func=lupftmain)

    # LUH2 subparser arguments
    luh2_parser.add_argument('regrid_target_file',
                             help='target surface data file with desired grid resolution')
    luh2_parser.add_argument("luh2_static_file",
                             help = "luh2 static data file")
    luh2_parser.add_argument('luh2_states_file',
                             help = "full path of luh2 raw states file")
    luh2_parser.add_argument('luh2_transitions_file',
                             help = "full path of luh2 raw transitions file")
    luh2_parser.add_argument('luh2_management_file',
                             help = "full path of luh2 raw management file")
    luh2_parser.add_argument("-w", "--regridder_weights",
                             default = 'regridder.nc',
                             help = "filename of regridder weights to save")
    luh2_parser.add_argument("-b","--begin",
                             type = int,
                             default = None,
                             help = "beginning of date range of interest")
    luh2_parser.add_argument("-e","--end",
                             type = int,
                             default = None,
                             help = "ending of date range to slice")
    luh2_parser.add_argument("-o","--output",
                             default = 'LUH2_timeseries.nc',
                             help = "output filename")

    # Landuse x pft subparser arguments
    lupft_parser.add_argument('regrid_target_file',
                              help='target surface data file with desired grid resolution')
    lupft_parser.add_argument('luh2_static_file',
                              help = "luh2 static data file")
    lupft_parser.add_argument('clm_luhforest_file',
                              help = "CLM5_current_luhforest_deg025.nc")
    lupft_parser.add_argument('clm_luhpasture_file',
                              help = "CLM5_current_luhpasture_deg025.nc")
    lupft_parser.add_argument('clm_luhother_file',
                              help = "CLM5_current_luhother_deg025.nc")
    lupft_parser.add_argument('clm_surface_file',
                              help = "CLM5_current_surf_deg025.nc")
    lupft_parser.add_argument("-o","--output",
                              default = 'fates_landuse_pft_map.nc',
                              help = "output filename")

    # Parse the arguments
    args = parser.parse_args(argv)

    # Call the default function for the given subcommand
    args.func(args)

    # Return successful completion
    return 0

# Guard against import time side effects
if __name__ == '__main__':
    raise SystemExit(main())
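The dispatch pattern in `_main.py` (one subparser per subcommand, each binding its handler through `set_defaults(func=...)`) can be exercised in isolation. The sketch below is a reduced illustration, not the tool's code: `run_luh2`, `run_lupft`, `build_parser`, and `dispatch` are hypothetical names, the handlers are stand-ins for `luh2main` and `lupftmain`, and only one positional argument is kept per subcommand. It also passes `dest="subcommand"`, which the diff does not do; that is an assumption made here so that a missing subcommand produces a readable error on older Python versions.

```python
import argparse


def run_luh2(args):
    # Hypothetical stand-in for landusedata.luh2.main
    return "luh2 -> {}".format(args.output)


def run_lupft(args):
    # Hypothetical stand-in for landusedata.landusepft.main
    return "lupft -> {}".format(args.output)


def build_parser():
    parser = argparse.ArgumentParser(description="subcommand dispatch sketch")
    # dest is an addition in this sketch; the diff omits it
    subparsers = parser.add_subparsers(required=True, dest="subcommand",
                                       title="subcommands")

    luh2_parser = subparsers.add_parser("luh2")
    luh2_parser.add_argument("regrid_target_file")
    luh2_parser.add_argument("-o", "--output", default="LUH2_timeseries.nc")
    # Bind the handler that will be called for this subcommand
    luh2_parser.set_defaults(func=run_luh2)

    lupft_parser = subparsers.add_parser("lupft")
    lupft_parser.add_argument("regrid_target_file")
    lupft_parser.add_argument("-o", "--output", default="fates_landuse_pft_map.nc")
    lupft_parser.set_defaults(func=run_lupft)

    return parser


def dispatch(argv):
    # Parse, then call whichever handler the chosen subparser attached
    args = build_parser().parse_args(argv)
    return args.func(args)
```

Calling `dispatch(["lupft", "surf.nc", "-o", "map.nc"])` routes to the `lupft` handler, while an unknown subcommand makes argparse exit with an error, which is the behavior a negative-case CLI test would check for.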
11 changes: 11 additions & 0 deletions tools/landusedata/src/landusedata/conda-luh2.yml
@@ -0,0 +1,11 @@
# This yaml file is intended for users who wish to utilize the luh2.py tool on their own machines.
# The file is not yet tested regularly to determine if the latest versions of the dependencies will
# always work. This regular testing is expected to be implemented in the future.
name: luh2
channels:
- conda-forge
- defaults
dependencies:
- xesmf
# xarray which is autodownloaded as xesmf dependency, uses scipy, which needs netcdf4 to open datasets
- netcdf4
96 changes: 96 additions & 0 deletions tools/landusedata/src/landusedata/landusepft.py
@@ -0,0 +1,96 @@
import argparse, os, sys
import xarray as xr
import xesmf as xe

from landusedata.landusepftmod import ImportLandusePFTFile, AddLatLonCoordinates, RenormalizePFTs
from landusedata.utils import ImportLUH2StaticFile, ImportRegridTarget
from landusedata.utils import SetMaskRegridTarget, DefineStaticMask

def main(args):

    # Open the files
    ds_static = ImportLUH2StaticFile(args.luh2_static_file)
    filelist = [args.clm_surface_file,
                args.clm_luhforest_file,
                args.clm_luhpasture_file,
                args.clm_luhother_file]
    ds_landusepfts = []
    for filename in filelist:
        ds_landusepfts.append(ImportLandusePFTFile(filename))

    # Add lat/lon coordinates to the CLM5 landuse data
    for dataset in ds_landusepfts:
        AddLatLonCoordinates(dataset)

    # Define static luh2 ice/water mask
    mask_icwtr = DefineStaticMask(ds_static)

    # Calculate the bareground percentage after initializing data array list
    # Normalize the percentages
    percent = []
    percent_bareground = ds_landusepfts[0].PCT_NAT_PFT.isel(natpft=0)
    percent_bareground = (percent_bareground / 100.0) * mask_icwtr
    percent.append(percent_bareground)

    # Renormalize the PCT_NAT_PFT for each dataset (bareground excluded)
    for data_array in ds_landusepfts:
        percent.append(RenormalizePFTs(data_array))

    # Calculate the primary and secondary PFT fractions as the forest
    # and nonforest-weighted averages of the forest and other PFT datasets.
    percent[2] = ds_static.fstnf * percent[2] + (1. - ds_static.fstnf) * percent[-1]

    # Note that the list order is:
    # bareground, surface data, primary, pasture, rangeland (other)
    ds_var_names = ['frac_brgnd','frac_csurf','frac_primr','frac_pastr','frac_range']

    # Open and prepare the target dataset
    ds_target = ImportRegridTarget(args.regrid_target_file)

    # Set the mask for the regrid target
    ds_target = SetMaskRegridTarget(ds_target)

    # Create an output dataset to contain individually regridded landuse percent datasets
    ds_output = xr.Dataset()

    # Loop through percentage list and regrid each entry
    for index,data_array in enumerate(percent):

        # Get the name for the new variable
        varname = ds_var_names[index]

        # Convert current percent data array into temporary dataset
        ds_percent = data_array.to_dataset(name=varname)

        # Apply mask for the current dataset
        ds_percent['mask'] = mask_icwtr
        if (varname != 'frac_brgnd'):
            mask_zeropercent = xr.where(ds_percent[varname].sum(dim='natpft') == 0.,0,1)
            ds_percent['mask'] = ds_percent.mask * mask_zeropercent

        # Regrid current dataset
        print('Regridding {}'.format(varname))
        regridder = xe.Regridder(ds_percent, ds_target, "conservative_normed")
        ds_regrid = regridder(ds_percent)

        # Drop mask to avoid conflicts when merging
        if (varname != 'frac_brgnd'):
            # There is no mask currently on the bareground
            ds_regrid = ds_regrid.drop_vars(['mask'])

        # Append the new dataset to the output dataset
        ds_output = ds_output.merge(ds_regrid)

    # Duplicate the 'primary' data array into a 'secondary' data array. Eventually
    # this will contain different data from a future CLM landuse x pft update
    ds_output['frac_secnd'] = ds_output.frac_primr.copy(deep=True)

    # ds_regrid = ds_regrid.rename_dims(dims_dict={'lat':'lsmlat','lon':'lsmlon'})

    # Output dataset to netcdf file
    print('Writing fates landuse x pft dataset to file')
    output_file = os.path.join(os.getcwd(),args.output)
    ds_output.to_netcdf(output_file)

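if __name__ == "__main__":
    # main() requires the parsed arguments supplied by the landusedata CLI,
    # so calling it here without arguments would raise a TypeError
    raise SystemExit("run this module through the CLI: python -m landusedata lupft --help")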
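Two pieces of arithmetic in `landusepft.py` are easy to check by hand: the forest-weighted blend stored in `percent[2]`, and the per-cell mask that excludes cells whose PFT fractions sum to zero. The sketch below restates both for a single grid cell using plain Python lists instead of xarray objects. The function names `blend_forest_other` and `cell_mask` are hypothetical, and the mask convention (1 keeps a cell, 0 excludes it) is an assumption inferred from how the diff multiplies the two masks together.

```python
def blend_forest_other(fstnf, forest, other):
    # Weight the forest PFT fractions by the forest fraction (fstnf)
    # and the "other" PFT fractions by the non-forest remainder
    return [fstnf * f + (1.0 - fstnf) * o for f, o in zip(forest, other)]


def cell_mask(static_mask, fractions):
    # Assumed convention: 1 keeps the cell, 0 excludes it.
    # A cell is excluded if the static mask excludes it or if
    # its PFT fractions sum to zero.
    nonzero = 0 if sum(fractions) == 0.0 else 1
    return static_mask * nonzero
```

For a cell with `fstnf = 0.25`, blending `[1.0, 0.0]` (forest) with `[0.0, 1.0]` (other) weights the forest composition by one quarter and the other composition by three quarters.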
31 changes: 31 additions & 0 deletions tools/landusedata/src/landusedata/landusepftmod.py
@@ -0,0 +1,31 @@
import xarray as xr

# Open the CLM5 landuse x pft data file
def ImportLandusePFTFile(filename):
    dataset = xr.open_dataset(filename)

    # Check to see if the imported dataset has the percent
    # natural pft variable present.
    if 'PCT_NAT_PFT' not in dataset.data_vars:
        raise TypeError("incorrect file, must be CLM5 landuse file")

    # change the percent natural pft from single to double precision
    # for downstream calculations
    dataset['PCT_NAT_PFT'] = dataset.PCT_NAT_PFT.astype('float64')
    return dataset

# Add lat/lon coordinates to the CLM5 landuse dataset
# While the lat and lon are available as variables, they are
# not defined as 'coords' in the imported dataset
def AddLatLonCoordinates(dataset):
    dataset['lon'] = dataset.LON
    dataset['lat'] = dataset.LAT
    return dataset

# Renormalize the pft percentages without the bareground
def RenormalizePFTs(dataset):
    # Remove the bareground pft index from the dataset
    percent = dataset.PCT_NAT_PFT.isel(natpft=slice(1,None))
    # Normalize
    percent = percent / percent.sum(dim='natpft')
    return percent
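The renormalization in `RenormalizePFTs` drops the bareground entry (index 0 along `natpft`) and rescales the remaining PFT percentages so they sum to one. A minimal single-cell sketch with plain Python lists follows; `renormalize_pfts` is a hypothetical name, and the explicit NaN branch is an assumption that mirrors what floating-point array division gives for 0/0, which is the case the zero-percent mask in `landusepft.py` appears intended to exclude before regridding.

```python
def renormalize_pfts(percent_by_pft):
    # Drop index 0 (bareground) and rescale the rest to sum to 1
    vegetated = percent_by_pft[1:]
    total = sum(vegetated)
    if total == 0:
        # Array division would give NaN for 0/0; mirror that here
        return [float("nan")] * len(vegetated)
    return [value / total for value in vegetated]
```

A cell with 20% bareground and two PFTs at 40% each renormalizes to equal halves, while a cell that is entirely bareground yields NaN for every PFT.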