Add CESM2-LE pipeline #53
Awesome, thanks @mgrover1! Recapping here for clarity that our plan was to "skip" caching, because you already have access to all of the source files on GLADE. To implement this, we initially instantiated your source file directory as a In e377b76, I changed your source file target to an instance of So I'm curious, if you execute your recipe from this updated execution notebook, do you still get a |
thanks @cisaacstern ! Now I am running into this
when running |
Progress! (I hope 😄)
Can you provide a full Traceback? |
|
I think you want |
Adding this in instead
results in
|
Amazing catch. And oops! This wouldn't have come up for Max if I had resolved: pangeo-forge/pangeo-forge-recipes#135 (comment) 🤭
Yep, that's expected because there is also one other issue here, which is that we haven't actually cached any metadata (because we skipped caching). I am about to push a commit which should address this. |
@mgrover1, I don't know if it would've worked to cache metadata to a Then, before preparing the target, I've added:
for input_name in recipe.iter_inputs():
    recipe.cache_input_metadata(input_name)
Can you see where running these changes before the call to |
@cisaacstern we are in business 😊😊😊 |
Now the question is:
The first question may be able to be solved within the |
d1193c1 adds the |
When running this, I run into the following warning
Is this something to be concerned about when running this on the larger 1 TB+ zarr stores I plan on running this on? |
Here is an example of what the
def make_filename(component, frequency, variable, experiment, forcing, experiment_number, member_id, stream, time):
    return f"/glade/campaign/cgd/cesm/CESM2-LE/timeseries/{component}/proc/tseries/{frequency}/{variable}/b.e21.{experiment}{forcing}.f09_g17.LE2-{experiment_number}.{member_id}.{stream}.{variable}.{time}.nc"
|
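As a rough illustration of how a path function like `make_filename` can be expanded over several parameters, the sketch below enumerates every combination with `itertools.product`. All parameter values here (`atm`, `month_1`, `BHIST`, `TREFHT`, the member ID, the time ranges, and so on) are placeholders made up for the example, not a verified listing of what is on GLADE.

```python
from itertools import product


def make_filename(component, frequency, variable, experiment, forcing,
                  experiment_number, member_id, stream, time):
    # Same path template as in the comment above, split across lines for readability.
    return (
        f"/glade/campaign/cgd/cesm/CESM2-LE/timeseries/{component}/proc/tseries/"
        f"{frequency}/{variable}/b.e21.{experiment}{forcing}.f09_g17."
        f"LE2-{experiment_number}.{member_id}.{stream}.{variable}.{time}.nc"
    )


# Hypothetical values, chosen only to make the example runnable.
fixed = dict(
    component="atm",
    frequency="month_1",
    experiment="BHIST",
    forcing="cmip6",
    experiment_number="1001",
    member_id="001",
    stream="cam.h0",
)
variables = ["TREFHT", "PRECT"]
times = ["185001-185912", "186001-186912"]

# One path per (variable, time) combination.
paths = [
    make_filename(variable=variable, time=time, **fixed)
    for variable, time in product(variables, times)
]

for path in paths:
    print(path)
```

Keeping the fixed parameters in one dictionary makes it easy to see which components vary within a recipe and which ones distinguish separate recipes.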
I've never seen this before. Seems like a question for @TomAugspurger or @rabernat.
Are these mirrored across every one of the source files? If so, you may be able to create a separate recipe for them and write them only once. Here it's worth noting that your
# ... define recipes above, then ...
recipes = {
    "historical/atm": historical_atm_recipe,  # each dict value is an XarrayZarrRecipe instance
    "ssp370/atm": ssp370_atm_recipe,
    "grid": grid_recipe,
}
Then in the execution notebook:
from cesm_le2_recipe import recipes

for input_name in recipes["historical/atm"].iter_inputs():
    recipes["historical/atm"].cache_input_metadata(input_name)
# ... etc. ...
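The per-recipe loop above generalizes to every entry in the dictionary. The sketch below shows the pattern with a stand-in class: `StubRecipe` is hypothetical and only mimics the two methods used in this thread (`iter_inputs` and `cache_input_metadata`), so the snippet runs without GLADE access or `pangeo-forge-recipes` installed. The input names are placeholders as well.

```python
class StubRecipe:
    """Hypothetical stand-in for an XarrayZarrRecipe; not the real class."""

    def __init__(self, input_names):
        self._input_names = list(input_names)
        self.cached = []  # records which inputs had their metadata "cached"

    def iter_inputs(self):
        yield from self._input_names

    def cache_input_metadata(self, input_name):
        # The real method would open the source file and store its metadata.
        self.cached.append(input_name)


# Keys follow the "experiment/component" convention from the comment above.
recipes = {
    "historical/atm": StubRecipe(["input-0", "input-1"]),
    "ssp370/atm": StubRecipe(["input-0"]),
    "grid": StubRecipe(["input-0"]),
}

# Cache metadata for every input of every recipe before preparing any target.
for name, recipe in recipes.items():
    for input_name in recipe.iter_inputs():
        recipe.cache_input_metadata(input_name)
    print(f"{name}: cached metadata for {len(recipe.cached)} input(s)")
```

Iterating over `recipes.items()` avoids repeating the dictionary key, which is where a mismatch such as `"historical_atm"` versus `"historical/atm"` would otherwise raise a `KeyError`.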
Yes! You can add dimensional complexity to your recipe by parameterizing additional components of the path returned from Then, each of these parameters (aside from time) becomes its own
Yep, you do need to parameterize these in the |
@mgrover1, I note in #53 (comment) that you've given a Assuming this refers to temporal resolution (monthly, daily, etc.), then each |
Yes - the zarr stores will be separated by component/frequency/cesm2-le.experiment.forcing.variable.zarr |
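For concreteness, that layout could be produced by a small helper like the one below. The function name `zarr_store_path` and the example values are illustrative assumptions; only the `component/frequency/cesm2-le.experiment.forcing.variable.zarr` pattern comes from the comment above.

```python
def zarr_store_path(component, frequency, experiment, forcing, variable):
    """Build a relative store path following the layout described above."""
    return f"{component}/{frequency}/cesm2-le.{experiment}.{forcing}.{variable}.zarr"


# Hypothetical example values.
example = zarr_store_path("atm", "monthly", "historical", "cmip6", "TREFHT")
print(example)
```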
How's this going, @mgrover1? Anything we can troubleshoot or is everything working as desired? |
Just pinging this PR. Is this recipe still viable? Could we run it in our bakery? |
Closes #51
I added a couple of files which @cisaacstern worked through this morning. This is preliminary for now and can only be run within the GLADE filesystem at NCAR, since the data are there, but I am hoping this will at least provide an example!