Implement Replace and Route as Step Function with Lambdas #508

Merged
merged 9 commits into from
Aug 17, 2023

Conversation

shawncrawley
Collaborator

This contains all of the changes necessary to run replace and route as a step function with a custom lambda.

New AWS Resources

  • hv-vpp-${var.environment}-execute-replace-route Step Function
  • **replace_route_${var.environment}** Lambda Function

New Database Items

  • wrds_rfcfcst schema (populated by wrds_rfcfcst foreign data server pointing to rfcfcst schema of ingest db)
  • rnr schema
  • rnr.nwm_crosswalk table (manually created by code in a file committed herein: Core\LAMBDA\replace_route\lambda_content\sql\dba_stuff.sql)
  • rnr.nwm_routelink table (manually created by code in both a Jupyter notebook [manual ingest of the RouteLink.nc file] and a file committed herein [Core\LAMBDA\replace_route\lambda_content\sql\dba_stuff.sql, for db table optimization])
  • rnr.nwm_lakeparm table (manually created by code in both a Jupyter notebook and a file committed herein (Core\LAMBDA\replace_ro
  • rnr.staggered_curves table (manually created by code in a file committed herein: Core\LAMBDA\replace_route\lambda_content\sql\dba_stuff.sql)

The following schemas (followed by the dumped tables in parentheses) have been dumped and uploaded to s3://hydrovis-ti-deployment-us-east-1/viz_db_dumps/:

  • wrds_rfcfcst (schema only, since the foreign db connection will be set up against this schema on deploy)
  • rnr (nwm_routelink, nwm_lakeparm, nwm_crosswalk)

The rnr.nwm_routelink and rnr.nwm_lakeparm tables should be manually updated anytime we update the underlying version of WRF-Hydro. The rnr.nwm_crosswalk and rnr.staggered_curves tables should technically be updated every time the wrds_location database is updated. We'll need to make a TODO item for that.

A new target_cols property can now be specified on an entry to the "ingest_files" section of the product_configs. This has been documented in the template.yml. The viz_initialize_pipeline lambda and viz_db_ingest lambdas were both updated to properly use this parameter. This was essential because replace and route requires that additional columns be present on the nwm_channel_rt_ana table. The common default columns that were previously hard-coded in the viz_db_ingest lambda are now used as defaults in viz_initialize_pipeline if the target_cols property is left blank.
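As a rough illustration of the fallback described above, here is a minimal sketch. Only the "ingest_files" and target_cols names come from this PR; the function name, the default column list, and the other entry keys are hypothetical and are not the actual viz_initialize_pipeline code.

```python
# Hypothetical defaults. The real list is whatever was previously
# hard-coded in viz_db_ingest; these column names are placeholders.
DEFAULT_TARGET_COLS = ["feature_id", "streamflow"]


def resolve_target_cols(ingest_entry, defaults=DEFAULT_TARGET_COLS):
    """Return the columns to ingest for one "ingest_files" entry.

    A missing, None, or empty target_cols value is treated as
    "left blank", so the defaults are used instead.
    """
    target_cols = ingest_entry.get("target_cols")
    if not target_cols:
        return list(defaults)  # copy so callers cannot mutate the defaults
    return list(target_cols)


# Entry without target_cols falls back to the defaults.
plain_entry = {"target": "ingest.some_table"}  # "target" is an assumed key
# Entry that asks for extra columns, as replace and route needs.
rnr_entry = {"target_cols": ["feature_id", "streamflow", "extra_col"]}

print(resolve_target_cols(plain_entry))
print(resolve_target_cols(rnr_entry))
```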

At a high level, here is the workflow for the hv-vpp-${var.environment}-execute-replace-route Step Function:
image
All steps beginning with "Create Domain" call the replace_route_${var.environment} Lambda Function. When that function is called with "step": "domain", SQL is executed to create the following dynamic, domain-specific tables in the rnr schema: temporal_domain_flow_forecasts, domain_forecasts, domain_routelink, and domain_lakeparm. Within the Map Function, the various dynamic WRF-Hydro input files are created by running the appropriate baseline SQL query (largely referencing one of the dynamic domain-specific SQL tables mentioned above), converting the query results to a Pandas DataFrame, then to an xarray object, and then to a netCDF file. These files are then uploaded to an S3 bucket. Then WRF-Hydro is kicked off from the step function. This function was modified to pull the domain-specific files from S3 and then kick off WRF-Hydro in an otherwise normal fashion. Once complete, a signal is sent back to the step function to proceed, at which point the Initialize Pipeline function is called, which kicks off the "replace_route" configuration for viz processing.
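To make the "step" dispatch concrete, here is a minimal sketch of how one Lambda could serve several Step Function states. Only "step": "domain" and the four table names come from the description above; the handler layout, the "input_file" step name, and the "file" event key are assumptions, and the SQL and S3 work is replaced by stand-in return values.

```python
# Tables the "domain" step creates in the rnr schema, per the description.
DOMAIN_TABLES = [
    "temporal_domain_flow_forecasts",
    "domain_forecasts",
    "domain_routelink",
    "domain_lakeparm",
]


def create_domain(event):
    """Stand-in for the SQL that builds the domain-specific tables."""
    # Assumption: the real code executes SQL against the viz database here.
    return {"created": [f"rnr.{name}" for name in DOMAIN_TABLES]}


def create_input_file(event):
    """Stand-in for one Map iteration.

    The real step would run a SQL query, convert the result to a
    DataFrame, then to xarray, write a netCDF file, and upload it to S3.
    """
    # "file" is a hypothetical event key naming the input file to build.
    return {"uploaded": event["file"]}


# "input_file" is a made-up step name used only for illustration.
STEP_HANDLERS = {
    "domain": create_domain,
    "input_file": create_input_file,
}


def lambda_handler(event, context=None):
    """Route the invocation to the handler named by event["step"]."""
    step = event.get("step")
    handler = STEP_HANDLERS.get(step)
    if handler is None:
        raise ValueError(f"Unknown step: {step!r}")
    return handler(event)
```

Keeping every state behind one handler means the Step Function definition only has to vary the "step" value in each state's input rather than point at a separate Lambda per state.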

@shawncrawley
Collaborator Author

Before I forget - I didn't have time to add the EventBridge trigger to Terraform that kicks off the hv-vpp-${var.environment}-execute-replace-route Step Function every 15 minutes. If someone could do that for me, that'd be great. Otherwise, I'll get to it once I'm back on Monday.

Contributor

@TylerSchrag-NOAA TylerSchrag-NOAA left a comment

Wow - This is impressive, Shawn. Thank you for doing what I've only been able to speculate on for the last couple of years. This will be hugely helpful in diagnosing issues with these products.

One testing suggestion, if you haven't done this already: There are enough variables in play here that it would be good to do an apples-to-apples comparison of the output of your new workflow vs. the old... either before merging, if you have it set up to do that, or after merging into TI, comparing to UAT.

Way to go!
Tyler

@CoreyKrewson-NOAA
Contributor

I added the EventBridge trigger for a 15-minute kickoff, and I also updated some of the folder/lambda names to align with the other functions.

Tyler, Shawn has been doing that as he went and has confirmed that everything is working the same (and even better in some locations).

@CoreyKrewson-NOAA CoreyKrewson-NOAA merged commit 2f3cc00 into ti Aug 17, 2023
@CoreyKrewson-NOAA CoreyKrewson-NOAA added this to the V2.1.2 milestone Aug 23, 2023
@shawncrawley shawncrawley deleted the rnr-overhaul branch August 24, 2023 13:43
@shawncrawley shawncrawley restored the rnr-overhaul branch August 24, 2023 13:43
@shawncrawley shawncrawley deleted the rnr-overhaul branch August 24, 2023 13:50