Updated following review comments, and some minor additional edits to Recipes (pangeo-forge#483)
derekocallaghan committed Mar 14, 2023
1 parent 196bd0e commit e594b76
Showing 3 changed files with 13 additions and 9 deletions.
8 changes: 6 additions & 2 deletions docs/pangeo_forge_recipes/recipe_user_guide/execution.md
@@ -25,5 +25,9 @@ with beam.Pipeline() as p:
p | transforms
```

-By default the pipeline runs using Beam's [DirectRunner](https://beam.apache.org/documentation/runners/direct/).
-See [runners](https://beam.apache.org/documentation/#runners) for more details.
+By default the pipeline runs using Beam's [DirectRunner](https://beam.apache.org/documentation/runners/direct/), which is useful during recipe development. However, alternative Beam runners are available, for example:
+* [FlinkRunner](https://beam.apache.org/documentation/runners/flink/): executes Beam pipelines using [Apache Flink](https://flink.apache.org/).
+* [DataflowRunner](https://beam.apache.org/documentation/runners/dataflow/): executes pipelines on the [Google Cloud Dataflow managed service](https://cloud.google.com/dataflow/service/dataflow-service-desc).
+* [DaskRunner](https://beam.apache.org/releases/pydoc/current/apache_beam.runners.dask.dask_runner.html): executes pipelines via [Dask.distributed](https://distributed.dask.org/en/stable/).
+
+See the [Beam documentation](https://beam.apache.org/documentation/#runners) for details of the available runners.
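As a minimal sketch of switching runners (assuming `transforms` is a recipe pipeline composed as in the snippet above), the runner can be selected via `PipelineOptions`:

```python
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

# "DirectRunner" is the local default; swap in e.g. "FlinkRunner" or
# "DataflowRunner" (each requires additional runner-specific options,
# such as a project and region for Dataflow).
options = PipelineOptions(runner="DirectRunner")

with beam.Pipeline(options=options) as p:
    p | transforms  # `transforms`: a recipe pipeline defined as above
```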
10 changes: 5 additions & 5 deletions docs/pangeo_forge_recipes/recipe_user_guide/recipes.md
@@ -21,15 +21,15 @@ to {doc}`../../pangeo_forge_cloud/index`, which allows the recipe to be automati

## Recipe Pipelines

-A recipe is defined as a [pipeline](https://beam.apache.org/documentation/programming-guide/#creating-a-pipeline) of [Apache Beam transforms](https://beam.apache.org/documentation/programming-guide/#transforms) applied to the data collection associated with a {doc}`file pattern <file_patterns>`. Specifically, each recipe pipeline contains a set of transforms, which operate on an `apache_beam.PCollection`, performing a one-to-one mapping using `apache_beam.Map` of input elements to output elements, applying the specified transformation.
+A recipe is defined as a [pipeline](https://beam.apache.org/documentation/programming-guide/#creating-a-pipeline) of [Apache Beam transforms](https://beam.apache.org/documentation/programming-guide/#transforms) applied to the data collection associated with a {doc}`file pattern <file_patterns>`. Specifically, each recipe pipeline contains a set of transforms that operate on an [`apache_beam.PCollection`](https://beam.apache.org/documentation/programming-guide/#pcollections), mapping input elements to output elements (for example, using [`apache_beam.Map`](https://beam.apache.org/documentation/transforms/python/elementwise/map/)) to apply the specified transformation.

To write a recipe, you define a pipeline that uses existing transforms, in combination with new transforms if required for custom processing of the input data collection.
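For example, a minimal, self-contained sketch of this element-wise mapping (using plain integers as stand-in input elements rather than real recipe data):

```python
import apache_beam as beam

# Each element of the input PCollection is transformed one-to-one.
with beam.Pipeline() as p:
    (
        p
        | "Create" >> beam.Create([1, 2, 3])     # stand-in input collection
        | "Square" >> beam.Map(lambda x: x * x)  # one-to-one mapping
        | "Print" >> beam.Map(print)             # inspect each output element
    )
```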

Right now, there are two categories of recipe pipelines based on a specific data model for the input files and target dataset format.
In the future, we may add more.

```{note}
-The full API Reference documentation for the existing recipe `PTransform` implementations can be found at
+The full API Reference documentation for the existing recipe `PTransform` implementations ({class}`pangeo_forge_recipes.transforms`) can be found at
{doc}`../api_reference`.
```

@@ -62,10 +62,10 @@ Below we give a very basic overview of how this recipe is used.
First you must define a {doc}`file pattern <file_patterns>`.
Once you have a {class}`FilePattern <pangeo_forge_recipes.patterns.FilePattern>` object,
the recipe pipeline will contain at a minimum the following transforms applied to the file pattern collection (combined in the sketch after this list):
-* `OpenURLWithFSSpec`: retrieves each pattern file using the specified URLs.
-* `OpenWithXarray`: load each pattern file into an `xarray.Dataset`:
+* {class}`pangeo_forge_recipes.transforms.OpenURLWithFSSpec`: retrieves each pattern file using the specified URLs.
+* {class}`pangeo_forge_recipes.transforms.OpenWithXarray`: loads each pattern file into an [`xarray.Dataset`](https://docs.xarray.dev/en/stable/generated/xarray.Dataset.html):
  * The `file_type` is specified from the pattern.
-* `StoreToZarr`: generate a Zarr store by combining the datasets:
+* {class}`pangeo_forge_recipes.transforms.StoreToZarr`: generates a Zarr store by combining the datasets:
* `store_name` specifies the name of the generated Zarr store.
* `target_root` specifies where the output will be stored, in this case, the temporary directory we created.
* `combine_dims` informs the transform of the dimension used to combine the datasets. Here we use the dimension specified in the file pattern (`time`).
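As a sketch of how these compose (assuming `pattern` is the `FilePattern` defined earlier and `target_root` points at writable storage, as in the storage docs):

```python
import apache_beam as beam
from pangeo_forge_recipes.transforms import OpenURLWithFSSpec, OpenWithXarray, StoreToZarr

transforms = (
    beam.Create(pattern.items())                   # (index, url) pairs from the file pattern
    | OpenURLWithFSSpec()                          # retrieve each file via fsspec
    | OpenWithXarray(file_type=pattern.file_type)  # load each file as an xarray.Dataset
    | StoreToZarr(
        store_name="my-dataset-v1.zarr",
        target_root=target_root,
        combine_dims=pattern.combine_dim_keys,     # e.g. combine along `time`
    )
)
```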
4 changes: 2 additions & 2 deletions docs/pangeo_forge_recipes/recipe_user_guide/storage.md
@@ -50,7 +50,7 @@ transforms = (
| OpenURLWithFSSpec()
| OpenWithXarray(file_type=pattern.file_type)
| StoreToZarr(
-store_name=my-dataset-v1.zarr,
+store_name="my-dataset-v1.zarr",
target_root=target_root,
combine_dims=pattern.combine_dim_keys,
target_chunks={"time": 10}
@@ -74,7 +74,7 @@ transforms = (
| OpenURLWithFSSpec(cache=cache)
| OpenWithXarray(file_type=pattern.file_type)
| StoreToZarr(
-store_name=my-dataset-v1.zarr,
+store_name="my-dataset-v1.zarr",
target_root=target_root,
combine_dims=pattern.combine_dim_keys,
target_chunks={"time": 10}
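For local testing, `target_root` and `cache` in the snippets above might be constructed as follows (a sketch assuming the `FSSpecTarget` and `CacheFSSpecTarget` helpers in `pangeo_forge_recipes.storage`):

```python
import tempfile

from fsspec.implementations.local import LocalFileSystem
from pangeo_forge_recipes.storage import CacheFSSpecTarget, FSSpecTarget

# Local filesystem locations for the Zarr target and the input-file cache;
# any fsspec-compatible filesystem (e.g. s3fs, gcsfs) could be used instead.
fs = LocalFileSystem()
target_root = FSSpecTarget(fs, tempfile.mkdtemp(prefix="target-"))
cache = CacheFSSpecTarget(fs, tempfile.mkdtemp(prefix="cache-"))
```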
