
Help me run / debug this pipeline #1

Closed

rabernat opened this issue Oct 4, 2020 · 4 comments

rabernat commented Oct 4, 2020

I just took a crack at writing a new pipeline. I based this off my original example: https://github.com/pangeo-forge/pangeo-forge/blob/master/examples/oisst-avhrr-v02r01.py

It felt like coding blind, because I did not know how to interactively debug my flow.

It was also confusing to have runtime stuff (prefect / dask settings) intermixed into the recipe. I wasn't quite sure what I could remove and what was necessary.

So I did my best, but I'm sure it doesn't work.

The action also failed on the "Authenticate with Docker" step:
https://github.com/pangeo-forge/noaa-oisst-avhrr-feedstock/runs/1204219150?check_suite_focus=true

Would appreciate some help from @TomAugspurger / @jhamman.

@TomAugspurger
Contributor

I’m happy to sit down and go through this sometime this week. I’m guessing my schedule is more open than others’.

@TomAugspurger
Contributor

Answering a few questions ahead of time:

It was also confusing to have runtime stuff (prefect / dask settings) intermixed into the recipe. I wasn't quite sure what I could remove and what was necessary.

My hope is that all of the environment and storage pieces can be left out of the majority of recipes:

# Very confusing to have this in the middle of the recipe!
# Feels totally out of place.
@property
def environment(self):
    environment = DaskKubernetesEnvironment(
        min_workers=1,
        max_workers=30,
        scheduler_spec_file="recipe/job.yaml",
        worker_spec_file="recipe/worker_pod.yaml",
    )
    return environment

@property
def storage(self):
    storage = Docker(
        "pangeoforge",
        dockerfile="recipe/Dockerfile",
        prefect_directory="/home/jovyan/prefect",
        python_dependencies=[
            "git+https://github.com/pangeo-forge/pangeo-forge@master",
            "prefect==0.13.6",
        ],
        image_tag="latest",
    )
    return storage

That can (hopefully) just be in the base class.
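The refactoring suggested above could look something like the following sketch, in which a base class owns all runtime configuration and a recipe subclass carries only the data-specific pieces. The class name `AbstractPipeline` and the use of plain dicts as stand-ins for the Prefect environment/storage objects are assumptions for illustration, not the actual pangeo-forge API:

```python
# Hypothetical sketch: runtime config lives in a base class, not in each
# recipe. Names and return types are illustrative, not the real API.


class AbstractPipeline:
    """Base class holding runtime settings shared by all recipes."""

    @property
    def environment(self):
        # In the real setup this would build a DaskKubernetesEnvironment;
        # a plain dict stands in for it here.
        return {
            "min_workers": 1,
            "max_workers": 30,
            "scheduler_spec_file": "recipe/job.yaml",
            "worker_spec_file": "recipe/worker_pod.yaml",
        }

    @property
    def storage(self):
        # Likewise a stand-in for Prefect's Docker storage object.
        return {
            "image": "pangeoforge",
            "dockerfile": "recipe/Dockerfile",
        }


class TerraclimatePipeline(AbstractPipeline):
    """The recipe subclass now only carries the data-specific pieces."""

    def __init__(self, cache_location, target_location, variables, years):
        self.cache_location = cache_location
        self.target_location = target_location
        self.variables = variables
        self.years = years
```

With this split, a recipe author never sees the Kubernetes or Docker settings unless they need to override them.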

The "runtime" components at

# why are we calling this in the recipe file?
pipeline = TerraclimatePipeline(cache_location, target_location, variables, years)
# feel like this should happen in the runtime, not in the recipe definition

and

if __name__ == "__main__":
    pipeline.flow.validate()
    print(pipeline.flow)
    print(pipeline.flow.environment)
    print(pipeline.flow.parameters)
    print(pipeline.flow.sorted_tasks())
    print("Registering Flow")
    pipeline.flow.register(project_name="pangeo-forge")

can maybe be removed. IIUC, the Python module we send to prefect has to have a Flow instance at the top level of the module. I imagine that users could write just the class in their pipeline.py and the pangeo-forge infrastructure could import the flow from the pipeline and handle all the registration, outside of what the user needs to see.
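One way the infrastructure could take over that registration step is to load the user's pipeline.py itself and pull the flow out of the module. This is only a sketch of a possible design; the function name `load_flow` and the `"flow"` attribute convention are assumptions, not anything pangeo-forge has implemented:

```python
# Hypothetical sketch: pangeo-forge infrastructure imports the recipe
# module from a file path and retrieves its flow object, so the recipe
# author never writes registration code themselves.
import importlib.util


def load_flow(pipeline_path, attribute="flow"):
    """Import a recipe module from a file path and return its flow object."""
    spec = importlib.util.spec_from_file_location("user_pipeline", pipeline_path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return getattr(module, attribute)


# The infrastructure, not the recipe author, would then do something like:
#     flow = load_flow("recipe/pipeline.py")
#     flow.register(project_name="pangeo-forge")
```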

The action also failed on the "Authenticate with Docker" step:
https://github.com/pangeo-forge/noaa-oisst-avhrr-feedstock/runs/1204219150?check_suite_focus=true

Most likely Joe set up the secrets just on the terraform repo. They should probably be organization-wide.

@rabernat
Contributor Author

Update. @TomAugspurger and I sprinted today and we got this feedstock to run! 🚀


TomAugspurger commented Oct 12, 2020

*Ran up to an error :) But progress!

12 October 2020,02:22:14 	prefect.CloudTaskRunner	INFO	Task 'combine_and_write': Starting task run...
12 October 2020,02:22:15 	prefect.CloudTaskRunner	ERROR	Unexpected error: NameError("name 'concat_dim' is not defined")
Traceback (most recent call last):
  File "/srv/conda/envs/notebook/lib/python3.7/site-packages/prefect/engine/runner.py", line 48, in inner
    new_state = method(self, state, *args, **kwargs)
  File "/srv/conda/envs/notebook/lib/python3.7/site-packages/prefect/engine/task_runner.py", line 823, in get_task_run_state
    self.task.run, timeout=self.task.timeout, **raw_inputs
  File "/srv/conda/envs/notebook/lib/python3.7/site-packages/prefect/utilities/executors.py", line 188, in timeout_handler
    return fn(*args, **kwargs)
  File "recipe/pipeline.py", line 70, in combine_and_write
NameError: name 'concat_dim' is not defined
12 October 2020,02:22:15 	prefect.CloudTaskRunner	INFO	Task 'combine_and_write': finished task run for task with final state: 'Failed'
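The `NameError` above is the classic shape of a task body referring to a module-level name that is never defined in the namespace where the task actually executes (here, on a remote worker). A hedged illustration of the bug class and one common remedy, passing the value as an explicit parameter; the function names are illustrative, not the actual recipe code:

```python
# Hypothetical illustration of the failure mode behind the NameError:
# a task body referencing a global that doesn't exist where it runs.


def combine_and_write_buggy(datasets):
    # Relies on a global `concat_dim` that may not be defined on the
    # worker; raises NameError only when the task is actually called.
    return ("concat", concat_dim, len(datasets))


def combine_and_write_fixed(datasets, concat_dim="time"):
    # Passing concat_dim explicitly removes the dependency on module
    # globals, so the task is self-contained when shipped to a worker.
    return ("concat", concat_dim, len(datasets))
```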
