If you pass a parameter into your pipeline from the UI now, the only thing you can do with it is pass it as a startup argument to a Docker container. This prevents parameters from being used to control execution flow, which is a very likely use case. For example, if I have a pipeline that downloads data, scrubs it, uses it in training, and then publishes a model, I'm very likely to want to repeat the training and publishing multiple times without repeating the initial download and scrub. If the pipeline logic could access the parameters, one could pass in a "starting step" parameter to tell a run not to bother re-downloading or re-scrubbing data that has already been downloaded, scrubbed, and stored (see the sketch below).
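To make the ask concrete, here is a rough sketch of the kind of pipeline definition this would enable. It assumes a `kfp.dsl`-style SDK where you can branch on a run-time parameter with `dsl.Condition`; the `busybox` image and `echo` commands are placeholders, not a real workload.

```python
import kfp
from kfp import dsl


def step(name, cmd):
    # Placeholder container step; a real pipeline would use its own images.
    return dsl.ContainerOp(name=name, image='busybox',
                           command=['sh', '-c', cmd])


@dsl.pipeline(
    name='train-and-publish',
    description='Skips download/scrub unless start_step asks for a full run.')
def train_pipeline(start_step: str = 'download'):
    # Branch on the run-time parameter: only re-download and re-scrub
    # when the user asks for a full run.
    with dsl.Condition(start_step == 'download'):
        download = step('download', 'echo downloading data')
        scrub = step('scrub', 'echo scrubbing data').after(download)
    # Training and publishing always run (ordering against the conditional
    # group is elided here for brevity).
    train = step('train', 'echo training model')
    publish = step('publish', 'echo publishing model').after(train)
```

Whether a given SDK version can compile something like this is beside the point; the point is that nothing like it is expressible when parameters are only ever visible as container arguments.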
If it isn't possible (or advisable) to change pipeline parameters to make them accessible from the actual pipeline logic, then can we at least have a standard Docker image that reads arguments in key/value format (e.g. "name1=value1 name2=value2") and writes each value to an output parameter with the requested name? Then people could run it as the first step in their pipelines, passing in the pipeline parameters and accessing them for the rest of the pipeline as that step's output variables.
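For reference, the entrypoint of such an image could be as simple as the following sketch. The `/outputs` directory and the one-file-per-parameter layout are assumptions about how the engine would collect output parameters, not an existing convention.

```python
#!/usr/bin/env python3
"""Hypothetical entrypoint for a 'parameter fan-out' image: parse name=value
arguments and write each value to its own file, so the pipeline engine can
expose those files as the step's output parameters."""
import sys
from pathlib import Path

# Assumed location the engine reads output parameter files from.
OUT_DIR = Path('/outputs')


def main(argv):
    OUT_DIR.mkdir(parents=True, exist_ok=True)
    for arg in argv:
        name, sep, value = arg.partition('=')
        if not sep or not name:
            raise SystemExit(f'expected name=value, got: {arg!r}')
        # One file per parameter, named after the requested parameter name.
        (OUT_DIR / name).write_text(value)


if __name__ == '__main__':
    main(sys.argv[1:])
```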
Of course, it turns out that output parameters are exactly the same as input parameters, so the Docker image approach won't work either. There should be some way for the user running the pipeline to pass data into the pipeline at runtime that can be used by the pipeline's Python code.