[Feature Request] Bring back azureml.pipeline.steps.python_script_step.PythonScriptStep(hash_paths= ...)
#19003
Comments
Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @azureml-github.
Issue Details
Cross post from #18182 (comment)
Is your feature request related to a problem? Please describe.
For future releases, I'd like to see the return of an old, deprecated feature in the Azure Python SDK. It would be great to use azureml.pipeline.steps.python_script_step.PythonScriptStep(hash_paths= ...). Below is a use case I have, and a use case that's fairly practical for certain situations.
.
├── pipeline
│   ├── aml_process.py      # GOAL 2 - use PythonScriptStep(allow_reuse=True, source_directory='./../', script_name='./pipeline/step_1/math_check.py', hash_paths='./pipeline/step_1' …,… )
│   ├── step_1
│   │   └── math_check.py   # GOAL 1A - import from src/math.py & src/helper.py at runtime
│   └── step_2
│       └── calculation.py  # GOAL 1B - import from src/helper.py at runtime
├── requirements.txt
└── src
    ├── helper.py
    └── math.py
|
Thanks for the feedback, we’ll investigate asap. |
Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @shbijlan. |
@sergey-ivanchuk Apologies for the late reply. We are looking into this issue and will provide an update once we have more details. @bandsina @shbijlan @likebupt Could you please look into this and provide an update once you get a chance? Awaiting your reply. |
@sergey-ivanchuk Thanks for your feedback. This is a valid scenario. As we are developing a new SDK version, I will add this request to the backlog; we will not make new investments in the old SDK version. From my understanding, you use a single large repo to manage the pipeline and its steps, and you use the repo's root folder when you build the pipeline and steps. By default, we use the whole folder to calculate the code hash that decides re-use. In this scenario, changes to step 2 affect the re-use of step 1, and vice versa. Letting customers specify which folders are used to calculate the code hash would also introduce some issues. For example, in your case, providing only step_1 for the hash would not be sufficient, because step_1 also depends on src. So we consider this an advanced scenario that we need to support. |
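To make the trade-off in the comment above concrete, here is a minimal sketch of a code hash computed over selected folders only. `digest_for_paths` is a hypothetical helper written for illustration; it is not an Azure ML SDK function, and the SDK's actual hashing may work differently.

```python
import hashlib
from pathlib import Path


def digest_for_paths(root, relative_paths):
    """Return a SHA-256 hex digest over the files under the given paths.

    Hypothetical helper for illustration only; not part of the Azure ML SDK.
    """
    root = Path(root)
    files = []
    for rel in relative_paths:
        target = root / rel
        if target.is_file():
            files.append(target)
        else:
            # Collect every regular file below a folder.
            files.extend(p for p in target.rglob("*") if p.is_file())

    hasher = hashlib.sha256()
    # Sort so the digest does not depend on filesystem traversal order.
    for path in sorted(set(files)):
        # Mix in the relative path so renames change the digest too.
        hasher.update(path.relative_to(root).as_posix().encode("utf-8"))
        hasher.update(b"\0")
        hasher.update(path.read_bytes())
        hasher.update(b"\0")
    return hasher.hexdigest()
```

With the repository layout from this issue, hashing `["pipeline/step_1", "src"]` would leave the digest untouched when only `pipeline/step_2` is edited, while hashing `["pipeline/step_1"]` alone would miss edits to `src/helper.py`, which is the dependency problem described above.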
Hi everyone, thanks for your recent follow-ups. @cloga, follow-up comments below:
Yes, exactly. Hypothetically, I could have a 5-step process and only want to re-run step 5 (model training).
Very good call-out. I would ideally wish to import from |
@cloga please add this feature request to the proper backlog. I'm closing this issue for now. |
Cross post from #18182 (comment)
Is your feature request related to a problem? Please describe.
For future releases, I'd like to see the return of an old, deprecated feature in the Azure Python SDK.
It would be great to use azureml.pipeline.steps.python_script_step.PythonScriptStep(hash_paths= ...). This parameter was deprecated a long time ago, but I feel it would benefit the Azure SDK user community.
Below is a use case I have, and a use case that's fairly practical for certain situations.
From my two goals above, I have them within a repository with source and pipeline code to run.
For goal 1, I want to import src code. So, I need to set source_directory='./../' in the PythonScriptStep function.
For goal 2, I want to use allow_reuse=True and hash_paths = './pipeline/step_1' so that I can do hashing on multiple sub-steps in a pipeline (e.g. a use case where I need to re-run step_2 but still re-use step_1).
In reality, I might have 6 sub-steps in a repository. So, the value of hash_paths goes up greatly. Only re-running 1-of-6 steps is much better than re-running 6-of-6.
Describe the solution you'd like
Un-deprecate azureml.pipeline.steps.python_script_step.PythonScriptStep(hash_paths= ...)
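As a possible stopgap while hash_paths is unavailable, one could stage a dedicated source folder per step that holds only that step and the shared code it imports. The sketch below assumes, as stated in the maintainer comment in this thread, that the whole source folder is hashed to decide re-use. `stage_step_source` and the `.staging` folder name are hypothetical and not part of the Azure ML SDK.

```python
import shutil
from pathlib import Path


def stage_step_source(repo_root, step_dir, shared_dirs, staging_root):
    """Copy one step folder plus the shared folders it imports into a
    dedicated directory, preserving the repository-relative layout.

    Hypothetical helper for illustration only; not part of the Azure ML SDK.
    """
    repo_root = Path(repo_root)
    target = Path(staging_root) / Path(step_dir).name
    if target.exists():
        # Start clean so files deleted from the repo do not linger.
        shutil.rmtree(target)
    target.mkdir(parents=True)
    for rel in [step_dir, *shared_dirs]:
        # copytree creates any missing parent folders under target.
        shutil.copytree(repo_root / rel, target / rel)
    return target
```

For example, `stage_step_source(".", "pipeline/step_1", ["src"], ".staging")` would produce `.staging/step_1` containing only `pipeline/step_1` and `src`. That folder could then be passed as `source_directory` with `script_name='pipeline/step_1/math_check.py'`, so that edits to step_2 no longer fall inside the folder hashed for step_1.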
Describe alternatives you've considered
From my code snippet, I have considered splitting all code into two repositories (src and pipelines). This will meet my goal 1 and goal 2 from above. However, this will require more workarounds than I'd like to be responsible for. So, the code management overhead will be more than necessary. azureml.pipeline.steps.python_script_step.PythonScriptStep(hash_paths= ...) will give greater control and leverage for re-using certain pipeline steps.
Additional context
Nothing more to add.