Support summation for PipelineML
#554
Comments
PipelineML
summablePipelineML
I can see the value of this, but it is a very difficult problem, mostly because summing or filtering pipelines breaks two assumptions: that inference has a single input, and that all other inputs of inference are outputs of training. You need to know exactly what you are doing to ensure the resulting pipeline is indeed a valid PipelineML. I could eventually let summation happen and raise an error message when the result is invalid, but I am afraid this would raise even more questions...
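To make the constraint above concrete, here is a minimal sketch in plain Python. It does not use the kedro or kedro-mlflow API: `Node`, `free_inputs`, `all_outputs` and `is_valid_pipeline_ml` are hypothetical names, and the rule is a simplification of the one described in the comment (exactly one free inference input supplied at prediction time, every other free input produced by training).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical stand-in for a pipeline node (not the kedro class)."""
    name: str
    inputs: frozenset
    outputs: frozenset


def all_outputs(nodes):
    """Every dataset produced by some node of the pipeline."""
    return set().union(*(n.outputs for n in nodes))


def free_inputs(nodes):
    """Datasets consumed by the pipeline but produced by none of its nodes."""
    consumed = set().union(*(n.inputs for n in nodes))
    return consumed - all_outputs(nodes)


def is_valid_pipeline_ml(training, inference, input_name):
    """Simplified rule: `input_name` is a free input of inference, and every
    other free input of inference is an output of training."""
    needed = free_inputs(inference)
    if input_name not in needed:
        return False
    return (needed - {input_name}) <= all_outputs(training)


training = [Node("fit", frozenset({"features"}), frozenset({"model"}))]
inference = [Node("predict", frozenset({"model", "instances"}), frozenset({"predictions"}))]

# Summing an extra node into inference introduces a second free input
# ("labels") that training never produces, so the pair is no longer valid.
report = Node("report", frozenset({"predictions", "labels"}), frozenset({"report"}))

valid_before = is_valid_pipeline_ml(training, inference, "instances")
valid_after = is_valid_pipeline_ml(training, inference + [report], "instances")
```

In this toy model the original pair passes the check, while the summed inference pipeline fails it, which is the kind of silent breakage the comment is worried about.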
Yeah, that makes perfect sense. I think we should leave it as it is. You could also mention this discussion in the code itself to document the design decision.
I've played a little bit with it and it's possible to add one. As discussed, I'll just document the decision.
Closed by #584
Description
Pipeline ensembling would be a lot simpler if `PipelineML` implemented `__sum__` and other dunder methods.

Here's a simplified example. We have a `ds` pipeline defined as `data_prep` + `training` + `report_training`, where the actual model training happens in `training`. All four are exposed in the model registry, so users can trigger training pipelines by calling either `ds` or `training`. Given that I can't append the result of `pipeline_ml_factory` to define `ds`, I have to wrap both `ds` and `training` with the ML wrapper:

Current code:
Desired code:
Of course, the alternative is to extract this piece of code and apply the wrapping in each function that contains the training step. But that would require manual tracking every time the pipeline is updated, as opposed to wrapping only the training step in this new class.
I can see that this is the behavior by design. Is there any specific reason why this limitation is imposed?
Possible Implementation
Implement the dunder methods of `PipelineML`.
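As a rough illustration of what this could look like, here is a self-contained toy in plain Python. It is not the kedro-mlflow implementation: `ToyPipeline` and `ToyPipelineML` are hypothetical classes, and the validity rule and the choice to reject the sum of two wrapped pipelines are assumptions made for the sketch. Python dispatches `+` to `__add__` and `__radd__`, so those are the methods the sketch overrides.

```python
class ToyPipeline:
    """Hypothetical stand-in for a pipeline: an ordered list of steps.

    Each step is a (name, inputs, outputs) tuple: a string and two sets.
    """

    def __init__(self, steps):
        self.steps = list(steps)

    def __add__(self, other):
        if not isinstance(other, ToyPipeline):
            return NotImplemented
        return ToyPipeline(self.steps + other.steps)

    def outputs(self):
        return set().union(*(outs for _, _, outs in self.steps))

    def free_inputs(self):
        consumed = set().union(*(ins for _, ins, _ in self.steps))
        return consumed - self.outputs()


class ToyPipelineML(ToyPipeline):
    """Toy training pipeline that carries its inference pipeline along."""

    def __init__(self, steps, inference, input_name):
        super().__init__(steps)
        self.inference = inference
        self.input_name = input_name
        # Assumed rule: besides `input_name`, inference may only consume
        # datasets that the training steps produce.
        needed = inference.free_inputs()
        missing = needed - {input_name} - self.outputs()
        if input_name not in needed or missing:
            raise ValueError(f"invalid inference inputs: {sorted(missing)}")

    def _combine(self, steps, other):
        if isinstance(other, ToyPipelineML):
            # Assumption: two inference pipelines cannot be merged safely.
            raise ValueError("cannot sum two ToyPipelineML objects")
        return ToyPipelineML(steps, self.inference, self.input_name)

    def __add__(self, other):
        if not isinstance(other, ToyPipeline):
            return NotImplemented
        return self._combine(self.steps + other.steps, other)

    def __radd__(self, other):
        # Called for `plain + wrapped`, keeping the wrapper on the result.
        if not isinstance(other, ToyPipeline):
            return NotImplemented
        return self._combine(other.steps + self.steps, other)


data_prep = ToyPipeline([("prep", {"raw"}, {"features"})])
inference = ToyPipeline([("predict", {"model", "instances"}, {"predictions"})])
training = ToyPipelineML([("fit", {"features"}, {"model"})], inference, "instances")
report_training = ToyPipeline([("report", {"model"}, {"metrics"})])

# Only `training` is wrapped; the sum stays a wrapped pipeline.
ds = data_prep + training + report_training
```

The sketch re-validates on every sum, which corresponds to the "allow summation, fail with an error message" option raised in the comments. Filtering is not covered here and would need the same re-validation.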