Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SDK - Components - Hiding signature attribute from CloudPickle #2045

Conversation

Ark-kun
Copy link
Contributor

@Ark-kun Ark-kun commented Sep 5, 2019

The high-level user issue this solves is that some converted Airflow operators fail to work due to unpickling errors.
Cloudpickle has some issues with pickling type annotations in python versions < 3.7, so they disabled it. cloudpipe/cloudpickle#196
create component_from_airflow_op spoofs the function signature by setting the func.__signature__ attribute. cloudpickle then tries to pickle that attribute which leads to failures during unpickling.
To prevent this we remove the .__signature__ attribute before pickling.


This change is Reviewable

Cloudpickle has some issues with pickling type annotations in python versions < 3.7, so they disabled it. cloudpipe/cloudpickle#196
`create component_from_airflow_op` spoofs the function signature by setting the `func.__signature__` attribute. cloudpickle then tries to pickle that attribute which leads to failures during unpickling.
To prevent this we remove the `.__signature__` attribute before pickling.
@Ark-kun
Copy link
Contributor Author

Ark-kun commented Sep 5, 2019

/retest

        # Hack to prevent cloudpickle from trying to pickle generic types that might be present in the signature. See cloudpipe/cloudpickle#196 
        # Currently the __signature__ is only set by Airflow components as a means to spoof/pass the function signature to _func_to_component_spec
@Ark-kun
Copy link
Contributor Author

Ark-kun commented Sep 5, 2019

/retest

2 similar comments
@Ark-kun
Copy link
Contributor Author

Ark-kun commented Sep 6, 2019

/retest

@Ark-kun
Copy link
Contributor Author

Ark-kun commented Sep 6, 2019

/retest

@Ark-kun
Copy link
Contributor Author

Ark-kun commented Sep 6, 2019

@numerology Can you please take a look?

@numerology
Copy link

/lgtm

@numerology
Copy link

I guess we might want to create a issue to track this hacking work-around. Once cloudpickle solves their problem we can switch to a more elegant way. WDYT?

@Ark-kun
Copy link
Contributor Author

Ark-kun commented Sep 6, 2019

Once cloudpickle solves their problem we can switch to a more elegant way.

They tried several times and ultimately failed to handle it for python <3.7. So they disabled this for python <3.7. But it looks like they only disabled it for the actual function signature and not the normal attributes.

Once cloudpickle solves their problem we can switch to a more elegant way. WDYT?

I'll think of a way to solve this. But generally it's just a way to make _func_to_component_spec see some extra information, but hide it from cloudpickle (we do not want to serialize that extra information).

@Ark-kun
Copy link
Contributor Author

Ark-kun commented Sep 6, 2019

/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Ark-kun

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot merged commit 6c15f27 into kubeflow:master Sep 6, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants