[bug] Cannot find how to type pipeline parameters for JsonArray or JsonObject #7719

Closed
dboshardy opened this issue May 13, 2022 · 7 comments

@dboshardy

What steps did you take

I'm trying to utilize the ModelBatchPredictOp from the GCP Components. There are two fields, one with a type of JsonObject and the other with the type JsonArray. The pipeline accepts the fields for this operator as input parameters to the pipeline and passes them to the operator in the constructor. Since they're in the pipeline params, they must be typed. However, I am unable to find any type configuration that coerces to JsonArray or JsonObject, and I cannot find any documentation on how to do this.

How do I pass these values?

For example, gcs_source_uris is supposed to be a JsonArray. When passing this through as a pipeline param typed to List[str] it fails to compile with this error:

kfp.dsl.types.InconsistentTypeException: Incompatible argument passed to the input "gcs_source_uris" of component "model_batch_predict": Argument type "typing.List[str]" is incompatible with the input type "JsonArray"
@kfp.v2.dsl.pipeline(
    name="batch-predict-v0",
    pipeline_root=pipeline_gcs_root_path,
)
def batch_predict_pipeline(
    ...
    gcs_source_uris: List[str],
    ...
):
    predict_op = ModelBatchPredictOp(
        ...
        gcs_source_uris=gcs_source_uris,
        ...
    )

I understand the error, and I've exhausted every other type I could think of. I've also searched the SDK for references to JsonObject and JsonArray, but none of them point to a Python class I can import and use as the parameter type. I gather this is some kind of protobuf type, but using that still fails.
Impacted by this bug? Give it a 👍. We prioritise the issues with the most 👍.

@dboshardy
Author

dboshardy commented May 13, 2022

I recognize this isn't a bug, however I do not know what else to describe it as, nor where to find this information. I've tried the KFP documentation and the myriad of oddly organized and incomplete GCP documentation on Vertex AI Pipelines.

@dboshardy
Author

For anyone who comes across this, jsonobject is the package that contains the classes required. This should be in the documentation somewhere.

@Ark-kun
Contributor

Ark-kun commented Jul 22, 2022

@dboshardy

Python's list type is mapped to JsonArray
Python's dict type is mapped to JsonObject.

> kfp.dsl.types.InconsistentTypeException: Incompatible argument passed to the input "gcs_source_uris" of component "model_batch_predict": Argument type "typing.List[str]" is incompatible with the input type "JsonArray"

typing.List[str] should work the same as list, but there might be newly introduced bugs in v2 that break that.

> For anyone who comes across this, jsonobject is the package that contains the classes required. This should be in the documentation somewhere.

The jsonobject package has no relation to the JsonObject type name in KFP components.
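
A minimal sketch of the mapping described above, not KFP's actual implementation. The table PYTHON_TO_KFP_TYPE_NAME and the helper kfp_type_name are hypothetical names, and only the two mappings stated in this thread are included. It illustrates why typing.List[str] would be expected to resolve to the same type name as list: a generic alias can be reduced to its origin class before the lookup.

```python
import typing

# Hypothetical lookup table: only the two mappings stated in this thread.
PYTHON_TO_KFP_TYPE_NAME = {
    list: "JsonArray",
    dict: "JsonObject",
}


def kfp_type_name(annotation):
    """Return the component type name for a Python annotation, or None if unknown."""
    # typing.get_origin(typing.List[str]) is list; for a plain class it is None.
    origin = typing.get_origin(annotation)
    base = origin if origin is not None else annotation
    return PYTHON_TO_KFP_TYPE_NAME.get(base)


print(kfp_type_name(list))              # JsonArray
print(kfp_type_name(typing.List[str]))  # JsonArray
print(kfp_type_name(dict))              # JsonObject
```

Whether a given KFP release performs this kind of normalization is exactly what is in question in this thread, so treat the sketch as a description of the intended behavior only.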

@dboshardy
Author

I don't know what to say other than using lists as the type did not work but using jsonobject.JsonArray did. Maybe this has changed in later versions.

@Ark-kun
Contributor

Ark-kun commented Jul 24, 2022

The param: list annotation has worked since the first releases of KFP. KFP v2 may have broken this recently, at least for a while.
In any case, specifying an arbitrary type is easy: annotate the parameter with the type name as a string
param: "JsonArray"
or
some_path: InputPath("JsonArray")
This works with all types.
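
To illustrate the string-annotation form above in plain Python, without importing KFP: a string annotation is stored on the function as written, so a tool that builds a component from the function can read the type name directly. The function batch_predict and the helper declared_type_names are hypothetical and only show the mechanism; they are not part of the KFP SDK.

```python
import inspect


def batch_predict(gcs_source_uris: "JsonArray", model_parameters: "JsonObject") -> None:
    """Hypothetical component function; the body is irrelevant here."""


def declared_type_names(func):
    """Return {parameter name: annotation} exactly as written on the signature."""
    signature = inspect.signature(func)
    return {name: param.annotation for name, param in signature.parameters.items()}


print(declared_type_names(batch_predict))
# {'gcs_source_uris': 'JsonArray', 'model_parameters': 'JsonObject'}
```

Because the annotation is only a string, no JsonArray class has to exist or be imported for this to work.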

@cgparkinson

Thanks, changing my PipelineParam type hint from List[str] to list worked.

@MinhManPham

I have a question: how can we loop over gcs_source_uris with a for loop in batch_predict_pipeline itself, rather than inside the component it is passed to? Thanks.

RobbeSneyders pushed a commit to ml6team/fondant that referenced this issue May 10, 2023
This PR enables the user to pass `bool`, `list` and `dict` arguments to
a component. Kubeflow typically handles those arguments by serializing
them as strings. For this reason, they need to be de-serialized again
within the component in order to be handled properly.

This might go away once we move to V2. 

References to the issue: 
kubeflow/pipelines#7457
kubeflow/pipelines#7719
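
The de-serialization step that the commit message describes could look roughly like the sketch below. It assumes the serialized strings are JSON for lists and dicts and "true"/"false" text for booleans; that format, and the helper name parse_argument, are assumptions for illustration and not taken from the referenced commit.

```python
import json


def parse_argument(raw: str, expected_type: type):
    """Turn a string-serialized argument back into a value of expected_type."""
    if expected_type is str:
        return raw
    if expected_type is bool:
        # bool("False") is True in Python, so compare the text explicitly.
        lowered = raw.strip().lower()
        if lowered == "true":
            return True
        if lowered == "false":
            return False
        raise ValueError(f"cannot interpret {raw!r} as a bool")
    if expected_type in (list, dict):
        value = json.loads(raw)
        if not isinstance(value, expected_type):
            raise TypeError(f"expected {expected_type.__name__}, got {type(value).__name__}")
        return value
    raise TypeError(f"unsupported type: {expected_type!r}")


print(parse_argument('["gs://bucket/a.jsonl", "gs://bucket/b.jsonl"]', list))
print(parse_argument('{"batch_size": 8}', dict))
print(parse_argument("False", bool))
```

The explicit check after json.loads guards against a string that is valid JSON but of the wrong shape, for example a dict passed where a list is expected.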