"Failed to parse the container spec json payload to requested prototype" within CustomTrainingJobOp #419
Comments
@TheWispy Hello, could you paste in your pipeline definition? As for "PipelineParams are not JSON serializable": that itself is correct. While a pipeline definition looks like Python code, it is not actually Python code. The parameters you pass into the pipeline definition are of type PipelineParam and have some limitations; for example, their values cannot be JSON-serialized.
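A minimal sketch of that limitation, assuming the KFP v1-style SDK where pipeline arguments become `kfp.dsl.PipelineParam` placeholders (the parameter name below is hypothetical):

```python
import json

from kfp import dsl

# Inside a pipeline function, a parameter such as `dataset_url: str` is not a real
# string; it is a PipelineParam placeholder that is only resolved at run time.
param = dsl.PipelineParam(name="dataset_url")

# Embedding the placeholder in a structure that later gets json.dumps-ed fails
# with exactly the error reported in this issue.
json.dumps({"args": ["--dataset_url", param]})
# TypeError: Object of type PipelineParam is not JSON serializable
```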
Hi there! Oddly, I managed to get this working by removing the "timeout" definition, but the JSON serialisation issue remained. As for the pipeline parameters, I originally wanted the string from the parameter to be passed to my training container as an argument, but no matter how I reconfigured things, it wouldn't compile.
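A hedged reconstruction of that kind of setup, with the string parameter embedded directly in worker_pool_specs (the image URI and parameter names are placeholders, and the import path assumes a recent google-cloud-pipeline-components release):

```python
from google_cloud_pipeline_components.v1.custom_job import CustomTrainingJobOp
from kfp import dsl


@dsl.pipeline(name="custom-training-demo")
def pipeline(dataset_url: str, project: str, region: str, model_dir: str):
    worker_pool_specs = [
        {
            "machineSpec": {"machineType": "e2-standard-4"},
            "replicaCount": "1",
            "containerSpec": {
                "imageUri": "gcr.io/my-project/trainer:latest",  # placeholder image
                # At compile time `dataset_url` is still a PipelineParam placeholder,
                # which is what trips the JSON serialization step described above.
                "args": ["--dataset_url", dataset_url],
            },
        }
    ]
    CustomTrainingJobOp(
        project=project,
        location=region,
        display_name="model-training",
        worker_pool_specs=worker_pool_specs,
        base_output_directory=model_dir,
    )
```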
@TheWispy Thanks for the update. Can you paste in the pipeline code that won't compile?
Been over a month with no response. Closing issue by policy.
Hi @andrewferlitsch! I have the same issue with my pipeline. It appears when I'm constructing a worker_pool_spec in a different component and then passing it to CustomTrainingJobOp. Here is the code sample. From the pipeline:

```python
# Excerpt from the pipeline definition; split_data_task, CustomTrainingJobOp,
# BUCKET_URI, MODEL_DIR, REGION, and project are defined elsewhere in the pipeline.
get_worker_dict_task = (
    get_worker_dict(
        split_data_task.outputs["df_train"], BUCKET_URI
    )
    .set_caching_options(False)
    .set_display_name("get worker dict task")
)
worker_pool_specs_list = get_worker_dict_task.outputs["worker_dict"]
custom_job_task = CustomTrainingJobOp(
    project=project,
    display_name="model-training",
    worker_pool_specs=worker_pool_specs_list,
    base_output_directory=MODEL_DIR,
    location=REGION,
).after(get_worker_dict_task)
```

and the component that constructs the dict:

```python
# Imports assumed; the original snippet omitted them.
from typing import NamedTuple

from kfp.v2.dsl import Dataset, Input, component  # kfp.dsl in newer SDK versions


@component(
    base_image="actual_image_hidden_due_to_privacy",
    output_component_file="create_worker_pool_spec_dict.yaml",
)
def get_worker_dict(
    df_train: Input[Dataset], BUCKET_URI: str
) -> NamedTuple("Outputs", [("worker_dict", list)]):
    import json
    import pandas as pd
    from collections import namedtuple

    # Convert the artifact's local /gcs/ path back to a gs:// URI for the trainer.
    data_uri = "gs://" + df_train.path.rsplit("/gcs/")[1].replace("//", "/")
    data_path = data_uri + ".csv"

    worker_pool_spec_dict = [
        {
            "machineSpec": {
                "machineType": "e2-standard-4",
                "acceleratorType": "ACCELERATOR_TYPE_UNSPECIFIED",
                "acceleratorCount": 0,
            },
            "replicaCount": "1",
            "containerSpec": {
                "imageUri": "my_image_actuall_imahe_hiden_due_to_privacy_in_this_code_sample",
                "args": ["--dataset_url", data_path],
                "env": [{"name": "AIP_MODEL_DIR", "value": BUCKET_URI}],
            },
        }
    ]
    example_output = namedtuple("Outputs", ["worker_dict"])
    return example_output(worker_pool_spec_dict)
```
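One way to narrow down the "Failed to parse the container spec json payload to requested prototype" message is to try parsing a comparable dict into the Vertex AI protos locally. A rough sketch, assuming google-cloud-aiplatform is installed and using placeholder values rather than the real spec:

```python
from google.cloud.aiplatform_v1.types import custom_job as gca_custom_job
from google.protobuf import json_format

worker_pool_spec = {
    "machineSpec": {"machineType": "e2-standard-4"},
    "replicaCount": "1",
    "containerSpec": {
        "imageUri": "gcr.io/my-project/trainer:latest",  # placeholder image
        "args": ["--dataset_url", "gs://my-bucket/data/train.csv"],
        "env": [{"name": "AIP_MODEL_DIR", "value": "gs://my-bucket/model"}],
    },
}

# json_format.ParseDict raises a ParseError if a field name or value type does not
# match the WorkerPoolSpec proto, which mirrors the kind of parse failure that
# Vertex reports for the container spec payload.
msg = json_format.ParseDict(worker_pool_spec, gca_custom_job.WorkerPoolSpec()._pb)
print(msg)
```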
Expected Behavior
Actual Behavior
TypeError: Object of type PipelineParam is not JSON serializable
as seen here. For a parameter being passed to a training operation, this really doesn't make any sense. If all pipeline params are removed before compilation, the Vertex component fails with the following error.

The full redacted object dump is here.
This is despite following the guide described here, which seems a little outdated in places. Any help would be greatly appreciated. Cheers!
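For anyone debugging the same thing, a rough way to see exactly what the compiler emitted for the worker pool spec before Vertex tries to parse it (the file name is hypothetical, and the key layout varies by SDK version):

```python
import json

# Load the compiled pipeline spec (file name is hypothetical).
with open("pipeline.json") as f:
    pipeline_spec = json.load(f)


def find_key(obj, key, path=""):
    """Recursively print every occurrence of `key` in the compiled spec."""
    if isinstance(obj, dict):
        for k, v in obj.items():
            if k == key:
                print(f"{path}/{k}: {json.dumps(v, indent=2)}")
            find_key(v, key, f"{path}/{k}")
    elif isinstance(obj, list):
        for i, v in enumerate(obj):
            find_key(v, key, f"{path}[{i}]")


# Shows how the worker pool spec / container spec payload was actually serialized.
find_key(pipeline_spec, "worker_pool_specs")
```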
Steps to Reproduce the Problem
Specifications