kfp client returns HTTP 500 error, even after the pipeline run succeeds #788

Closed
nareshganesan opened this issue Feb 6, 2019 · 4 comments


@nareshganesan

When I create a pipeline run using the kfp client, the pipeline is created, the workflow controller reports no errors, and the pipeline succeeds, but the kfp client response contains the following HTTP 500 error. I'm not sure how to troubleshoot this. Please help.

This is how I built my kfp Python SDK:

cd $PIPELINE/sdk/python
wget http://central.maven.org/maven2/io/swagger/swagger-codegen-cli/2.3.1/swagger-codegen-cli-2.3.1.jar -O /tmp/swagger-codegen-cli.jar
apt-get install --no-install-recommends -y -q default-jdk
./build.sh /tmp/kfp.tar.gz
pip3 install /tmp/kfp.tar.gz

The client call then fails with:

Traceback (most recent call last):
  File "dag-function.py", line 311, in <module>
    createPipeline(experiment_id, job_name, workflowYaml=workflowYaml)
  File "dag-function.py", line 276, in createPipeline
    response = client._run_api.create_run(body=run_body)
  File "/home/naresh/miniconda3/lib/python3.6/site-packages/kfp_run/api/run_service_api.py", line 54, in create_run
    (data) = self.create_run_with_http_info(body, **kwargs)  # noqa: E501
  File "/home/naresh/miniconda3/lib/python3.6/site-packages/kfp_run/api/run_service_api.py", line 131, in create_run_with_http_info
    collection_formats=collection_formats)
  File "/home/naresh/miniconda3/lib/python3.6/site-packages/kfp_run/api_client.py", line 322, in call_api
    _preload_content, _request_timeout)
  File "/home/naresh/miniconda3/lib/python3.6/site-packages/kfp_run/api_client.py", line 153, in __call_api
    _request_timeout=_request_timeout)
  File "/home/naresh/miniconda3/lib/python3.6/site-packages/kfp_run/api_client.py", line 365, in request
    body=body)
  File "/home/naresh/miniconda3/lib/python3.6/site-packages/kfp_run/rest.py", line 275, in POST
    body=body)
  File "/home/naresh/miniconda3/lib/python3.6/site-packages/kfp_run/rest.py", line 228, in request
    raise ApiException(http_resp=r)
kfp_run.rest.ApiException: (500)
Reason: Internal Server Error
HTTP response headers: HTTPHeaderDict({'x-powered-by': 'Express', 'content-type': 'application/json', 'trailer': 'Grpc-Trailer-Content-Type', 'date': 'Tue, 05 Feb 2019 06:47:04 GMT', 'x-envoy-upstream-service-time': '33', 'server': 'envoy', 'transfer-encoding': 'chunked'})
HTTP response body: {"error":"Failed to create a new run.: InternalServerError: Failed to store resource references to table for run dynamic-dag-5txpq : ResourceNotFoundError: Experiment 6SNCWYTPJV not found.","message":"Failed to create a new run.: InternalServerError: Failed to store resource references to table for run dynamic-dag-5txpq : ResourceNotFoundError: Experiment 6SNCWYTPJV not found.","code":13,"details":[{"@type":"type.googleapis.com/api.Error","error_message":"Internal Server Error","error_details":"Failed to create a new run.: InternalServerError: Failed to store resource references to table for run dynamic-dag-5txpq : ResourceNotFoundError: Experiment 6SNCWYTPJV not found."}]}
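
The response body above chains several errors into one string, so the innermost cause is easy to miss. As a rough sketch (not part of the kfp SDK), the helper below pulls out the last error type and its detail. The name `root_cause` is hypothetical, and the `": "` separator is an assumption based on this one response. `BODY` is the body above, trimmed to the `error` and `code` fields.

```python
import json


def root_cause(body: str) -> tuple:
    """Return (error_type, detail) for the innermost error in the body.

    Assumes the server joins nested errors with ": ", as in the response above.
    """
    message = json.loads(body).get("error", "")
    parts = [part.strip() for part in message.split(": ")]
    if len(parts) < 2:
        # No chained errors: return the whole message as the detail.
        return ("", message.strip())
    return (parts[-2], parts[-1])


# Response body from the traceback above, trimmed to the fields used here.
BODY = json.dumps({
    "error": (
        "Failed to create a new run.: InternalServerError: "
        "Failed to store resource references to table for run "
        "dynamic-dag-5txpq : ResourceNotFoundError: "
        "Experiment 6SNCWYTPJV not found."
    ),
    "code": 13,
})

error_type, detail = root_cause(BODY)
print(error_type)
print(detail)
```

Read this way, the 500 is reported while storing the run's resource references: the server could not find experiment `6SNCWYTPJV`.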

@swiftdiaries - I'll update the details about listing experiments through the SDK soon.

Thanks for helping out!

@neuromage
Contributor

@nareshganesan What's the actual client command you issued? Can you paste that as well? I suspect this has something to do with creating a run outside an experiment in the python client. I will take a look.

@neuromage
Contributor

/assign @neuromage

@nareshganesan
Author

@neuromage,

Yes, I'm using the following snippet.

  ...
  ...
  ...
  # Client for the Kubeflow Pipelines API at clstrURL.
  client = kfp.Client(clstrURL)
  pipeline_json_string = json.dumps(workflowYaml)
  # Convert the run parameters to API parameters with Kubernetes-safe names.
  api_params = [kfp_run.ApiParameter(name=_k8s_helper.K8sHelper.sanitize_k8s_name(k), value=str(v))
                for k,v in params.items()]
  # Attach the run to an experiment; experiment_id must exist on this API server.
  key = kfp_run.models.ApiResourceKey(id=experiment_id,
                                      type=kfp_run.models.ApiResourceType.EXPERIMENT)
  reference = kfp_run.models.ApiResourceReference(key, kfp_run.models.ApiRelationship.OWNER)
  spec = kfp_run.models.ApiPipelineSpec(
      workflow_manifest=pipeline_json_string, parameters=api_params)
  run_body = kfp_run.models.ApiRun(
      pipeline_spec=spec, resource_references=[reference], name=job_name)

  # Build a run-service client against the same host and submit the run.
  config = kfp_run.configuration.Configuration()
  config.host = clstrURL if clstrURL else Client.IN_CLUSTER_DNS_NAME
  api_client = kfp_run.api_client.ApiClient(config)
  client._run_api = kfp_run.api.run_service_api.RunServiceApi(api_client)
  response = client._run_api.create_run(body=run_body)
  ...
  ...

Thanks for helping out!
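
The error reported above says the experiment referenced by the run was not found. One possible guard is to look the experiment up on the same API server before submitting the run. The sketch below is only an illustration under stated assumptions. It assumes an experiment API object with a `get_experiment(id=...)` method that raises when the id is unknown, and this has not been verified against the generated `kfp_experiment` client. `create_run_checked`, `ExperimentNotFound`, and the two `Fake...` classes are hypothetical names. The fakes stand in for the real clients so the sketch runs without a cluster.

```python
class ExperimentNotFound(Exception):
    """Raised before any run is submitted when the experiment lookup fails."""


def create_run_checked(run_api, experiment_api, experiment_id, run_body):
    """Create a run only if experiment_id resolves on the same API server."""
    try:
        # Assumed call: look the experiment up by id; expected to raise if absent.
        experiment_api.get_experiment(id=experiment_id)
    except Exception as exc:  # broad on purpose: the exception type is an assumption
        raise ExperimentNotFound(
            "experiment %r not found; not submitting run" % experiment_id
        ) from exc
    return run_api.create_run(body=run_body)


# Hypothetical in-memory stand-ins so the sketch runs without a cluster.
class FakeExperimentApi:
    def __init__(self, known_ids):
        self.known_ids = set(known_ids)

    def get_experiment(self, id):
        if id not in self.known_ids:
            raise LookupError(id)
        return {"id": id}


class FakeRunApi:
    def __init__(self):
        self.submitted = []

    def create_run(self, body):
        self.submitted.append(body)
        return {"run": body}
```

With the real clients, `run_api` would be the `client._run_api` object from the snippet above. Failing before `create_run` avoids the situation in this issue, where the workflow ran but the API call still returned a 500.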

@neuromage
Contributor

I think this has been fixed. @nareshganesan feel free to reopen if this is still a problem and I can re-assign this if needed. Thanks.

magdalenakuhn17 pushed a commit to magdalenakuhn17/pipelines that referenced this issue Oct 22, 2023