
unable to use use_gcp_secret with value from PipelineParam #2089

Closed
mattnworb opened this issue Sep 11, 2019 · 8 comments

@mattnworb
Contributor

mattnworb commented Sep 11, 2019

What happened:

I am trying to write a pipeline where the secret name used with gcp.use_gcp_secret is a pipeline parameter defined at runtime.

I've tried a few variants, here is the main pipeline code:

import sys

import kfp
import kfp.compiler as compiler
import kfp.components as comp
import kfp.dsl as dsl
from kfp import gcp


def debug():
    import time

    print("debug now....")
    time.sleep(1800)
    return


debug_op = comp.func_to_container_op(debug, base_image="google/cloud-sdk")


@dsl.pipeline(name="Debug Pipeline")
def debug_pipeline(secret=dsl.PipelineParam(name="secret", value="user-gcp-sa")):
    debug_task = debug_op().apply(gcp.use_gcp_secret(secret))


if __name__ == "__main__":
    pipeline_func = debug_pipeline
    pipeline_filename = pipeline_func.__name__ + ".pipeline.zip"
    compiler.Compiler().compile(pipeline_func, pipeline_filename)
    print("compiled pipeline to", pipeline_filename)

    client = kfp.Client()
    experiment = client.get_experiment(experiment_name="mattbrown")
    run_name = pipeline_func.__name__ + " run"

    params = dict(arg.split("=") for arg in sys.argv[1:])

    run_result = client.run_pipeline(experiment.id, run_name, pipeline_filename, params)

Since the compiler invokes the pipeline function with parameters that are kfp.dsl.PipelineParam instances, this first version fails with:

Traceback (most recent call last):
  File "pipeline.py", line 29, in <module>
    compiler.Compiler().compile(pipeline_func, pipeline_filename)
  File "/Users/mattbrown/.pyenv/versions/issue-278/lib/python3.6/site-packages/kfp/compiler/compiler.py", line 668, in compile
    workflow = self._compile(pipeline_func)
  File "/Users/mattbrown/.pyenv/versions/issue-278/lib/python3.6/site-packages/kfp/compiler/compiler.py", line 603, in _compile
    pipeline_func(*args_list)
  File "pipeline.py", line 23, in debug_pipeline
    debug_task = debug_op().apply(gcp.use_gcp_secret(secret))
  File "/Users/mattbrown/.pyenv/versions/issue-278/lib/python3.6/site-packages/kfp/gcp.py", line 36, in use_gcp_secret
    volume_name = 'gcp-credentials-' + secret_name
TypeError: must be str, not PipelineParam
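All three failure modes reported in this issue can be reproduced without kfp at all. A minimal sketch, using a hypothetical stand-in class (the real one is kfp.dsl.PipelineParam), shows why the string concatenation inside gcp.py behaves the way it does:

```python
class FakePipelineParam:
    """Hypothetical stand-in for kfp.dsl.PipelineParam, for illustration only."""
    def __init__(self, name, value=None):
        self.name = name
        self.value = value

    def __str__(self):
        # kfp serializes params as Argo template placeholders
        return '{{inputs.parameters.%s}}' % self.name


secret = FakePipelineParam('secret')  # the compiler supplies params with value=None

# 1) passing the param object itself: str + object raises TypeError
try:
    'gcp-credentials-' + secret
except TypeError:
    print('concatenating the param object raises TypeError')

# 2) passing secret.value: the value is None at compile time, so str + None raises
try:
    'gcp-credentials-' + secret.value
except TypeError:
    print('secret.value is None at compile time, so this also raises')

# 3) passing str(secret): concatenation works, but yields a templated volume name
print('gcp-credentials-' + str(secret))
```

The third case is what later produces the `gcp-credentials-{{inputs.parameters.secret}}` volume name seen in the compiled workflow.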

If I change how the secret param is referenced:

-    debug_task = debug_op().apply(gcp.use_gcp_secret(secret))
+    debug_task = debug_op().apply(gcp.use_gcp_secret(secret.value))

This also fails at the compile stage, since the compiler invokes the function with PipelineParams whose value is None:

Traceback (most recent call last):
  File "pipeline.py", line 29, in <module>
    compiler.Compiler().compile(pipeline_func, pipeline_filename)
  File "/Users/mattbrown/.pyenv/versions/issue-278/lib/python3.6/site-packages/kfp/compiler/compiler.py", line 668, in compile
    workflow = self._compile(pipeline_func)
  File "/Users/mattbrown/.pyenv/versions/issue-278/lib/python3.6/site-packages/kfp/compiler/compiler.py", line 603, in _compile
    pipeline_func(*args_list)
  File "pipeline.py", line 23, in debug_pipeline
    debug_task = debug_op().apply(gcp.use_gcp_secret(secret.value))
  File "/Users/mattbrown/.pyenv/versions/issue-278/lib/python3.6/site-packages/kfp/gcp.py", line 36, in use_gcp_secret
    volume_name = 'gcp-credentials-' + secret_name
TypeError: must be str, not NoneType

Finally, if I try

-    debug_task = debug_op().apply(gcp.use_gcp_secret(secret))
+    debug_task = debug_op().apply(gcp.use_gcp_secret(str(secret)))

then the pipeline compiles OK, but the run fails with:

This step is in Error state with this message: volume 'gcp-credentials-user-gcp-sa' not found in workflow spec

The generated Argo workflow is shown below. The kfp compiler recognizes that the secret name should be parameterized, but Argo apparently does not resolve the parameter when matching the pod's volumeMount name against the workflow-level volumes.

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  creationTimestamp: "2019-09-11T19:03:09Z"
  generateName: debug-pipeline-
  generation: 2
  labels:
    workflows.argoproj.io/completed: "true"
    workflows.argoproj.io/phase: Error
  name: debug-pipeline-f4jvm
  namespace: kubeflow
  resourceVersion: "26292253"
  selfLink: /apis/argoproj.io/v1alpha1/namespaces/kubeflow/workflows/debug-pipeline-f4jvm
  uid: cbaf50dc-d4c6-11e9-84e1-42010a8401ce
spec:
  arguments:
    parameters:
    - name: secret
      value: user-gcp-sa
  entrypoint: debug-pipeline
  serviceAccountName: pipeline-runner
  templates:
  - container:
      command:
      - python3
      - -u
      - -c
      - |
        def debug():
            import time

            print("debug now....")
            time.sleep(1800)
            return

        import argparse
        _parser = argparse.ArgumentParser(prog='Debug', description='')
        _parsed_args = vars(_parser.parse_args())

        _outputs = debug(**_parsed_args)

        if not hasattr(_outputs, '__getitem__') or isinstance(_outputs, str):
            _outputs = [_outputs]

        from pathlib import Path
        for idx, filename in enumerate(_output_files):
            _output_path = Path(filename)
            _output_path.parent.mkdir(parents=True, exist_ok=True)
            _output_path.write_text(str(_outputs[idx]))
      env:
      - name: GOOGLE_APPLICATION_CREDENTIALS
        value: /secret/gcp-credentials/user-gcp-sa.json
      - name: CLOUDSDK_AUTH_CREDENTIAL_FILE_OVERRIDE
        value: /secret/gcp-credentials/user-gcp-sa.json
      image: google/cloud-sdk
      name: ""
      resources: {}
      volumeMounts:
      - mountPath: /secret/gcp-credentials
        name: gcp-credentials-{{inputs.parameters.secret}}
    inputs:
      parameters:
      - name: secret
    metadata: {}
    name: debug
    outputs:
      artifacts:
      - name: mlpipeline-ui-metadata
        path: /mlpipeline-ui-metadata.json
      - name: mlpipeline-metrics
        path: /mlpipeline-metrics.json
  - dag:
      tasks:
      - arguments:
          parameters:
          - name: secret
            value: '{{inputs.parameters.secret}}'
        name: debug
        template: debug
    inputs:
      parameters:
      - name: secret
    metadata: {}
    name: debug-pipeline
    outputs: {}
  volumes:
  - name: gcp-credentials-{{inputs.parameters.secret}}
    secret:
      secretName: '{{inputs.parameters.secret}}'
status:
  finishedAt: "2019-09-11T19:03:09Z"
  nodes:
    debug-pipeline-f4jvm:
      children:
      - debug-pipeline-f4jvm-1048857848
      displayName: debug-pipeline-f4jvm
      finishedAt: "2019-09-11T19:03:09Z"
      id: debug-pipeline-f4jvm
      inputs:
        parameters:
        - name: secret
          value: user-gcp-sa
      name: debug-pipeline-f4jvm
      phase: Error
      startedAt: "2019-09-11T19:03:09Z"
      templateName: debug-pipeline
      type: DAG
    debug-pipeline-f4jvm-1048857848:
      boundaryID: debug-pipeline-f4jvm
      displayName: debug
      finishedAt: "2019-09-11T19:03:09Z"
      id: debug-pipeline-f4jvm-1048857848
      inputs:
        parameters:
        - name: secret
          value: user-gcp-sa
      message: volume 'gcp-credentials-user-gcp-sa' not found in workflow spec
      name: debug-pipeline-f4jvm.debug
      phase: Error
      startedAt: "2019-09-11T19:03:09Z"
      templateName: debug
      type: Pod
  phase: Error
  startedAt: "2019-09-11T19:03:09Z"
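The mismatch can be modeled in a few lines. This is a simplified sketch of the presumed behavior of the affected Argo version (an assumption, not Argo's actual code): the pod's volumeMount name has its parameters substituted, but the workflow-level spec.volumes list is searched with the template string still unresolved, so the lookup misses.

```python
def resolve(s, params):
    """Toy substitution of Argo-style {{inputs.parameters.X}} placeholders."""
    for k, v in params.items():
        s = s.replace('{{inputs.parameters.%s}}' % k, v)
    return s


params = {'secret': 'user-gcp-sa'}

# workflow-level spec.volumes, still holding the unresolved template string
spec_volumes = ['gcp-credentials-{{inputs.parameters.secret}}']

# the pod's volumeMount name, with parameters substituted
mount_name = resolve('gcp-credentials-{{inputs.parameters.secret}}', params)

print(mount_name)                  # gcp-credentials-user-gcp-sa
print(mount_name in spec_volumes)  # False -> "volume ... not found in workflow spec"

# Once the workflow-level volumes are resolved as well, the lookup succeeds:
print(mount_name in [resolve(v, params) for v in spec_volumes])  # True
```

This matches the observed error: the step looks for `gcp-credentials-user-gcp-sa`, while the workflow spec only contains the templated name.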

What did you expect to happen:
To be able to parameterize the name of the secret (containing GCP service-account credentials) that a pipeline or component runs as.

Anything else you would like to add:
Tested with Kubeflow 0.5 and kfp python SDK 0.1.24.

@mattnworb
Contributor Author

I also get the same results with the last variant

debug_task = debug_op().apply(gcp.use_gcp_secret(str(secret)))

with kfp SDK 0.1.29 and Kubeflow 0.6.2 (which I think has the pipeline components installed as 0.1.23, and argo 2.3.0).

@rmgogogo
Contributor

Hi Matt,

Here is a general-purpose sample, which I think you already know:
https://github.com/kubeflow/pipelines/blob/master/samples/core/secret/secret.py

May I know more about why you hand over gcp_secret from a PipelineParam?
Is it because you want to use a different secret for different pipelines, or for different components in the same pipeline? How many candidate secrets are there in your case?

@rmgogogo rmgogogo assigned rmgogogo and unassigned james-jwu Nov 18, 2019
@rmgogogo rmgogogo added kind/feature status/triaged Whether the issue has been explicitly triaged area/backend area/sdk labels Nov 18, 2019
@rmgogogo
Contributor

One more question: if you are targeting workload authorization isolation, would "Workload Identity" be better? You bind the pipeline/component to a KSA, which is bound to a GSA.

@Ark-kun
Contributor

Ark-kun commented Nov 18, 2019

I also get the same results with the last variant with kfp SDK 0.1.29

Please try the latest SDK. This issue in Argo was inadvertently fixed by #2229 in v0.1.32.

A small issue still remains: use_gcp_secret uses the secret name to build the volume name, which is fragile when the secret name is not constant.
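One way around that fragility is to derive the volume name from a constant rather than from the secret name, so only the secretName field carries the parameter. The helper below is a hypothetical sketch for illustration (the function name, constant volume name, and mount path are made up; this is not the kfp API):

```python
def gcp_secret_volume(secret_name, volume_name='gcp-credentials'):
    """Build a volume/volumeMount pair with a constant volume name, so a
    parameterized secret name never leaks into the volume name."""
    volume = {'name': volume_name, 'secret': {'secretName': str(secret_name)}}
    mount = {'name': volume_name, 'mountPath': '/secret/gcp-credentials'}
    return volume, mount


vol, mnt = gcp_secret_volume('{{workflow.parameters.secret}}')
print(vol['name'])                  # gcp-credentials (constant)
print(vol['secret']['secretName'])  # {{workflow.parameters.secret}}
print(mnt['name'] == vol['name'])   # True: the mount always finds its volume
```

With this shape, Argo only has to substitute the parameter inside secretName; the volumeMount-to-volume name matching stays a plain string comparison.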

@mattnworb
Contributor Author

@rmgogogo

May I know more about why you hand over gcp_secret from a PipelineParam?
Is it because you want to use a different secret for different pipelines, or for different components in the same pipeline?

The former - to run the same pipeline code with different configuration (secret, output paths, etc).

@Ark-kun
Contributor

Ark-kun commented Dec 2, 2019

The former - to run the same pipeline code with different configuration (secret, output paths, etc).

I see. We're trying to make the use of secrets unnecessary as we move towards Workload Identity support, where access rights are determined by the permissions of the service account used for the pipeline run.

@stale

stale bot commented Jun 25, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the lifecycle/stale The issue / pull request is stale, any activities remove this label. label Jun 25, 2020
@stale

stale bot commented Jul 2, 2020

This issue has been automatically closed because it has not had recent activity. Please comment "/reopen" to reopen it.

@stale stale bot closed this as completed Jul 2, 2020
@Ark-kun Ark-kun removed the lifecycle/stale The issue / pull request is stale, any activities remove this label. label Jul 2, 2020