Not using secret when pulling from private Registry #695

Closed
rummens opened this issue Jan 17, 2019 · 14 comments
@rummens

rummens commented Jan 17, 2019

Hello,

I am trying out the new Pipelines features, but I cannot get a pipeline to pull containers from a private (Azure) registry. Pulling the images locally works. I created a secret using these instructions: https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/

I also added the secret to the pipeline YAML (is there also a way to do this with the compiler?):

- container:
      args:
      - --project
      - '{{inputs.parameters.project}}'
      - --mode
      - cloud
      - --bucket
      - '{{inputs.parameters.bucket}}'
      - --start_year
      - '{{inputs.parameters.startyear}}'
      image: kubtestregistry.azurecr.io/babyweight:v1
      imagePullSecrets:
      - name: regcred
    inputs:
      parameters:
      - name: bucket
      - name: project
      - name: startyear

The pipeline still fails with an authentication error. Here are the last events of the corresponding pod:

Events:
  Type     Reason     Age                From                               Message
  ----     ------     ----               ----                               -------
  Normal   Scheduled  77s                default-scheduler                  Successfully assigned kubeflow/babyweight-2lnns-1712976165 to aks-agentpool-34799199-0
  Normal   Pulling    74s                kubelet, aks-agentpool-34799199-0  pulling image "argoproj/argoexec:v2.2.0"
  Normal   Pulled     65s                kubelet, aks-agentpool-34799199-0  Successfully pulled image "argoproj/argoexec:v2.2.0"
  Normal   Started    60s                kubelet, aks-agentpool-34799199-0  Started container
  Normal   Created    60s                kubelet, aks-agentpool-34799199-0  Created container
  Warning  Failed     31s (x3 over 75s)  kubelet, aks-agentpool-34799199-0  Failed to pull image "kubtestregistry.azurecr.io/babyweight:v1": rpc error: code = Unknown desc = Error response from daemon: Get https://kubtestregistry.azurecr.io/v2/babyweight/manifests/v1: unauthorized: authentication required
  Warning  Failed     31s (x3 over 75s)  kubelet, aks-agentpool-34799199-0  Error: ErrImagePull
  Normal   Pulling    31s (x3 over 76s)  kubelet, aks-agentpool-34799199-0  pulling image "kubtestregistry.azurecr.io/babyweight:v1"
  Normal   BackOff    7s (x4 over 58s)   kubelet, aks-agentpool-34799199-0  Back-off pulling image "kubtestregistry.azurecr.io/babyweight:v1"
  Warning  Failed     7s (x4 over 58s)   kubelet, aks-agentpool-34799199-0  Error: ImagePullBackOff

Thanks very much.
Marcel

@gaoning777
Contributor

Follow these steps to grant the permissions:
1.) Add a Kubernetes secret to the GKE cluster.
2.) In the DSL code, add .apply(gcp.use_gcp_secret(SECRET_NAME)) to the component that requires the permission.
3.) Recompile and submit.
Did you do step two and update the DSL code?

@rummens
Author

rummens commented Jan 18, 2019

I am not on GKE; everything is on Azure (company policy). I added the secret to the cluster:

kubectl get secrets -n kubeflow
NAME                                        TYPE                                  DATA   AGE
[...]
regcred                                     kubernetes.io/dockerconfigjson        1      21h
[...]

I cannot use the gcp tool, since I am on Azure, right?

Also, where do I add the .apply() call: in the decorator, on the ContainerOp, or during compilation? This is my current "test" pipeline component:

import kfp.dsl as dsl


class ObjectDict(dict):
    def __getattr__(self, name):
        if name in self:
            return self[name]
        else:
            raise AttributeError("No such attribute: " + name)


@dsl.pipeline(
    name='babyweight',
    description='Train Babyweight model'
)
def train_and_deploy(
        project='cloud-training-demos',
        bucket='cloud-training-demos-ml',
        startYear='2000'
):
    """Pipeline to train babyweight model"""
    start_step = 1

    # Step 1: create training dataset using Apache Beam on Cloud Dataflow
    preprocess = dsl.ContainerOp(
        name='preprocess',
        # image needs to be a compile-time string
        image='kubtestregistry.azurecr.io/babyweight:v1',
        arguments=[
            '--project', project,
            '--mode', 'cloud',
            '--bucket', bucket,
            '--start_year', startYear
        ],
        file_outputs={'bucket': '/output.txt'}
    )


if __name__ == '__main__':
    import kfp.compiler as compiler

    filename = "babyweight_test.tar.gz"
    compiler.Compiler().compile(train_and_deploy, filename)

I added the name of the secret directly to the YAML after compiling, repackaged the tar.gz, and uploaded it again, but I still receive an authentication error. I am not on the master branch but on release 4.0; could that be an issue?

Thanks
Kind Regards
Marcel

@gaoning777
Contributor

gaoning777 commented Jan 18, 2019

My mistake about use_gcp_secret(): it grants permissions for gcloud commands inside the container.
Did you follow the YAML spec in this example?
Specifically:

imagePullSecrets:
- name: regcred

@gaoning777 gaoning777 self-assigned this Jan 18, 2019
@hongye-sun
Contributor

Marcel, kfp pulls images the same way Kubernetes does. Could you try creating a pod in your cluster that uses imagePullSecrets directly? That would confirm that the secret is configured correctly.
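
One way to run this check is to generate a throwaway Pod manifest that references the pull secret and apply it with kubectl. The sketch below assumes the secret name regcred, the kubeflow namespace, and the image from this thread; the pod name pull-test and the helper function are made up for illustration.

```python
import json


def build_pull_test_pod(image, secret_name, namespace="kubeflow", pod_name="pull-test"):
    """Return a minimal Pod manifest that pulls `image` using `secret_name`.

    pod_name is a hypothetical name chosen for this example.
    """
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": pod_name, "namespace": namespace},
        "spec": {
            # imagePullSecrets is a field of the Pod spec, not of a container entry.
            "imagePullSecrets": [{"name": secret_name}],
            "containers": [{"name": "main", "image": image}],
            "restartPolicy": "Never",
        },
    }


manifest = build_pull_test_pod("kubtestregistry.azurecr.io/babyweight:v1", "regcred")
# kubectl accepts JSON as well as YAML: save this output as pod.json,
# then run `kubectl apply -f pod.json`.
print(json.dumps(manifest, indent=2))
```

If the pod's events show the image being pulled successfully instead of ErrImagePull, the secret itself is fine and the problem is more likely where the workflow references it.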

@rummens
Author

rummens commented Jan 19, 2019

I added imagePullSecrets under the container entry, not at the workflow level. I will test on Monday and report back.

Also, is there (or will there be) a way to specify this in the pipeline code for non-GKE users?

Thanks

@rummens
Author

rummens commented Jan 21, 2019

@gaoning777 Thank you, this works. I would add it to the official documentation on the Kubeflow website.

Also, is there a way to do this in Python code, as you suggested earlier, but on a non-GKE cluster? Right now I have to compile the pipeline, extract the archive, change the YAML, and build the archive again. That's a bit of overhead :-)
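
Until the SDK supports this directly, the extract/edit/repack round trip could be scripted. This is only a rough sketch using the standard library: it assumes the compiled archive contains a member named pipeline.yaml, that the workflow has a top-level `spec:` line, and that it has no workflow-level imagePullSecrets yet. Both function names are hypothetical.

```python
import io
import tarfile


def add_pull_secret_to_yaml(workflow_yaml, secret_name):
    """Insert a workflow-level imagePullSecrets entry after the top-level `spec:` line."""
    out = []
    inserted = False
    for line in workflow_yaml.splitlines():
        out.append(line)
        # Only an unindented `spec:` is the workflow spec; nested ones are skipped.
        if not inserted and line.rstrip() == "spec:":
            out.append("  imagePullSecrets:")
            out.append("  - name: " + secret_name)
            inserted = True
    if not inserted:
        raise ValueError("no top-level 'spec:' line found")
    return "\n".join(out) + "\n"


def patch_pipeline_archive(src_path, dst_path, secret_name, member_name="pipeline.yaml"):
    """Copy the archive at src_path to dst_path, patching the workflow YAML on the way.

    member_name is an assumption about how the compiler names the file in the archive.
    src_path and dst_path must be different files.
    """
    with tarfile.open(src_path, "r:gz") as src, tarfile.open(dst_path, "w:gz") as dst:
        for member in src.getmembers():
            fileobj = src.extractfile(member)
            if fileobj is None:
                # Directories and other non-regular members are copied as-is.
                dst.addfile(member)
                continue
            data = fileobj.read()
            if member.name == member_name:
                text = add_pull_secret_to_yaml(data.decode("utf-8"), secret_name)
                data = text.encode("utf-8")
                member.size = len(data)
            dst.addfile(member, io.BytesIO(data))
```

For the pipeline in this thread, the call would look like patch_pipeline_archive("babyweight_test.tar.gz", "babyweight_patched.tar.gz", "regcred"), where the output file name is arbitrary.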

Thanks

@gaoning777
Contributor

We are trying to offer a product that is Kubernetes-native (platform independent).
The plan is to incrementally make it friendlier on other platforms, for example with the imagePullSecrets support you requested. Because many other features are occupying our engineering time, we will add these fixes and features according to priority. Contributions from the community are most welcome, though.
Thanks

@rummens
Author

rummens commented Jan 23, 2019

I had a quick look into everything and came up with the following ideas:

For all pipelines

The easiest way to make a secret available to a pod is to add it to the corresponding service account (in our case pipeline-runner), as can be seen here: https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/#add-imagepullsecrets-to-a-service-account
The command would be like this:

kubectl patch serviceaccount pipeline-runner -p '{"imagePullSecrets": [{"name": "name_of_my_secret"}]}'

However, this allows every pipeline (every pod created by pipeline-runner) in the cluster to use the secret, so it is a slight security concern, isn't it?
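
If the patch has to be applied from a script, the same kubectl call can be assembled in Python. A small sketch: the secret name regcred and the kubeflow namespace are taken from earlier in this thread and are assumptions about the target cluster, and the helper name is made up. It only builds and prints the command; running it is left to the caller.

```python
import json
import shlex


def build_sa_patch_command(secret_name, service_account="pipeline-runner", namespace="kubeflow"):
    """Build the kubectl argument list that attaches a pull secret to a service account."""
    patch = {"imagePullSecrets": [{"name": secret_name}]}
    return [
        "kubectl", "patch", "serviceaccount", service_account,
        "-n", namespace,
        "-p", json.dumps(patch),
    ]


cmd = build_sa_patch_command("regcred")
# Print a shell-safe version of the command for review.
print(" ".join(shlex.quote(arg) for arg in cmd))
# To execute it: subprocess.run(cmd, check=True)
```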

For all components of one pipeline

For this, compiler.py in the Pipelines SDK has to be changed. The user passes the name of the secret to the Compiler, which adds it to the base workflow YAML in the function _create_pipeline_workflow (line 465 of compiler.py).
Happy to do it if someone explains the process for contributing changes to this codebase.

However, this exposes the secret to all components in the pipeline. I think it is a good start, but what if someone wants to use a different private registry for each component? That would not work with this approach, and as far as I am aware Argo does not support it (yet). We should also discuss whether it is necessary at all.
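
To make the idea concrete, here is a minimal sketch of the step such a compiler change would need: adding the secret names to the spec of the workflow dictionary before it is serialized. It is not the actual compiler.py code; the function name and the trimmed-down workflow dict are illustrative assumptions.

```python
import copy


def with_image_pull_secrets(workflow, secret_names):
    """Return a copy of an Argo workflow dict with workflow-level imagePullSecrets added."""
    patched = copy.deepcopy(workflow)
    spec = patched.setdefault("spec", {})
    existing = spec.setdefault("imagePullSecrets", [])
    known = {entry["name"] for entry in existing}
    for name in secret_names:
        # Skip names that are already present so repeated calls are harmless.
        if name not in known:
            existing.append({"name": name})
            known.add(name)
    return patched


# A stripped-down stand-in for the workflow the compiler would produce.
workflow = {
    "apiVersion": "argoproj.io/v1alpha1",
    "kind": "Workflow",
    "metadata": {"generateName": "babyweight-"},
    "spec": {"entrypoint": "babyweight", "templates": []},
}
patched = with_image_pull_secrets(workflow, ["regcred"])
print(patched["spec"]["imagePullSecrets"])  # [{'name': 'regcred'}]
```

Because the entry sits on the workflow spec, every template in the pipeline shares it.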

For a single component of one pipeline

As mentioned above, is this even necessary? If so, the secret has to be added to each container entry of a template and, as far as my testing goes, that doesn't work :-/

@gaoning777
Contributor

Argo only supports workflow-level imagePullSecrets, which makes it impossible for the pipeline to support component-level imagePullSecrets.

@gaoning777
Contributor

Supporting pipeline-level imagePullSecrets in #745.

@rummens
Author

rummens commented Jan 29, 2019

Awesome, thanks. I think a pipeline-level secret is enough for most use cases. I will test once the PR is accepted.

@rummens rummens closed this as completed Jan 29, 2019
@gaoning777
Contributor

PR merged. The feature will be included in the next release.
Thanks

@otaviocv

I am having the same problem, but I can't figure out from either the examples or the docs where I can set pipeline configs. Could you provide an example?

@rummens
Author

rummens commented Jun 15, 2019

The easiest solution is to assign the secret to the pipeline runner service account.
