TaskRun fails during initialization when disable-home-env-overwrite=true #2165
Comments
I tested this after defining |
Note that this error is from the But then, we will have to answer this question: what if I define multiple steps whose images have different user home directories (say, the home of image A in the first step is |
I think relying on creds-init might actually be the wrong approach here. You're right that multiple home directories in multiple steps make things complicated. You might consider passing a To me, it seems like creds-init is simply not capable of injecting credentials in a reliable way for this use case and I'm not sure that copying these creds everywhere that they might be used is the best way to go. I'm still thinking about possible solutions but something about the way it works currently doesn't feel right to me. |
Thanks. I'll look into mounting a Secret, but I'm not sure this will work nicely for me as a Tekton catalog writer (jib-gradle and jib-maven). It sounds like my Task would add a special contract, applicable only to my catalog entries, for providing Docker credentials. I'd prefer a general, documented solution at the Tekton level, but I'll look into mounting a Secret anyway. In any case, this error is blocking me from testing |
BTW, isn't this the current behavior? The Docker credentials are exposed to every step at |
I meant more the copying of creds to every possible HOME directory that steps of a task might use. Having a single root-owned location seems different to me than having 10 copies with random owners. Maybe it's a distinction that doesn't matter after all, I dunno. To me this feels like there should be an explicit "opt-in" - Step A declares that it will use creds X so we do copy X into that step's |
Yeah that's a fair point and I understand not wanting to take this path if this isn't an approach that everyone uses. I'm wondering whether this should become a recommendation for catalog authors though - to expose (optional) workspaces for credentials to be mounted into. If everyone was doing it then it might not be bad? |
I can do this (as I commented in #2119 (comment)), and I will do this if this is the only option. However, I'd like to raise a couple of points for the sake of discussion. In my case, the Docker config file is just one of many ways to get credentials for remote Docker registries. Taking the optional workspaces approach means having to declare 5+ optional workspaces in my Task. The Task needs to clone a git repo, pull from/push to remote Docker registries over HTTPS, and use docker credential (of type Now, when you flip How about this: the current behavior is to create these credential files under |
Also note, I think copying a file into a "home" directory needs more thought as I explained in #2013 (comment). For example, OpenShift runs containers as a random UID like |
I think "clutter" is a mis-characterization. I view this as a Task Author explicitly declaring support for, and dependence on, credentials. It becomes obvious what the Task accepts and expects. With So in other words I feel like the burden should be on catalog Task authors to explicitly declare what creds they support in the Tasks they write. In my view this shouldn't be something that happens quietly behind the scenes. There may be much better ways for Tekton to support authors doing this but I definitely think explicit is better than implicit here. So having written all this, creds-init as it stands today should 100% work as it's documented to. I'm still debugging the issue with the HOME var mentioned at the top of this post and will update as I figure out what's up or if I have more questions. |
Thanks. If Tekton recommends this approach, I am all for it. I hope the optional workspace feature is implemented soon. For now, I think I'm blocked on the issue to implement this approach.
Anyways, I'm still super curious how exactly it will work when |
I've gone with the approach of placing the creds in a fixed location ( |
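For readers following along, here is a minimal sketch of the "fixed location, then copy per Step" idea, based only on what the log lines later in this thread show (entries such as .docker, .gitconfig, .git-credentials and .ssh being copied from /tekton/creds into the Step's HOME). The function name copy_creds and the message wording are hypothetical, and this is not Tekton's actual implementation. It assumes the copy is best-effort: a failed entry is reported and the Step carries on.

```shell
# copy_creds SRC DEST
# Best-effort copy of well-known credential entries from SRC into DEST.
# Entries that are missing in SRC are skipped; entries that cannot be
# copied are reported on stderr, and the function still returns 0.
copy_creds() {
  local src="$1" dest="$2"
  local name
  if [ ! -d "$dest" ]; then
    echo "destination \"$dest\" is not a directory" >&2
    return 1
  fi
  for name in .docker .gitconfig .git-credentials .ssh; do
    # Skip credential types that were never written to the source.
    [ -e "$src/$name" ] || continue
    if ! cp -R "$src/$name" "$dest/" 2>/dev/null; then
      echo "unsuccessful cred copy: \"$name\" from \"$src\" to \"$dest\"" >&2
    fi
  done
  return 0
}
```

In a Step this could be called as copy_creds /tekton/creds "$HOME", assuming HOME points at a directory the Step's user can write to.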
Design doc for this problem to be discussed in WG on wednesday: https://docs.google.com/document/d/1SVuDt-SXPHymz41dveSXFSPrx5Z-Wzb9eHliJAyYg4o |
Just to reiterate from the Pull Request that closed this Issue:
@chanseokoh once v0.11.0-rc3 is released this fix will be available to try out. Very keen to hear your feedback / experience with the changes! |
Pipelines Beta RC3 has been released and includes this fix https://github.com/tektoncd/pipeline/releases/tag/v0.11.0-rc3 |
I'm testing this now and also the case where I run a pipeline as a non-root while forcing
env:
- name: HOME
  value: /workspace
volumeMounts:
- name: $(inputs.params.CACHE)
  mountPath: /workspace/.gradle/caches
  subPath: gradle-caches
- name: $(inputs.params.CACHE)
  mountPath: /workspace/.gradle/wrapper
  subPath: gradle-wrapper
And
What's the right solution to this? (Assume that the k8s runtime will assign an arbitrary random user like in OpenShift.) |
FYI, I was setting Now, if I set the following in a TaskRun spec,
securityContext:
  runAsUser: 1111
  runAsGroup: 2222
  fsGroup: 3333
I get the following errors when
BTW, without the
|
Sorry, I need to correct this. It is the "non-existing parent directories of the volumes". For example, I am mounting |
Volume mounts can be nested, so /workspace/.gradle can be a volumeMount even while /workspace/.gradle/cache and /workspace/.gradle/wrapper are as well. This results in all three directories being world-writable. For the "unsuccessful cred copy" messages, can you confirm whether $HOME was set in your Step's env? If so I'm baffled how the cred copy didn't figure out what HOME was and why it decided to write to The |
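To make the "parent directories of the volumes" point concrete: when a volume is mounted at /workspace/.gradle/caches, every directory between /workspace and the mount point has to exist, and the ones that are not mounts themselves are the ones whose ownership is in question. The sketch below only lists those directories so a Step script can inspect them. The helper name ancestors_between is hypothetical, and it assumes the second argument lies underneath the first.

```shell
# ancestors_between BASE PATH
# Print every directory from just below BASE down to PATH, one per line.
# Assumes PATH is located under BASE.
ancestors_between() {
  local base="${1%/}" path="$2"
  local rest="${path#"$base"/}"
  local current="$base"
  local part
  local IFS=/
  # With IFS set to "/", the unquoted expansion splits on path separators.
  for part in $rest; do
    current="$current/$part"
    printf '%s\n' "$current"
  done
}
```

Inside a Step, each printed directory could then be checked with something like [ -w "$dir" ] or ls -ld "$dir" to see which level is not writable by the current user.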
The problem is that I don't want to cache
This error just prevents my task from running at all, so I think there's even no opportunity of using |
And sorry for asking a bit unrelated question: my task gets a git project source at |
Let's take this one complete example at a time. It's difficult to know where to start otherwise. I've set both Here's the complete TaskRun and TaskSpec which I'm running. It uses two securityContexts, one in the TaskRun's
apiVersion: tekton.dev/v1beta1
kind: TaskRun
metadata:
  generateName: test-creds-
spec:
  podTemplate:
    securityContext:
      runAsUser: 1111
      runAsGroup: 2222
      fsGroup: 3333
  taskSpec:
    steps:
    - name: check-dirs
      image: ubuntu
      env:
      - name: HOME
        value: /workspace
      script: |
        #!/usr/bin/env bash
        set -xe
        id
        ls -lahR $(credentials.path)
        echo ~
        echo $HOME
        ls -lahR $HOME
      securityContext:
        runAsUser: 1234
Here're the relevant log lines I see in the when I run this TaskRun:
[check-dirs] 2020/04/29 19:41:55 unsuccessful cred copy: ".docker" from "/tekton/creds" to "/workspace": unable to open source: open /tekton/creds/.docker/config.json: permission denied
[check-dirs] 2020/04/29 19:41:55 unsuccessful cred copy: ".gitconfig" from "/tekton/creds" to "/workspace": unable to open source: open /tekton/creds/.gitconfig: permission denied
[check-dirs] 2020/04/29 19:41:55 unsuccessful cred copy: ".git-credentials" from "/tekton/creds" to "/workspace": unable to open source: open /tekton/creds/.git-credentials: permission denied
[check-dirs] 2020/04/29 19:41:55 unsuccessful cred copy: ".ssh" from "/tekton/creds" to "/workspace": unable to open source: open /tekton/creds/.ssh/known_hosts: permission denied
The only one of these that I'm actually interested in (because my ServiceAccount / Secret specify it) is this line:
[check-dirs] 2020/04/29 19:41:55 unsuccessful cred copy: ".ssh" from "/tekton/creds" to "/workspace": unable to open source: open /tekton/creds/.ssh/known_hosts: permission denied
So what's happening here? Looking at the directory structure of
[check-dirs] /tekton/creds:
[check-dirs] total 8.0K
[check-dirs] drwxrwsrwt 4 root 3333 120 Apr 29 19:41 .
[check-dirs] drwxr-xr-x 8 root root 4.0K Apr 29 19:41 ..
[check-dirs] drwxr-sr-x 2 1111 3333 60 Apr 29 19:41 .docker
[check-dirs] -rw------- 1 1111 3333 0 Apr 29 19:41 .git-credentials
[check-dirs] -rw------- 1 1111 3333 29 Apr 29 19:41 .gitconfig
[check-dirs] drwxr-sr-x 2 1111 3333 100 Apr 29 19:41 .ssh
I'm only interested in the Looking at the directory structure of
[check-dirs] /tekton/creds/.ssh:
[check-dirs] total 12K
[check-dirs] drwxr-sr-x 2 1111 3333 100 Apr 29 19:41 .
[check-dirs] drwxrwsrwt 4 root 3333 120 Apr 29 19:41 ..
[check-dirs] -rw------- 1 1111 3333 110 Apr 29 19:41 config
[check-dirs] -rw------- 1 1111 3333 23 Apr 29 19:41 id_fake-ssh-directory
[check-dirs] -rw------- 1 1111 3333 28 Apr 29 19:41 known_hosts
Hm, OK so this explains why the error message
So I go check the docs. Reading there it sounds like written files will be owned by the user specified in the securityContext's
Right, we've got a pretty good idea of what's happening now. What's a good solution? At the moment our creds-init helper writes files with permission
Right now I'm thinking maybe we should drop creds-init completely. Every Step could get the Secrets as volume mounts and do the same work that creds-init does but in the context of their own isolated securityContext, writing to their own
Also, if possible, could you post a complete Task + TaskRun with as few Steps / other stuff in it as possible, which reproduces the issue you're seeing. It'll be much easier to provide help if we have a single complete example to work from. |
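The listing above can be checked by hand against the usual Unix permission rules: the credential files are mode -rw------- (0600), owned by UID 1111 with group 3333, while the Step runs as UID 1234 (presumably with GID 2222 from runAsGroup and 3333 from fsGroup as a supplementary group). The sketch below models that check so the numbers from this thread can be plugged in. The helper name can_read is hypothetical, and it deliberately ignores root's ability to bypass permission bits as well as ACLs.

```shell
# can_read MODE OWNER_UID OWNER_GID UID [GID...]
# Returns 0 if a process with the given UID and GIDs may read a file with
# the given octal MODE and ownership, following the classic rule that
# exactly one class (owner, then group, then other) is consulted.
# Simplification: uid 0 and ACLs are not treated specially.
can_read() {
  local mode="$1" owner="$2" group="$3" uid="$4"
  local perms gid
  shift 4
  # Prefix a 0 so the mode is parsed as an octal constant.
  perms=$(( 0$mode ))
  if [ "$uid" -eq "$owner" ]; then
    [ $(( perms & 0400 )) -ne 0 ]
    return
  fi
  for gid in "$@"; do
    if [ "$gid" -eq "$group" ]; then
      [ $(( perms & 0040 )) -ne 0 ]
      return
    fi
  done
  [ $(( perms & 0004 )) -ne 0 ]
}
```

With the values from the listing, can_read 0600 1111 3333 1234 2222 3333 fails: UID 1234 is not the owner, and although it shares group 3333 the group bits are empty, which matches the "permission denied" lines. The same call with mode 0640 would succeed.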
Thanks a lot for looking into this! I couldn't agree more on what you said. And I am also not sure what the best solution should be. I get the exact same result of yours with your Task + TaskRun. So I compared the difference with my Task and was able to come up with a sample that reproduces my error. Hope this helps.
 apiVersion: tekton.dev/v1beta1
 kind: TaskRun
 metadata:
-  generateName: test-creds-
+  name: test-creds
 spec:
+  serviceAccountName: registry-admin
   podTemplate:
     securityContext:
       runAsUser: 1111
       runAsGroup: 2222
       fsGroup: 3333
+  resources:
+    inputs:
+    - name: source
+      resourceSpec:
+        type: git
+        params:
+        - name: url
+          value: https://github.com/che-samples/console-java-simple
   taskSpec:
+    resources:
+      inputs:
+      - name: source
+        type: git
     steps:
     - name: check-dirs
       image: ubuntu
The pipeline halts with this error:
What's interesting is that, as I said, if I comment out
# podTemplate:
#   securityContext:
#     runAsUser: 1111
#     runAsGroup: 2222
#     fsGroup: 3333
UPDATE: Interestingly, if I remove
|
Excellent, I've been able to reproduce the problem exactly.
Ultimately the errors with these two Steps are happening because PipelineResources don't have a HOME set and they're trying to write to So summarizing the various problems that have been discovered here:
And the likely solutions seem to me:
Ideally the I'll create issues for each of these problems and then start working on fixes for both. |
Awesome! I can see what's going on.
One thing that still seems strange to me: as I said, when not setting
(If you also remove So, without |
One small nit here: we don't set the This is all immeasurably confusing. I'm going to try to illustrate the different scenarios here: First scenario: disable-home-override: "true" Order of operations:
Second scenario: disable-home-override: "true" Order of operations:
Third scenario: disable-home-override: "true" Order of operations:
Here This is a really confusing dance and there's quite a bit of work to do to get all of Tekton's movements in lock-step. |
@chanseokoh I think this would be worth checking in on again now that 0.14 has been released. Some of the warnings may remain in the logs (e.g. There's still some trickiness with the meaning of "$HOME" though. If a container is run with a randomized UID then that user isn't going to have a $HOME directory, since they won't have an entry in |
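As a small illustration of the randomized-UID point: a user's home directory is normally recorded in a passwd-format database, so a UID with no entry there has no home directory to look up. The sketch below does that lookup against a passwd-format file passed as an argument (so it can be tried on a copy rather than the real file). The helper name lookup_home is hypothetical.

```shell
# lookup_home PASSWD_FILE UID
# Print the home directory recorded for UID in a passwd-format file.
# Exits non-zero and prints nothing when the UID has no entry.
lookup_home() {
  local passwd_file="$1" uid="$2"
  # Field 3 is the numeric UID, field 6 is the home directory.
  awk -F: -v uid="$uid" '
    $3 == uid { print $6; found = 1; exit }
    END { exit !found }
  ' "$passwd_file"
}
```

In a Step, lookup_home /etc/passwd "$(id -u)" would show whether the image defines a home for the UID the container was actually started with. For an arbitrary UID such as 1111 on a stock image, it would most likely print nothing.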
@sbwsg thanks for letting me know. I've tested 0.14 against GKE with However, I've noticed a recent change in the jib-gradle catalog task that replaced git input resource with a "workspace" source, which forced me to try
full log:
|
I also wonder if I have to use |
This is an example of where
Yeah, right now PipelineResources are getting a heavy amount of attention / redesign so pipelines + workspaces are preferred for this kind of "fetch-before-use" behaviour. There's nothing stopping you using a jib-gradle Task with PipelineResources but I guess @vdemeester thought it better to put the beta version of the catalog more in line with the beta APIs that Tekton exposes (which doesn't include PipelineResources - they're still alpha). |
Thanks for the explanation. The git behavior seems fair. I guess the UID 1111 example isn't something we will see in practice and probably the image should have a home for it. I think then this is basically working in my case. One last question: Gradle auto-creates |
Interesting that Regardless, I think mounting the writable emptyDir to |
Hello, "image-digest-exporter-d6kxs] 2020/08/21 08:55:40 unsuccessful cred copy: ".docker" from "/tekton/creds" to "/tekton/home": unable to open destination: open /tekton/home/.docker/config.json: permission denied" $ tkn version |
Same problem as @vyom-soft tkn version |
It would be great to see the Task ( Please make sure to sanitize / remove any data that is sensitive before posting them here. |
This is closely related to the on-going Tekton $HOME issue (#2013 (comment)). I am testing disable-home-env-overwrite before it gets flipped.
This comment says
I don't think this is the case. I am testing gcr.io/cloud-builders/gradle, but Tekton fails as it tries to create a directory /.docker.
Note the "permission denied" error is not the issue here. The issue is that it is /.docker instead of /root/.docker.
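One plausible way to end up with /.docker rather than /root/.docker is a home value that is empty or "/" at the moment the path is built. Whether that is what happens inside Tekton is not confirmed by this report, so treat the sketch below as an assumption. The helper name docker_config_dir is hypothetical and only shows how the joined path changes with the home value.

```shell
# docker_config_dir HOME_VALUE
# Join a home directory value with ".docker". A trailing slash is removed
# first, so both an empty value and "/" yield "/.docker".
docker_config_dir() {
  printf '%s/.docker\n' "${1%/}"
}
```

For example, docker_config_dir /root gives /root/.docker, while docker_config_dir "" and docker_config_dir / both give /.docker.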