Volumes, Tasks and PipelineResources overlap #1272
Comments
One of the benefits of |
The above approach does everything with volumes. I imagine the Since execution is a DAG, the path of the volume is also known. Of course, mounting would be optional since not all tasks need to share their state. The only weird situation seems to be when the DAG fans out and multiple This approach is less structured than
Couple thoughts from my pov. I'm coming at this from a somewhat selfish point of view - I've written and run CI pipelines for a few different projects and so my reaction is based on some bad experiences with existing tools rather than necessarily from the perspective of someone writing a brand new platform on top of Tekton. I can see value in the idea of having easy generic storage passed between Tasks but I think there are considerable benefits to Tekton's declarative resources too. Anyway, here are my immediate thoughts:
I think this is a really strong point and the main concern with this sort of approach. I wonder if the pipeline step/process had already been going for a long time, would the failure at this particular section be intrinsic to the volume? Maybe in the same way, a Somewhat similar, if a contract needed to be fulfilled between Originally, I put some consideration to using
I was just looking to start a discussion to consider this approach since there are many issues and conversations about improving
To the point about who benefits from this:
Pipeline Distributor
I think broadly speaking it makes the catalog more well defined since you don't need to create
Platform Builder
Instead of having to fork the code base to add their own
Tekton Developer
Instead of having to deal with new
I think other roles also get indirect benefits from the above.
I haven't looked https://github.com/tektoncd/pipeline/pull/1184/files over extensively, but I suppose my thinking is that this is not just a particular |
I wonder if #1285 alleviates some of the same pain. A FileSet is essentially an "untyped" group of files to be output from one Task and input to another, but a benefit I see there is that the paths of those files are declared as part of the resource.
To the same point of |
@vtereso I'm having trouble imagining a world without PipelineResources. Can you give me an example pipeline (perhaps a version of this canonical example pipeline) that doesn't define any PipelineResources? How would a Task know what a previous Task produced, and how would the overall PipelineRun report on the new state of things updated during its run? |
There are many different ways this could be accomplished, but it goes without saying this isn't entirely without some compromise as mentioned previously by @sbwsg:
I had a demo on last week's community call where I showed a rough idea of using the
The same volume mount was used throughout. This is less than ideal since the claim is referenced within the Going more with the implicit flow, perhaps some sort of dry run option/UI could be provided to explicitly model out what will happen for a particular run. |
Hey @vtereso ! Thanks for starting this discussion and helping us to critically examine PipelineResources :D As you pointed out, our docs on PipelineResources focus completely on how they currently work and not really on why we think they're an important abstraction. I like @dlorenc 's definition from #1076 a lot more than the one in our docs right now:
We can take this further by defining what input and output resources mean:
The way we use
Actually, everything that @vtereso is describing is possible today - folks could use
So I think you're saying @vtereso that if you look for example at the golang build Task in the catalog and you see:

```yaml
resources:
- name: source
  type: git
  targetPath: src/${inputs.params.package}
```

The user still has to understand how to instantiate the
The biggest con that I see is that in the above example, the section that pulls from git will need to be repeated in every

```yaml
- name: pull-request-pull
  image: github.com/tektoncd/pipeline/cmd/pullrequest-init
  volumeMounts:
  - mountPath: /workspace2
    name: clone-dir
  command:
  - /ko-app/pullrequest-init
  env:
  - name: GITHUBTOKEN
    valueFrom:
      secretKeyRef:
        name: github-secret
        key: tekton_volumes_token.txt
  args:
  - -url
  - ${inputs.params.pullRequestUrl}
  - -path
  - /workspace2
  - -mode
  - download
```

With a
@bobcatfish That was a good summary.
What do you mean by this? My rationale behind I don't think that volumes can be passed into Although it does not always apply, I generally like the idea of having one robust way to do things rather than multiple to achieve different use cases. If it is actually possible to do this already using Tekton in the same fully reusable form, I would find it strange that users probably don't know about this. I think your point about
seems very close to describing volumes and the like to me. If something like the |
@vtereso do you envision

```yaml
- name: Set up Go 1.13
  uses: actions/setup-go@v1
  with:
    go-version: 1.13
- name: Check out source code
  uses: actions/checkout@v1
```

On the "volume handling", currently you can do the following with

```yaml
apiVersion: tekton.dev/v1alpha1
kind: Task
metadata:
  name: task-volume
spec:
  steps:
  - name: write
    image: ubuntu
    command: ["/bin/bash"]
    # write into the mount path declared in stepTemplate so the read step can see the file
    args: ["-c", "echo some stuff > /short/and/stout/file"]
  - name: read
    image: ubuntu
    command: ["/bin/bash"]
    args: ["-c", "cat /short/and/stout/file"]
  # every step gets the "custom" volume mounted at the same path
  stepTemplate:
    volumeMounts:
    - name: custom
      mountPath: /short/and/stout
---
apiVersion: tekton.dev/v1alpha1
kind: TaskRun
metadata:
  name: test-template-volume
spec:
  taskRef:
    name: task-volume
  # the Run decides what kind of volume backs "custom"
  podTemplate:
    volumes:
    - name: custom
      emptyDir: {}
```

That allows you to decouple the mount point from the volume type. You could have all your tasks referencing a volume that is "attached"/"declared" in your
One thing we do not yet tackle (and that we are not going to tackle with the current "Pipeline Resource Extensibility" design) is more generic input resource types – e.g. a
I generally agree with "having one robust way to do things rather than multiple to achieve different use cases". But right now, doing things without
This would mean:

```yaml
apiVersion: tekton.dev/v1alpha1
kind: Step
metadata:
  name: git-clone-with-params
image: ubuntu
command: ["/bin/bash"]
args: ["-c", "git clone $(inputs.params.repository) /workspace/src"] # the dest. could be a param too
---
apiVersion: tekton.dev/v1alpha1
kind: Task
metadata:
  name: task-foo-bar
spec:
  inputs:
    params:
    - name: repository
      value: https://github.com/tektoncd/pipeline.git
  steps:
  - name: foo
    uses: git-clone-with-params
  - name: read
    image: ubuntu
    command: ["/bin/bash"]
    args: ["-c", "cat /foo/bar/README.md"]
  stepTemplate:
    volumeMounts:
    - name: custom
      mountPath: /foo/bar
```

One of the main questions is then: what happens if the referred step is not present? Should we have a tighter integration with the catalog? And how should we version those? Is this the responsibility of the pipeline controller itself or something above (like the operator)? Also, as it's kinda going the opposite way of the idea of
If a step/ As mentioned above, I believe the two value add of
A solution to extensibility regarding source providers (preventing the above duplication) could be to have one big task that handles pulling from different providers. Another idea as current would be to use interpolation to map to the correct |
In the current API,
However, it seems to me that there's a missing concept here: It makes me feel that we probably could first agree on what a good "memory consistency model" for dynamic data between tasks is, then decide whether what we need is just volumes or structured dynamic data between tasks. Personally I like the design of having
Notes from our initial meeting on this subject are available here for review: https://docs.google.com/document/d/1p6HJt-QMvqegykkob9qlb7aWu-FMX7ogy7UPh9ICzaE/edit |
Latest design from @sbwsg to address confusion and bugs by re-designing PipelineResources: https://docs.google.com/document/d/1euQ_gDTe_dQcVeX4oypODGIQCAkUaMYQH5h7SaeFs44/edit# Note we also have #1076 which outlines similar problems and proposes a re-design. Since there hasn't been any activity on this issue and we've stopped having the PipelineResources meetings (for now!) I'm going to resolve this issue, we can keep discussing in #1076 or re-open this if needed. Thanks all! |
Background
Tekton Pipelines/Tasks separate functionality from configuration by acting as reusable components for work. Work is actualized/instantiated in the corresponding PipelineRun/TaskRun objects (noted as Run objects onward for the sake of brevity). Run objects store the work configuration through both PipelineResources and parameters.
Goals
PipelineResources
PipelineResources
Introduction
According to the documentation:
This does not explain their purpose, but rather their current embodiment.
For the sake of reusability, it makes a lot of sense to structure things in a functional way (e.g. f(x)=y) where Pipelines/Tasks are the functions that ingest configuration. In contrast, PipelineResources seem to fall somewhere in between.
PipelineResource Role
Although each kind of PipelineResource does something different, I believe it is reasonable to consider them as syntactically sweet mounts. In general, mounts are really helpful because they allow some foreign information to be attached to a pod/container.
Without mounts, there would only be two options:
In actuality, PipelineResources append steps to tasks and handle volumes. Since Tasks are supposed to be reusable (e.g. the catalog), it seems strange to add more carpentry to manipulate their definition. Stranger still, each PipelineResource does something unique rather than being a normalized operation, which also lends itself to an extensibility problem.
PipelineResources would likely be clearer if their responsibilities were divided between appending steps and handling volumes. It seems like the orchestration of the volume is the important piece, where the step appending is somewhat less so. PipelineResources can be reorganized as Tasks that could be composed together.
Let's look at a few different PipelineResources:
Git Resource
Pull Request Resource
In just these two use cases (although there are more), PipelineResources do help simplify mounting. However, there is a bit of quirkiness between PipelineResources. Some resources like GitResources seem to only be input resources, while PullRequestResources are both, but likely never as inputs to other Tasks. In any case, they could very well operate as Tasks rather than being shipped around between Tasks. This sort of behavior is outlined here.
All of the currently supported PipelineResources can be seen here, a list which is likely to grow, especially with consideration for integrations with the notifications proposal. I think it's important to make a distinction on the overlap of responsibilities between Volumes, Tasks, and PipelineResources before we invest further.
PipelineResources Problems
As mentioned in the background section, Run objects take parameters and PipelineResources.
Since these are distinct objects, PipelineResources need to be created ahead of time (separately), which is an inconvenience.
As a byproduct, there is currently a proposal to allow for PipelineResource embedding into the PipelineRun (although it remains a distinct k8s resource) to address this as well as resource littering. The aforementioned extensibility problem is also a concern. Further, PipelineResources can be tampered with/deleted. This introduces the cookie-cutter/templating problem, where Run objects utilizing PipelineResources (at least as current) cannot determine whether they have been modified during or between runs.
Potential Solution
Add some logic into the Tekton API to declaratively handle volumes to facilitate data between steps/Tasks in a reusable way.
There are multiple ways to do this, but ultimately this gets at making Tekton much simpler in a few different ways:
This also has me wonder: If images and other fields could be overridden by Runs, would interpolation be necessary at all? With no interpolation (maybe this is a stretch), tools like Kustomize (not that I am familiar with it) or the like would be able to be the engine to edit resources rather than it being done internally.
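To make the volume idea a bit more concrete, here is a rough sketch using only constructs that already appear in this thread (stepTemplate volumeMounts on the Task, podTemplate volumes on the Run). It is not a proposed API. The Task names produce and consume, the claim name shared-workspace, and the /shared mount path are all made up for illustration, and the PersistentVolumeClaim is assumed to exist already.

```yaml
# Hypothetical names throughout; assumes a PVC named shared-workspace already exists.
apiVersion: tekton.dev/v1alpha1
kind: Task
metadata:
  name: produce
spec:
  steps:
  - name: write
    image: ubuntu
    command: ["/bin/bash"]
    args: ["-c", "echo some stuff > /shared/file"]
  stepTemplate:
    volumeMounts:
    - name: shared          # the Task only names the mount, not the volume type
      mountPath: /shared
---
apiVersion: tekton.dev/v1alpha1
kind: TaskRun
metadata:
  name: produce-run
spec:
  taskRef:
    name: produce
  podTemplate:
    volumes:
    - name: shared
      persistentVolumeClaim:
        claimName: shared-workspace
---
# A second run, for a hypothetical consume Task that mounts "shared" the same way,
# points at the same claim to read what produce-run wrote.
apiVersion: tekton.dev/v1alpha1
kind: TaskRun
metadata:
  name: consume-run
spec:
  taskRef:
    name: consume
  podTemplate:
    volumes:
    - name: shared
      persistentVolumeClaim:
        claimName: shared-workspace
```

In this sketch, the ordering of the two runs and the lifecycle of the claim are left to whoever creates them. That bookkeeping is the part I would want the Tekton API to handle declaratively.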