TEP ~ Automatically manage concurrent PipelineRuns #716
Conversation
…pipelines marked for non-concurrency
/assign @PuneetPunamiya

I think this has some similarities with https://hackmd.io/GK_1_6DWTvSiVHBL6umqDA (which I have yet to propose as a TEP 😅 )

@vdemeester I like a lot of what you have in that doc! You mention using annotations/cm, and that feels a lot more flexible. This is also a feature I'm very interested in seeing progress on 😸, so in an effort to continue the conversation, and perhaps meld the TEPs into one, should we continue discussing details in that md file you posted? Or is here more appropriate? Thanks!

Yes, we should definitely merge the two into one TEP and collaborate on it. I think we can try to merge the content of this and the hackmd in the hackmd, and once we feel it's kind-of ready, we update this pull request. How does that sound? (feel free to ping me on slack as well 😉 )

@vdemeester sounds good to me! I'll mark this as a WIP, and will begin to leave thoughts on the md

/kind tep
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull request has been approved by:

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing `/approve` in a comment.

@vdemeester updated! 🎈
thanks @williamlfish this is a great idea!
## Motivation

There are situations where pipelines can overlap when they should not. For instance, a PR is created and triggers CI, so a PipelineRun is created; a developer notices a small typo and pushes again, and a new PipelineRun is triggered. The first PipelineRun is no longer relevant, but is potentially still running.

Another example is managing available shared resources: let's say Tekton triggers an external device and there is a set amount available, or simply resource-constrained clusters.
I would imagine there are also some CD related use cases here: maybe you want to build an image and tag as "latest", or deploy some infra to a cluster, and you want to make sure the most recently created pipelinerun is also the one that is applied last.
- All PipelineRuns follow FIFO patterns.
## Proposal

Adding a key on the PipelineRun manifest seems like the most appropriate and simple solution.
I'm wondering if it could make sense to have any of this defined on the Pipeline instead. For example, in the case where you want only one CI run at a time:

```yaml
kind: Pipeline
spec:
  concurrency:
    zeroOverlap:
      key: $(params.pull-request-id)
```
Ah, this is interesting: like having the idea of overlap closer to the actual thing you don't want overlapped (for lack of better phrasing lolol 🙈)? There is something that feels nice and clean about that, but I wonder if in practice it will be hard for users to remember where one config goes vs another?
Yeah! The reason I suggest this is because the Pipeline author knows whether concurrency is appropriate for the pipeline, and this will allow users to not have to set concurrency controls on every pipelinerun for that pipeline. I'm sure we could explore adding this to the PipelineRun later if there's a clear need. What do you mean by it being hard to remember where one config goes vs another?
All I meant on the topic of remembering where configs go: if zeroOverlap is on the Pipeline but the queue config is on the PipelineRun, then there are 2 areas managing a similar topic (concurrency). That being said, I don't think it is a hard stop or anything, just a thought.
I think we could probably add both options to the Pipeline spec for now (like in the alternative syntax example I put below) and then we could always add concurrency to PipelineRun later if needed. Do you think the "queue" option is something you would want to specify on the PipelineRun?
So there is a debate of "is concurrency an authoring or a runtime concern?" 🙃

> The reason I suggest this is because the Pipeline author knows whether concurrency is appropriate for the pipeline, and this will allow users to not have to set concurrency controls on every pipelinerun for that pipeline.

I partly disagree on this. It might be true for some use cases, but I am not sure it would be the majority, at all. If I have a Go Pipeline (build, test, publish an image or something), as an author I do not need nor want to set concurrency on it, as I don't really have to care about it. But as a consumer of this Pipeline, I may want to have only 1 PipelineRun for this Pipeline running per Go project.

In addition, putting this on the Pipeline spec might reduce Pipeline "share-ability". Taking that Go Pipeline again, and assuming it's available in a catalog (or better, through the hub): if I want to achieve what I described above, I actually need to "duplicate" the Pipeline or augment it using, for example, Kustomize.

My take on this is almost the opposite of @lbernick's 👼🏼 🙃. I'd rather start by having this on the PipelineRun spec, as it's closer to a "runtime" concern than an authoring one, and then explore adding it to the Pipeline spec. Not having it on PipelineRun also reduces its "potential" a bit, because any tool (or user) using embedded specs (like pipelines-as-code) will not benefit from it at all.
Thinking about this a bit more today-- what would we do if two PipelineRuns had the same concurrency key but different concurrency strategies? I think the concurrency strategy has to be defined somewhere other than the PipelineRun-- either on the Pipeline or in a configmap or new CRD. I do see your point though about how different users of a given pipeline might want to configure concurrency differently, which makes me lean more towards cluster-level configuration (like in your original hackmd).
Aren't we thinking too much about this, though? Who are we targeting for this feature? Do we assume the majority of the usage of this feature will be tools (pipelines-as-code, team's tooling, …) or end-users? Can this be documented as an actual "known limitation"? Can we make those keys unique maybe (compute and append a hash of the referenced pipeline or embedded pipeline definition, …)?

Cluster-level configuration is a bit tricky. It works for some cases, mainly for cluster-admins to limit the number of "resources" consumed by Tekton, but teams on their own namespaces will probably need to handle those per namespace, "project", or pipeline.
To detail this a bit more:
- On the "making things unique" point: the controller could take the user-provided concurrency key, but when "computing" it, also "hash" the pipeline spec the PipelineRun refers to (be it through a pipelineRef or an embedded spec). That way, it wouldn't "conflict" if 2 PipelineRuns refer to different pipelines (or an updated pipeline) but use the same concurrency key. This comes with its own set of challenges/problems, but we could keep that approach in mind while experimenting.
- The "main" assumption I am making here, though, is that in most cases users don't create the PipelineRun themselves (aka authoring it); tools do it for them (clients like tkn, tools like pipelines-as-code, home-grown tooling, ibm stuff, …). And in the small number of cases where users author them, it's probably fine to have this known limitation/issue.

In a gist, I really do think that while experimenting, while it's in alpha and all, we should not overthink this, and see with "usage" whether it really is a problem or not.
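The "hashing" idea above could be sketched roughly like this. This is a minimal illustration, not actual Tekton controller code; `compute_concurrency_key` and the spec shapes are invented for the example:

```python
import hashlib
import json

def compute_concurrency_key(user_key: str, pipeline_spec: dict) -> str:
    """Combine the user-provided concurrency key with a hash of the
    pipeline spec, so two PipelineRuns only conflict when they share
    BOTH the key and the same pipeline definition."""
    # Serialize deterministically so equivalent specs hash identically.
    canonical = json.dumps(pipeline_spec, sort_keys=True)
    spec_hash = hashlib.sha256(canonical.encode()).hexdigest()[:12]
    return f"{user_key}-{spec_hash}"

# Two runs of the same pipeline with the same key collide...
spec_v1 = {"tasks": [{"name": "build"}, {"name": "test"}]}
assert compute_concurrency_key("pr-42", spec_v1) == compute_concurrency_key("pr-42", spec_v1)

# ...but an updated pipeline spec yields a different effective key,
# so it no longer conflicts with runs of the old definition.
spec_v2 = {"tasks": [{"name": "build"}, {"name": "test"}, {"name": "publish"}]}
assert compute_concurrency_key("pr-42", spec_v2) != compute_concurrency_key("pr-42", spec_v1)
```

One challenge this sketch glosses over: semantically identical specs that serialize differently (defaulted fields, resolved references) would hash differently, which is part of the "set of challenges/problems" mentioned above.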
Short example:

```yaml
concurrency:
```
potential alternative syntax:

```yaml
concurrency:
  key: $(params.foo)
  strategy: cancel # options here are "queue" and "cancel"
  max: 1 # we can choose a default value for max depending on the concurrency strategy
```
This alternative syntax does not give the ability to use both a queue and zeroOverlap at the same time, right? Or am I misunderstanding part of it?
It doesn't. Imagine there are 5 PipelineRuns running, with max = 5. My understanding would be that if a new PipelineRun was created and the strategy was "queue", it would wait until one finished before starting. If the strategy was "cancel", it would cancel the earliest created PipelineRun and then start. ZeroOverlap is "cancel" with max = 1.
I'm not understanding why you might want to use both at the same time; they seem incompatible with each other. Can you describe what you had in mind?
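The semantics described in the comment above (queue waits for a slot; cancel evicts the earliest-created run; zeroOverlap is cancel with max = 1) could be sketched as follows. This is a hypothetical model of the admission decision, not Tekton code; the class and method names are invented:

```python
from dataclasses import dataclass, field

@dataclass
class ConcurrencyGroup:
    """Tracks PipelineRuns that share a concurrency key."""
    strategy: str                                 # "queue" or "cancel"
    max_running: int
    running: list = field(default_factory=list)   # oldest first
    pending: list = field(default_factory=list)

    def admit(self, run: str) -> list:
        """Admit a new run; returns the list of runs cancelled (if any)."""
        cancelled = []
        if len(self.running) < self.max_running:
            self.running.append(run)
        elif self.strategy == "queue":
            # Wait for a slot: the run starts once another finishes.
            self.pending.append(run)
        else:  # "cancel": evict the earliest-created run to make room
            cancelled.append(self.running.pop(0))
            self.running.append(run)
        return cancelled

# zeroOverlap as described: "cancel" with max = 1
group = ConcurrencyGroup(strategy="cancel", max_running=1)
group.admit("run-1")
assert group.admit("run-2") == ["run-1"]
```

Modeled this way, the two strategies differ only in what happens when the group is full, which is why combining them for one key doesn't obviously compose.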
2 proposed cancel strategies.
just a formatting note, I think it would be helpful to separate out use cases from proposed strategies-- that makes it a bit easier to separate design from motivation and make sure we have a strategy for addressing each use case; up to you though!
/assign @lbernick
@williamlfish and I are now working together in this doc. @vdemeester sorry to create another doc to merge! This doc is going in a bit of a different direction from the hackmd, so I think it is useful to have both. Would love your input here.
@williamlfish: PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

I think this is a pretty meaty feature and would benefit from us agreeing on the goals of this TEP before moving to the proposal. @williamlfish is on board with me opening an initial PR for the problem statement (chatted offline), so I created #791

API WG - @lbernick to check with @williamlfish if we can close this one
@pritidesai @vdemeester @lbernick I'm okay with this being closed. Would love to be tagged in future efforts on this tho 😄
thank you @williamlfish 👍 thank you @lbernick for adding @williamlfish as a TEP author. Please also add @williamlfish as a co-author to the commit and PR description:
@williamlfish We need to reconcile the content of this with tep-120. So it's gonna be more than just being tagged in future efforts, we'll call you 😬
Thanks for the reminder @pritidesai! I don't want to amend the commit history now that it's already merged, but I'll update the PR description and will add that next time on the commit message.

Issues go stale after 90d of inactivity. /lifecycle stale Send feedback to tektoncd/plumbing.
TEP outlining a feature that would give developers the option, for PipelineRuns that potentially overlap, to run only the most recent one. Please feel free to let me know if any other information outlining the idea is needed, or if I can be helpful in clarifying.

I would also love any input on what to call this. Describing the situation more succinctly would be ideal, and I am open to any recommendations 😅