Reconcilers should rely on knative/pkg to Update TaskRuns and PipelineRuns #5146
Comments
Possibly related: I think at least part of the reason we update directly is so that, when we publish the cloud event, we can be sure the object matches the published state. If updating fails, we shouldn't publish a different cloud event. If cloud event publishing were a separate offline process that happened in response to the update, we'd have the same guarantee that the cloud event state matches the updated state, we'd have the benefits of a separate configurable/monitorable/scalable deployment just for publishing events, and we could have reconciliation updates managed by knative/pkg. Seems like a win-win-win, we just need to write some code.
I'm pretty sure the call to
About cloud events, moving publishing to a separate controller has pros and cons, but I think the pros definitely outweigh the cons. The main cons are:
The external controller exists in experimental, and with @waveywaves we drafted a roadmap to make it replace the current built-in support, but we then both ran out of bandwidth. I will try and pick it back up this summer.
Issues go stale after 90d of inactivity. /lifecycle stale Send feedback to tektoncd/plumbing.
/lifecycle frozen I think this is still something we should do. I think @vdemeester was looking into it.
Yes, I was looking into it recently, and it might "break" some assumptions users make (on some labels being present on
/assign
/unassign
Background
The TaskRun and PipelineRun controllers are invoked by Knative controller code that calls ReconcileKind(ctx, tr) with the result of a K8s watch on resources, then determines if those resources need to be updated, and if so, calls Update on them.
This is good, because Knative code handles the watching and updating for us, and only has to assume that any changes Tekton wants to make to resources are expressed in terms of updates to the tr object passed to ReconcileKind.
knative/pkg expects this:
However, our ReconcileKinds for TaskRuns and PipelineRuns (notably, not Runs) also make their own calls to Update, as part of updateLabelsAndAnnotations:
pipeline/pkg/reconciler/pipelinerun/pipelinerun.go, lines 1171 to 1186 in 8a7b0cf
pipeline/pkg/reconciler/taskrun/taskrun.go, lines 548 to 573 in 8a7b0cf
Effectively, what we're doing is:
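As a rough illustration only, here is a toy model of that flow in Go. It is not the actual Tekton or knative/pkg code: `TaskRun`, `fakeAPI`, `snapshot`, `frameworkReconcile` and the label key are all hypothetical stand-ins, and the "framework" simply follows the description above (reconcile, then write if anything changed). It shows how a write made inside the reconciler, followed by a later in-memory change, results in two writes for one reconciliation.

```go
package main

import "fmt"

// TaskRun is a hypothetical stand-in for the real resource type.
type TaskRun struct {
	Labels map[string]string
	Status string
}

// fakeAPI counts writes so the extra Update becomes visible.
type fakeAPI struct{ updates int }

func (a *fakeAPI) Update(tr *TaskRun) { a.updates++ }

// snapshot renders the fields we care about so two states can be compared.
func snapshot(tr *TaskRun) string { return fmt.Sprintf("%v|%s", tr.Labels, tr.Status) }

// reconcileKind models the current behaviour: it persists the object itself
// (the role played by updateLabelsAndAnnotations), then keeps mutating it.
func reconcileKind(api *fakeAPI, tr *TaskRun) {
	tr.Labels["example.dev/owner"] = "pipeline"
	// Our own write, made from inside the reconciler.
	api.Update(tr)
	// A diff that sneaks in after our write.
	tr.Status = "Running"
}

// frameworkReconcile models the caller: snapshot, reconcile, write on diff.
func frameworkReconcile(api *fakeAPI, tr *TaskRun) {
	before := snapshot(tr)
	reconcileKind(api, tr)
	if snapshot(tr) != before {
		api.Update(tr)
	}
}

func main() {
	api := &fakeAPI{}
	frameworkReconcile(api, &TaskRun{Labels: map[string]string{}})
	fmt.Println("updates:", api.updates) // prints "updates: 2"
}
```

In this model the second write exists only because the reconciler wrote first and then changed the object again; the framework cannot know that part of the diff was already persisted.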
This leads to an unnecessary and unexpected change in the control of the update lifecycle for both PipelineRuns and TaskRuns. A method unsuspiciously called updateLabelsAndAnnotations is actually responsible for persisting all the changes we make during a reconciliation.
This isn't a problem per se, it's just unexpected, and not how Knative reconciler code expects to be used, which may cause confusion and bugs later. It may also lead to duplicate calls to Update, if diffs sneak in after our call to updateLabelsAndAnnotations, which can lead to cluster-destabilizing write load on the K8s API server for heavy users of Tekton.
This may cause extra problems if Knative controller logic starts making load-bearing assumptions that it's solely responsible for the update lifecycle of objects it passes to ReconcileKind, which could cause headaches for us later. knative/pkg thankfully has downstream tests to warn about behavior changes that might break Tekton, but this still means they may be prevented from making otherwise helpful changes and optimizations because of how Tekton is (mis-)using its packages.
Expected Behavior
I'd expect ReconcileKind not to make calls to Update itself, and to rely on Knative's own Updates.
Additional Info
While we're at it, updateLabelsAndAnnotations is called by another unsuspicious-sounding method, finishReconcileUpdateEmitEvents, which should probably also be refactored so it's less load-bearing in the overall reconciliation cycle:
pipeline/pkg/reconciler/taskrun/taskrun.go, line 281 in 8a7b0cf
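One possible shape for such a refactor, sketched in Go under heavy assumptions: the types and helpers below (`TaskRun`, `fakeAPI`, `setLabels`, `snapshot`, `frameworkReconcile`) are hypothetical and do not reflect the real Tekton or knative/pkg APIs. The idea is only that the label/annotation step mutates the in-memory object, and the caller remains the single place that persists changes.

```go
package main

import "fmt"

// TaskRun is a hypothetical stand-in for the real resource type.
type TaskRun struct {
	Labels map[string]string
	Status string
}

// fakeAPI counts writes to the (imaginary) API server.
type fakeAPI struct{ updates int }

func (a *fakeAPI) Update(tr *TaskRun) { a.updates++ }

// setLabels is a hypothetical, write-free counterpart to
// updateLabelsAndAnnotations: it only mutates the in-memory object.
func setLabels(tr *TaskRun) {
	tr.Labels["example.dev/owner"] = "pipeline"
}

// reconcileKind expresses every change as a mutation of tr and never writes.
func reconcileKind(tr *TaskRun) {
	setLabels(tr)
	tr.Status = "Running"
}

// snapshot renders the fields we care about so two states can be compared.
func snapshot(tr *TaskRun) string { return fmt.Sprintf("%v|%s", tr.Labels, tr.Status) }

// frameworkReconcile is the only place that persists, and only on a diff.
func frameworkReconcile(api *fakeAPI, tr *TaskRun) {
	before := snapshot(tr)
	reconcileKind(tr)
	if snapshot(tr) != before {
		api.Update(tr)
	}
}

func main() {
	api := &fakeAPI{}
	tr := &TaskRun{Labels: map[string]string{}}
	frameworkReconcile(api, tr) // first pass changes tr: one write
	frameworkReconcile(api, tr) // second pass is a no-op: no write
	fmt.Println("updates:", api.updates) // prints "updates: 1"
}
```

Whether labels and annotations would actually be persisted by the framework's own update path is exactly the kind of assumption that would need checking against knative/pkg before adopting this shape.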
@vdemeester @afrittoli WDYT?