Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

📖 Add a design for a priority queue #3013

Merged
merged 1 commit into from
Dec 6, 2024
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
110 changes: 110 additions & 0 deletions designs/priorityqueue.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,110 @@
Priority Queue
===================

This document describes the motivation behind implementing a priority queue
in controller-runtime and its design details.

## Motivation

1. Controllers reconcile all objects during startup to account for changes in
the reconciliation logic. Some controllers also periodically re-reconcile
everything to account for out of band changes they do not get notified for,
this is for example common for controllers managing cloud resources. In both
these cases, the reconciliation of new or changed objects gets delayed,
resulting in poor user experience. [Example][0]
2. There may be application-specific reason why some events are more important
than others, [Example][1]

## Proposed changes

Implement a priority queue in controller-runtime that exposes the following
interface:

```go
type PriorityQueue[T comparable] interface {
// AddWithOpts adds one or more items to the workqueue. Items
// in the workqueue are de-duplicated, so there will only ever
// be one entry for a given key.
// Adding an item that is already there may update its wait
// period to the lowest of existing and new wait period or
// its priority to the highest of existing and new priority.
AddWithOpts(o AddOpts, items ...T)
sbueringer marked this conversation as resolved.
Show resolved Hide resolved

// GetWithPriority returns an item and its priority. It allows
// a controller to re-use the priority if it enqueues an item
// again.
GetWithPriority() (item T, priority int, shutdown bool)
vincepri marked this conversation as resolved.
Show resolved Hide resolved

// workqueue.TypedRateLimitingInterface is kept for backwards
// compatibility.
workqueue.TypedRateLimitingInterface[T]
}

type AddOpts struct {
// After is a duration after which the object will be available for
// reconciliation. If the object is already in the workqueue, the
// lowest of existing and new After period will be used.
sbueringer marked this conversation as resolved.
Show resolved Hide resolved
After time.Duration

// Ratelimited specifies if the ratelimiter should be used to
// determine a wait period. If the object is already in the
// workqueue, the lowest of existing and new wait period will be
// used.
RateLimited bool

// Priority specifies the priority of the object. Objects with higher
// priority are returned before objects with lower priority. If the
// object is already in the workqueue, the priority will be updated
// to the highest of existing and new priority.
//
// The default value is 0.
Priority int
}
vincepri marked this conversation as resolved.
Show resolved Hide resolved
```

In order to fix the issue described in point one of the motivation section,
we have to be able to differentiate events stemming from the initial list
during startup and from resyncs from other events. For events from the initial
list, the informer emits a `Create` event whereas for `Resync` it emits an `Update`
event. The suggestion is to use a heuristic for `Create` events, if the object
in there is older than one minute, it is assumed to be from the initial `List`.
For the `Resync`, we simply check if the `ResourceVersion` is unchanged.
In both these cases, we will lower the priority to `LowPriority`/`-100`.
This gives some room for use-cases where people want to use a priority that
is lower than default (`0`) but higher than what we use in the wrapper.

```go
// WithLowPriorityWhenUnchanged wraps an existing handler and will
// reduce the priority of events stemming from the initial listwatch
// or cache resyncs to LowPriority.
func WithLowPriorityWhenUnchanged[object client.Object, request comparable](u TypedEventHandler[object, request]) TypedEventHandler[object, request]{
}
```

```go
// LowPriority is the priority set by WithLowPriorityWhenUnchanged
const LowPriority = -100
```

The issue described in point two of the motivation section ("application-specific
reasons to prioritize some events") will always require implementation of a custom
handler or eventsource in order to inject the appropriate priority.

## Implementation stages

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will the default controller be modified to make use of this new queue (if used), or will it rely on using a custom controller implementation?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it will be updated to make use of it - the only thing it needs is to re-use the priority though. Once we make it the default, I would also like to add a Priority parameter to the reconcile.Result - But really only once its enabled by default.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe also to the Request?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe also to the Request?

Ah, interesting thought. Is the idea because you want to be able to set the priority in a handler?

The way this currently works is that the workqueue doesn't have any understanding of the request object (and we should keep it that way IMHO). We could probably provide a thin wrapper for it that will use the priority from the request if the AddWithOptions call doesn't have one yet and inject that in the builder if its typed to reconcile.Request?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking that the priority could also be useful information for the Reconcile func. But maybe it's a bad idea not sure 😀

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah. Because then it can use that as an input when returning a priority?

Copy link
Member

@sbueringer sbueringer Nov 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's one use case. Another one would be to reconcile Requests of different priorities differently :)

Maybe it makes sense in some cases to pass down the priority (if some other components are involved in reconciliation).

Maybe a controller would act differently if it can infer based on the priority if this is just a periodic resync (or something similar) vs an actual change. Or in general prioritize Requests with higher priority higher (if it has to schedule some "tasks" in other systems)

But I'm really not sure if this is a good idea or opening pandora's box :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's follow-up on this with more use cases later on, if needed


In order to safely roll this out to all controller-runtime users, it is suggested to
divide the implementation into two stages: Initially, we will add the priority queue
but mark it as experimental and all usage of it requires explicit opt-in by setting
a boolean on the manager or configuring `NewQueue` in a controllers opts. There will
be no breaking changes required for this, but sources or handlers that want to make
use of the new queue will have to use type assertions.

After we've gained some confidence that the implementation is useful and correct, we
will make it the default. Doing so entails breaking the `source.Source` and the
`handler.Handler` interfaces as well as the `controller.Options` struct to refer to
the new workqueue interface. We will wait at least one minor release after introducing
the `PriorityQueue` before doing this.


* [0]: https://youtu.be/AYNaaXlV8LQ?si=i2Pfo7Ske6rTrPLS
* [1]: https://github.com/cilium/cilium/blob/a17d6945b29c177209af3d985bd82cce49eed4a1/operator/pkg/ciliumendpointslice/controller.go#L73