Signed-off-by: Sunny <darkowlzz@protonmail.com>
Showing 5 changed files with 1,831 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,13 @@ | ||
# notification.toolkit.fluxcd.io/v1beta3 | ||
|
||
This is the v1beta3 API specification for defining event handling. | ||
|
||
## Specification | ||
|
||
* [Alerts](alerts.md) | ||
* [Events](events.md) | ||
* [Providers](providers.md) | ||
|
||
## Go Client | ||
|
||
* [github.com/fluxcd/pkg/runtime/events](https://pkg.go.dev/github.com/fluxcd/pkg/runtime/events) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,250 @@ | ||
# Alerts | ||
|
||
<!-- menuweight:10 --> | ||
|
||
The `Alert` API defines how events are filtered by severity and involved object, and what provider to use for dispatching. | ||
|
||
## Example | ||
|
||
The following is an example of how to send alerts to Slack when Flux fails to reconcile the `flux-system` namespace. | ||
|
||
```yaml | ||
--- | ||
apiVersion: notification.toolkit.fluxcd.io/v1beta3 | ||
kind: Provider | ||
metadata: | ||
  name: slack-bot | ||
  namespace: flux-system | ||
spec: | ||
  type: slack | ||
  channel: general | ||
  address: https://slack.com/api/chat.postMessage | ||
  secretRef: | ||
    name: slack-bot-token | ||
--- | ||
apiVersion: notification.toolkit.fluxcd.io/v1beta3 | ||
kind: Alert | ||
metadata: | ||
  name: slack | ||
  namespace: flux-system | ||
spec: | ||
  summary: "Cluster addons impacted in us-east-2" | ||
  providerRef: | ||
    name: slack-bot | ||
  eventSeverity: error | ||
  eventSources: | ||
    - kind: GitRepository | ||
      name: '*' | ||
    - kind: Kustomization | ||
      name: '*' | ||
``` | ||
In the above example: | ||
- A Provider named `slack-bot` is created, indicated by the | ||
`Provider.metadata.name` field. | ||
- An Alert named `slack` is created, indicated by the | ||
`Alert.metadata.name` field. | ||
- The Alert references the `slack-bot` provider, indicated by the | ||
`Alert.spec.providerRef` field. | ||
- The notification-controller starts listening for events sent for | ||
all GitRepositories and Kustomizations in the `flux-system` namespace. | ||
- When an event with severity `error` is received, the controller posts | ||
a message to the Slack channel set in the Provider's `.spec.channel`, | ||
containing the `summary` text and the reconciliation error. | ||
|
||
You can run this example by saving the manifests into `slack-alerts.yaml`. | ||
|
||
1. First create a secret with the Slack bot token: | ||
|
||
```sh | ||
kubectl -n flux-system create secret generic slack-bot-token --from-literal=token=xoxb-YOUR-TOKEN | ||
``` | ||
|
||
2. Apply the resources on the cluster: | ||
|
||
```sh | ||
kubectl -n flux-system apply --server-side -f slack-alerts.yaml | ||
``` | ||
|
||
## Writing an Alert spec | ||
|
||
As with all other Kubernetes config, an Alert needs `apiVersion`, | ||
`kind`, and `metadata` fields. The name of an Alert object must be a | ||
valid [DNS subdomain name](https://kubernetes.io/docs/concepts/overview/working-with-objects/names#dns-subdomain-names). | ||
|
||
An Alert also needs a | ||
[`.spec` section](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md#spec-and-status). | ||
|
||
### Summary | ||
|
||
`.spec.summary` is an optional field to specify a short description of the | ||
impact and affected cluster. | ||
|
||
The summary must not exceed 255 characters. | ||
|
||
### Provider reference | ||
|
||
`.spec.providerRef.name` is a required field to specify a name reference to a | ||
[Provider](providers.md) in the same namespace as the Alert. | ||
|
||
### Event sources | ||
|
||
`.spec.eventSources` is a required field to specify a list of references to | ||
Flux objects for which events are forwarded to the alert provider API. | ||
|
||
To select events issued by Flux objects, each entry in the `.spec.eventSources` list | ||
must contain the following fields: | ||
|
||
- `kind` is the Flux Custom Resource Kind such as GitRepository, HelmRelease, Kustomization, etc. | ||
- `name` is the Flux Custom Resource `.metadata.name`, or it can be set to the `*` wildcard. | ||
- `namespace` is the Flux Custom Resource `.metadata.namespace`. | ||
When not specified, the Alert `.metadata.namespace` is used instead. | ||
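
The selection rules above can be modelled in a few lines. The following is a hedged Python sketch, not the controller's Go implementation: the function name `source_matches` and the dictionary shapes are assumptions made for illustration, and label matching is left out.

```python
def source_matches(source: dict, involved_object: dict, alert_namespace: str) -> bool:
    """Return True if one eventSources entry selects the event's involved object."""
    # An omitted namespace falls back to the Alert's own namespace.
    namespace = source.get("namespace") or alert_namespace
    if source["kind"] != involved_object["kind"]:
        return False
    if namespace != involved_object["namespace"]:
        return False
    # The '*' wildcard selects every object of this kind in the namespace.
    return source["name"] == "*" or source["name"] == involved_object["name"]


# Hypothetical event source object living in the `apps` namespace.
webapp = {"kind": "GitRepository", "name": "webapp", "namespace": "apps"}

# Explicit namespace plus wildcard name selects the object.
print(source_matches({"kind": "GitRepository", "name": "*", "namespace": "apps"},
                     webapp, "flux-system"))
# No namespace given: the Alert namespace (flux-system) is used, so no match.
print(source_matches({"kind": "GitRepository", "name": "*"}, webapp, "flux-system"))
```

An Alert would forward an event when at least one entry in its `eventSources` list matches in this way.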
|
||
#### Select objects by name | ||
|
||
To select events issued by a single Flux object, set the `kind`, `name` and `namespace`: | ||
|
||
```yaml | ||
eventSources: | ||
  - kind: GitRepository | ||
    name: webapp | ||
    namespace: apps | ||
``` | ||
|
||
#### Select all objects in a namespace | ||
|
||
The `*` wildcard can be used to select events issued by all Flux objects of a particular `kind` in a `namespace`: | ||
|
||
```yaml | ||
eventSources: | ||
  - kind: HelmRelease | ||
    name: '*' | ||
    namespace: apps | ||
``` | ||
|
||
#### Select objects by label | ||
|
||
To select events issued by all Flux objects of a particular `kind` with specific `labels`: | ||
|
||
```yaml | ||
eventSources: | ||
  - kind: HelmRelease | ||
    name: '*' | ||
    namespace: apps | ||
    matchLabels: | ||
      team: app-dev | ||
``` | ||
|
||
#### Disable cross-namespace selectors | ||
|
||
**Note:** On multi-tenant clusters, platform admins can disable cross-namespace references by | ||
starting the controller with the `--no-cross-namespace-refs=true` flag. | ||
When this flag is set, alerts can only refer to event sources in the same namespace as the alert object, | ||
preventing tenants from subscribing to another tenant's events. | ||
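
As a rough illustration of the restriction described in this note, the check can be sketched in Python as below. The function name `cross_namespace_allowed` and its parameters are hypothetical, and whether the controller rejects the whole Alert or only ignores the offending entry is not stated here, so the sketch only answers whether a single entry is permitted.

```python
def cross_namespace_allowed(source: dict, alert_namespace: str,
                            no_cross_namespace_refs: bool) -> bool:
    """Return True if the event source may be referenced by the Alert."""
    # An omitted namespace means "same namespace as the Alert".
    namespace = source.get("namespace") or alert_namespace
    if no_cross_namespace_refs and namespace != alert_namespace:
        return False
    return True


tenant_source = {"kind": "HelmRelease", "name": "*", "namespace": "tenant-b"}
# With the flag enabled, an Alert in tenant-a cannot select tenant-b objects.
print(cross_namespace_allowed(tenant_source, "tenant-a", no_cross_namespace_refs=True))
```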
|
||
### Event metadata | ||
|
||
`.spec.eventMetadata` is an optional field for adding metadata to events dispatched by | ||
the controller. This can be used to add context to the event. If a key is already | ||
present on the original event as generated by the emitter, it is not overridden: | ||
the original value is preserved and the controller prints an info log. | ||
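
The no-override behaviour can be pictured with a small Python sketch. This is an illustrative model under stated assumptions, not the controller's code: the function name `merge_event_metadata` and the log wording are made up for the example.

```python
import logging


def merge_event_metadata(event_metadata: dict, alert_metadata: dict) -> dict:
    """Add Alert-level metadata to an event without overriding emitter values."""
    merged = dict(event_metadata)
    for key, value in alert_metadata.items():
        if key in merged:
            # The value set by the emitter wins; only an info log is produced.
            logging.info("metadata key %s already set on event, keeping original", key)
            continue
        merged[key] = value
    return merged


original = {"app.kubernetes.io/env": "staging"}
extra = {"app.kubernetes.io/env": "production", "app.kubernetes.io/region": "us-east-1"}
print(merge_event_metadata(original, extra))
```

Here the `env` key keeps the value from the original event, while the `region` key is added.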
|
||
#### Example | ||
|
||
Add metadata fields to successful `HelmRelease` events: | ||
|
||
```yaml | ||
--- | ||
apiVersion: notification.toolkit.fluxcd.io/v1beta3 | ||
kind: Alert | ||
metadata: | ||
  name: <name> | ||
spec: | ||
  eventSources: | ||
    - kind: HelmRelease | ||
      name: '*' | ||
  inclusionList: | ||
    - ".*succeeded.*" | ||
  eventMetadata: | ||
    app.kubernetes.io/env: "production" | ||
    app.kubernetes.io/cluster: "my-cluster" | ||
    app.kubernetes.io/region: "us-east-1" | ||
``` | ||
|
||
### Event severity | ||
|
||
`.spec.eventSeverity` is an optional field to filter events based on severity. When not specified, or | ||
when the value is set to `info`, all events are forwarded to the alert provider API, including errors. | ||
To receive alerts only on errors, set the field value to `error`. | ||
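
The severity rule can be summarised with a short Python sketch. It is a simplified model of the behaviour described above; the function name `passes_severity` is an assumption for illustration.

```python
from typing import Optional


def passes_severity(alert_severity: Optional[str], event_severity: str) -> bool:
    """Return True if an event passes the Alert's severity filter."""
    # Unset or `info` forwards everything, including errors.
    if not alert_severity or alert_severity == "info":
        return True
    # Otherwise only events of exactly that severity pass (for example `error`).
    return event_severity == alert_severity


print(passes_severity(None, "error"))     # field not specified
print(passes_severity("error", "info"))   # info event against an error-only Alert
```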
|
||
### Event exclusion | ||
|
||
`.spec.exclusionList` is an optional field to specify a list of regular expressions to filter | ||
events based on message content. The event will be excluded if the message matches at least | ||
one of the expressions in the list. | ||
|
||
#### Example | ||
|
||
Skip alerting if the message matches a [Go regex](https://golang.org/pkg/regexp/syntax) | ||
from the exclusion list: | ||
|
||
```yaml | ||
--- | ||
apiVersion: notification.toolkit.fluxcd.io/v1beta3 | ||
kind: Alert | ||
metadata: | ||
  name: <name> | ||
spec: | ||
  eventSources: | ||
    - kind: GitRepository | ||
      name: '*' | ||
  exclusionList: | ||
    - "waiting.*socket" | ||
``` | ||
|
||
The above definition will not send alerts for transient Git clone errors like: | ||
|
||
```text | ||
unable to clone 'ssh://git@ssh.dev.azure.com/v3/...', error: SSH could not read data: Error waiting on socket | ||
``` | ||
|
||
### Event inclusion | ||
|
||
`.spec.inclusionList` is an optional field to specify a list of regular expressions to filter | ||
events based on message content. The event will be sent if the message matches at least one | ||
of the expressions in the list, and discarded otherwise. If the message matches one of the | ||
expressions in the inclusion list but also matches one of the expressions in the exclusion | ||
list, then the event is still discarded (exclusion is stronger than inclusion). | ||
|
||
#### Example | ||
|
||
Alert if the message matches a [Go regex](https://golang.org/pkg/regexp/syntax) | ||
from the inclusion list: | ||
|
||
```yaml | ||
--- | ||
apiVersion: notification.toolkit.fluxcd.io/v1beta3 | ||
kind: Alert | ||
metadata: | ||
  name: <name> | ||
spec: | ||
  eventSources: | ||
    - kind: HelmRelease | ||
      name: '*' | ||
  inclusionList: | ||
    - ".*succeeded.*" | ||
  exclusionList: | ||
    - ".*uninstall.*" | ||
    - ".*test.*" | ||
``` | ||
|
||
The above definition will send alerts for successful Helm installs, upgrades and rollbacks, | ||
but not uninstalls and tests. | ||
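
The interplay between the inclusion and exclusion lists can be sketched as follows. This is a hedged Python model, not the controller's implementation: the controller uses Go regular expressions, whereas this sketch uses Python's `re` module, which behaves the same for the simple patterns shown here. The function name `should_dispatch` is hypothetical, and treating an empty inclusion list as "include everything" is an assumption based on the field being optional.

```python
import re
from typing import Sequence


def should_dispatch(message: str,
                    inclusion_list: Sequence[str] = (),
                    exclusion_list: Sequence[str] = ()) -> bool:
    """Apply exclusion first, then inclusion, to an event message."""
    # Exclusion is stronger than inclusion: any match discards the event.
    if any(re.search(pattern, message) for pattern in exclusion_list):
        return False
    # With an inclusion list, at least one expression must match.
    if inclusion_list:
        return any(re.search(pattern, message) for pattern in inclusion_list)
    return True


include = [".*succeeded.*"]
exclude = [".*uninstall.*", ".*test.*"]
# The messages below are made-up examples, not real controller output.
for msg in ("Helm upgrade succeeded", "Helm uninstall succeeded", "Helm install failed"):
    print(msg, "->", should_dispatch(msg, include, exclude))
```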
|
||
### Suspend | ||
|
||
`.spec.suspend` is an optional field to suspend the alerting. | ||
When set to `true`, the controller will stop processing events. | ||
When the field is set to `false` or removed, it will resume. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,64 @@ | ||
# Events | ||
|
||
<!-- menuweight:20 --> | ||
|
||
The `Event` API defines the structure of the events issued by Flux controllers. | ||
|
||
Flux controllers use the [fluxcd/pkg/runtime/events](https://github.com/fluxcd/pkg/tree/main/runtime/events) | ||
package to push events to the notification-controller API. | ||
|
||
## Example | ||
|
||
The following is an example of an event sent by kustomize-controller to report a reconciliation error. | ||
|
||
```json | ||
{ | ||
  "involvedObject": { | ||
    "apiVersion": "kustomize.toolkit.fluxcd.io/v1", | ||
    "kind": "Kustomization", | ||
    "name": "webapp", | ||
    "namespace": "apps", | ||
    "uid": "7d0cdc51-ddcf-4743-b223-83ca5c699632" | ||
  }, | ||
  "metadata": { | ||
    "kustomize.toolkit.fluxcd.io/revision": "main/731f7eaddfb6af01cb2173e18f0f75b0ba780ef1" | ||
  }, | ||
  "severity": "error", | ||
  "reason": "ValidationFailed", | ||
  "message": "service/apps/webapp validation error: spec.type: Unsupported value: Ingress", | ||
  "reportingController": "kustomize-controller", | ||
  "timestamp": "2022-10-28T07:26:19Z" | ||
} | ||
``` | ||
|
||
In the above example: | ||
|
||
- An event is issued by kustomize-controller for a specific object, indicated in the | ||
`involvedObject` field. | ||
- The notification-controller receives the event and finds the [alerts](alerts.md) | ||
that match the `involvedObject` and `severity` values. | ||
- For all matching alerts, the controller posts the `message` and the source revision | ||
extracted from `metadata` to the alert provider API. | ||
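
To make the last step concrete, the following Python sketch parses an event payload like the one above and pulls out the fields an alert would use. It is only an illustration: the helper name `revision_from_event` is hypothetical, and locating the revision by a metadata key that ends in `/revision` is an assumption drawn from the example payload, not a documented rule.

```python
import json
from typing import Optional


def revision_from_event(event: dict) -> Optional[str]:
    """Return the source revision stored in the event metadata, if any."""
    for key, value in event.get("metadata", {}).items():
        # Assumption: the revision lives under a "<group>/revision" key.
        if key == "revision" or key.endswith("/revision"):
            return value
    return None


raw_event = """
{
  "involvedObject": {"kind": "Kustomization", "name": "webapp", "namespace": "apps"},
  "metadata": {
    "kustomize.toolkit.fluxcd.io/revision": "main/731f7eaddfb6af01cb2173e18f0f75b0ba780ef1"
  },
  "severity": "error",
  "message": "service/apps/webapp validation error: spec.type: Unsupported value: Ingress"
}
"""

event = json.loads(raw_event)
print(event["involvedObject"]["kind"], event["severity"])
print(revision_from_event(event))
```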
|
||
## Event structure | ||
|
||
The Go type that defines the event structure can be found in the | ||
[fluxcd/pkg/apis/event/v1beta1](https://github.com/fluxcd/pkg/blob/main/apis/event/v1beta1/event.go) | ||
package. | ||
|
||
## Rate limiting | ||
|
||
Events received by notification-controller are subject to rate limiting to reduce the | ||
number of duplicate alerts sent to external systems like Slack, Sentry, etc. | ||
|
||
Events are rate limited based on `involvedObject.name`, `involvedObject.namespace`, | ||
`involvedObject.kind`, `message`, and `metadata`. | ||
The interval of the rate limit is set by default to `5m` but can be configured | ||
with the `--rate-limit-interval` controller flag. | ||
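
One way to picture this keying scheme is the Python sketch below. It is a simplified model under stated assumptions, not the controller's rate limiter: the class name `EventRateLimiter`, the in-memory dictionary, and measuring the interval from the last accepted event are all choices made for the example.

```python
import json


class EventRateLimiter:
    """Drop events that repeat the same key within the configured interval."""

    def __init__(self, interval_seconds: float = 300.0):  # default of 5m
        self.interval = interval_seconds
        self._last_accepted = {}

    @staticmethod
    def _key(event: dict) -> tuple:
        obj = event["involvedObject"]
        # Metadata is serialised with sorted keys so equal maps give equal keys.
        metadata = json.dumps(event.get("metadata", {}), sort_keys=True)
        return (obj["name"], obj["namespace"], obj["kind"], event["message"], metadata)

    def allow(self, event: dict, now: float) -> bool:
        key = self._key(event)
        last = self._last_accepted.get(key)
        if last is not None and now - last < self.interval:
            return False  # duplicate inside the interval: rate limited
        self._last_accepted[key] = now
        return True


limiter = EventRateLimiter()
sample = {
    "involvedObject": {"kind": "Kustomization", "name": "webapp", "namespace": "apps"},
    "message": "validation error",
    "metadata": {},
}
print(limiter.allow(sample, now=0.0))    # first occurrence
print(limiter.allow(sample, now=60.0))   # same event one minute later
```

An event that differs in any of the key fields, for example a different `message`, is tracked separately.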
|
||
The event server exposes HTTP request metrics to track the number of rate-limited events. | ||
The following PromQL query returns the rate at which requests are rate limited: | ||
|
||
``` | ||
rate(gotk_event_http_request_duration_seconds_count{code="429"}[30s]) | ||
``` |