[Fleet] Add support for routing rules in integrations #514
`routing_rules.yml` file in integrations
By reviewing this definition, I have some questions about it:
I think it makes sense to add this during Fleet installation to avoid repeating in every package, it seems mandatory if
I think it's good to use the existing data stream.
I think it makes sense to add to its own file.
I think an array of objects sounds good.
Just one more question about this. Should |
I found that Fleet sets the default dataset name as I think it's a good idea to make |
Once #535 has been completed.
And this is one example of the contents/syntax of this file:

# "Local" routing rules are included under this current dataset, not a special case
- source_dataset: nginx
# Route error logs to `nginx.error` when they're sourced from an error logfile
- target_dataset: nginx.error
  if: "ctx?.file?.path?.contains('/var/log/nginx/error')"
  namespace:
    - "{{labels.data_stream.namespace}}"
    - default
# Route access logs to `nginx.access` when they're sourced from an access logfile
- target_dataset: nginx.access
  if: "ctx?.file?.path?.contains('/var/log/nginx/access')"
  namespace:
    - "{{labels.data_stream.namespace}}"
    - default
# Route K8s container logs to this catch-all dataset for further routing
- source_dataset: k8s.router
- target_dataset: nginx
  if: "ctx?.container?.image?.name == 'nginx'"
  namespace:
    - "{{labels.data_stream.namespace}}"
    - default
# Route syslog entries tagged with nginx to this catch-all dataset
- source_dataset: syslog
- target_dataset: nginx
  if: "ctx?.tags?.contains('nginx')"
  namespace:
    - "{{labels.data_stream.namespace}}"
    - default
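To make the translation step concrete, here is a minimal Python sketch of how the rules for one source dataset might be turned into `reroute` processor definitions. It is not Fleet's implementation: the function name `rules_to_reroute_processors`, the flat list-of-dicts input, and the choice to reuse the target dataset as the processor `tag` are all assumptions made for illustration.

```python
import json
from typing import Any


def rules_to_reroute_processors(rules: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Build one `reroute` processor definition per routing rule.

    Hypothetical helper: each rule is a dict with `target_dataset` and
    optional `if` and `namespace` keys, mirroring the example above.
    """
    processors = []
    for rule in rules:
        reroute: dict[str, Any] = {
            # Assumption: tag the processor with the target dataset name.
            "tag": rule["target_dataset"],
            "dataset": rule["target_dataset"],
        }
        if "namespace" in rule:
            reroute["namespace"] = rule["namespace"]
        if "if" in rule:
            reroute["if"] = rule["if"]
        processors.append({"reroute": reroute})
    return processors


# Rules for the `nginx` source dataset, copied from the example above.
nginx_rules = [
    {
        "target_dataset": "nginx.error",
        "if": "ctx?.file?.path?.contains('/var/log/nginx/error')",
        "namespace": ["{{labels.data_stream.namespace}}", "default"],
    },
    {
        "target_dataset": "nginx.access",
        "if": "ctx?.file?.path?.contains('/var/log/nginx/access')",
        "namespace": ["{{labels.data_stream.namespace}}", "default"],
    },
]

nginx_processors = rules_to_reroute_processors(nginx_rules)
print(json.dumps(nginx_processors, indent=2))
```

Each rule maps to exactly one processor here, and the rule order is preserved so that the first matching condition wins when the processors run in sequence.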
@mrodm the |
@zmoog Exactly, routing rules are going to be defined in their own file in each data stream (not in the manifest of the data stream):
* Add pipeline failure handler

* Set service.name and service.version

`service.name` should use the value from the label `app.kubernetes.io/name` first, and then fall back to `kubernetes.container.name` if the label is not present. I need to double-check whether I can use the container name as is or whether I need to parse it in some form.

`service.version` uses the value from the label `app.kubernetes.io/version`, if present.

* Add the routing rules

Instead of using the resource processor, we will use the new routing rules[^1] available in 8.10.

The routing rules allow Fleet to build a better pipeline where the custom pipeline is executed *before* routing the document to a different dataset or namespace.

Here's an example of the final pipeline created by Fleet after the integration installation:

```json
[
  {
    "set": {
      "field": "service.name",
      "copy_from": "kubernetes.labels.app_kubernetes_io/name",
      "ignore_empty_value": true
    }
  },
  {
    "set": {
      "field": "service.name",
      "copy_from": "kubernetes.container.name",
      "override": false,
      "ignore_empty_value": true
    }
  },
  {
    "set": {
      "field": "service.version",
      "copy_from": "kubernetes.labels.app_kubernetes_io/version",
      "ignore_empty_value": true
    }
  },
  {
    "pipeline": {
      "name": "logs-kubernetes.container_logs@custom",
      "ignore_missing_pipeline": true
    }
  },
  {
    "reroute": {
      "tag": "kubernetes.container_logs",
      "dataset": [
        "{{kubernetes.labels.elastic_co/dataset}}",
        "{{data_stream.dataset}}",
        "kubernetes.container_logs"
      ],
      "namespace": [
        "{{kubernetes.labels.elastic_co/namespace}}",
        "{{data_stream.namespace}}",
        "default"
      ],
      "if": "ctx?.kubernetes?.labels != null"
    }
  }
]
```

We upgrade the package-spec to 2.9.0 to enable the routing rules.

[^1]: elastic/package-spec#514

refs: #7118

* Expand the rerouting docs

The docs now focus on describing what the routing offers and how users can customize it by setting pod annotations.

We offer an example at definition time (using a deployment) and at runtime (using `kubectl`).

* Mention container-logs routing in the README

The main README file is what most users will see before and after installing the integration.

Adding a short mention of the container-logs routing capability, with a link to the complete docs, could improve the discoverability of this feature without too much noise.

* Docs: add a namespace customization example

Show how to customize the namespace by setting a label on the pod.

* Update docs

Rephrase the Nginx example to avoid ambiguity; the Nginx integration is not required for routing.

Update the pod labels table to avoid ambiguity about the target namespace; it's the data stream namespace, not the k8s namespace.

* Switch from labels to annotations

We learned that Kubernetes annotations are the correct representation for data such as routing rules.

The annotations docs[^1] mention the following use case for annotations:

> "Directives from the end-user to the implementations to modify
> behavior or engage non-standard features."

So, we switch from labels to annotations.

Unfortunately, the Kubernetes provider[^2] does not add annotations to the event out of the box, and we can't enable this on Fleet-managed agents.

So, we decided to make the relevant annotations available in the event by adding fields[^3] using Filebeat processors.

We decided to keep the `app.kubernetes.io/name` and `app.kubernetes.io/version` metadata as labels since the Recommended Labels[^4] document mentions them.

[^1]: https://kubernetes.io/docs/concepts/overview/working-with-objects/annotations/#attaching-metadata-to-objects
[^2]: https://www.elastic.co/guide/en/fleet/current/kubernetes-provider.html
[^3]: https://www.elastic.co/guide/en/beats/filebeat/current/add-fields.html
[^4]: https://kubernetes.io/docs/concepts/overview/working-with-objects/common-labels/

* Explain WHY we added processors for annotations

Add a simple section that explains why we had to add a few Filebeat processors to *export* routing-focused annotations from the Kubernetes provider to the event.

---------

Co-authored-by: Felix Barnsteiner <felixbarny@users.noreply.github.com>
Co-authored-by: Chris Mark <chrismarkou92@gmail.com>
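The `reroute` processor in the example pipeline lists several candidates for `dataset` and `namespace`. As a rough illustration of that fallback order, the Python sketch below picks the first candidate that resolves to a string, treating `{{field}}` entries as field references and anything else as a static value. The helper names `lookup` and `first_resolved` are hypothetical, the sample document is made up, and the sketch skips the character sanitization that Elasticsearch applies to resolved values; it simulates the idea and is not the processor's actual code.

```python
import re
from typing import Any, Optional

# Matches a whole-string field reference such as "{{data_stream.dataset}}".
FIELD_REF = re.compile(r"^\{\{\s*(.+?)\s*\}\}$")


def lookup(doc: dict[str, Any], dotted_path: str) -> Optional[Any]:
    """Walk a nested dict along a dotted path; return None if a step is missing."""
    current: Any = doc
    for key in dotted_path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def first_resolved(candidates: list[str], doc: dict[str, Any]) -> Optional[str]:
    """Return the first candidate that yields a string value for this document."""
    for candidate in candidates:
        match = FIELD_REF.match(candidate)
        if match is None:
            return candidate  # Static value: always usable.
        value = lookup(doc, match.group(1))
        if isinstance(value, str):
            return value  # Field reference resolved to a string.
    return None


dataset_candidates = [
    "{{kubernetes.labels.elastic_co/dataset}}",
    "{{data_stream.dataset}}",
    "kubernetes.container_logs",
]
namespace_candidates = [
    "{{kubernetes.labels.elastic_co/namespace}}",
    "{{data_stream.namespace}}",
    "default",
]

# Made-up document: the pod sets a dataset label but no namespace label.
sample_doc = {
    "kubernetes": {"labels": {"elastic_co/dataset": "nginx.access"}},
    "data_stream": {"dataset": "kubernetes.container_logs", "namespace": "prod"},
}

print(first_resolved(dataset_candidates, sample_doc))
print(first_resolved(namespace_candidates, sample_doc))
```

Putting the static value last in each list means a document always ends up with some dataset and namespace, even when neither the pod metadata nor the `data_stream` fields are present.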
In order to support document-based routing in Fleet, integrations need to expose their routing rules as part of their data stream manifest files. These rules would be translated to `reroute` processors appropriately by Fleet during integration installation.

A routing rule is composed of a few pieces of data:
We'll need to support two types of routing rules defined by an integration:
As far as integrations are concerned, though, there is no meaningful difference between writing a "local" routing rule and an "injected" routing rule in a data stream manifest. Fleet will be responsible for generating the appropriate processors in the appropriate ingest pipelines based on these rules. So, the implementation on the package-spec side will be a generic `routing_rules` object at the data stream manifest level.

Typically, routing rules will be defined on a "catch-all" or "data sink" style dataset like `kubernetes.router` that is essentially a passthrough to more specific data streams.

For example, we might have an `nginx` catch-all dataset that routes Nginx logs to more specific data sets like `nginx.error` and `nginx.access` based on the logfile path reported in each document.

Here's a proposed example of the above in action. Please see the annotative comments for more details:
Fleet support will be implemented as follows: