Allow users to specify dataset and namespace for rerouting data collected by kubernetes.container_logs datastream #6845

gsantoro · 2023-07-06T15:37:56Z

As a follow-up to #5988, we want to avoid adding default routing rules in our integrations. Instead, we would like to give users the ability to specify labels on Kubernetes containers that are then used to reroute the traffic.

In contrast to previous considerations, we’re not creating a data stream per service name as this would lead to a lot of overhead for customers with thousands of services.

The reroute processor was introduced in this PR elastic/elasticsearch#76511 and it has been available since 8.8.0.

Currently, you can add the reroute processor manually to an ingest pipeline with an if condition to reroute traffic to a destination dataset and namespace.

This is quite a manual process at the moment.
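To make the manual step described above concrete, here is a minimal sketch of a pipeline body containing one conditional reroute processor, built as a Python dictionary so it can be serialized to JSON. The `if`, `dataset`, and `namespace` options are reroute processor settings; the condition on `container.name` and the target values `nginx` and `production` are hypothetical and only illustrate the shape, and `build_reroute_pipeline` is a helper name invented for this example.

```python
import json


def build_reroute_pipeline(condition, dataset, namespace):
    """Return an ingest pipeline body with a single conditional reroute processor."""
    return {
        "description": "Manually reroute matching documents (illustrative only)",
        "processors": [
            {
                "reroute": {
                    # Painless condition; the processor only runs when it is true.
                    "if": condition,
                    # Destination dataset and namespace for matching documents.
                    "dataset": dataset,
                    "namespace": namespace,
                }
            }
        ],
    }


# Hypothetical condition and target values, chosen only for illustration.
pipeline = build_reroute_pipeline(
    condition="ctx.container?.name == 'nginx'",
    dataset="nginx",
    namespace="production",
)
print(json.dumps(pipeline, indent=2))
```

Every new routing target needs another hand-written processor like this one, which is the overhead the proposal below tries to remove.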

In order to improve the experience for the end user we could:

- Define standard Kubernetes labels (for example elastic.co/dataset and elastic.co/namespace) that, if present, are used to reroute the traffic automatically, without the user having to define a custom pipeline. The values of those labels would end up in the fields data_stream.dataset and data_stream.namespace, and a default routing rule would use them to reroute the traffic. Since the reroute processor has to be added to an ingest pipeline, integrations that use those Kubernetes labels need an ingest pipeline that checks for the presence of those labels and reroutes when they are present. We will have to evaluate the performance hit of always running these extra steps. Since the benefits are quite significant, it may be worth it.
- Extract the fields service.name and service.version from the well-known Kubernetes labels app.kubernetes.io/name and app.kubernetes.io/version. If those labels are not provided, infer the service name from the field container.name and leave out the service.version field.
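The label-based routing proposed above can be sketched as a small Python function. This is only a model of the intended behavior, not the integration's pipeline: the label names are the examples from this issue, the defaults (`kubernetes.container_logs` and `default`) are assumptions for illustration, and `resolve_data_stream` is a hypothetical helper. The only external fact relied on is the data stream naming scheme `<type>-<dataset>-<namespace>`.

```python
# Example label names proposed in this issue.
DATASET_LABEL = "elastic.co/dataset"
NAMESPACE_LABEL = "elastic.co/namespace"


def resolve_data_stream(labels, default_dataset, default_namespace, stream_type="logs"):
    """Pick the target data stream from labels, falling back to the given defaults."""
    # A missing or empty label leaves the default value in place.
    dataset = labels.get(DATASET_LABEL) or default_dataset
    namespace = labels.get(NAMESPACE_LABEL) or default_namespace
    return f"{stream_type}-{dataset}-{namespace}"


# Container with both labels set: traffic is rerouted.
labeled = resolve_data_stream(
    {DATASET_LABEL: "checkout", NAMESPACE_LABEL: "staging"},
    default_dataset="kubernetes.container_logs",
    default_namespace="default",
)

# Container without labels: traffic stays in the default data stream.
unlabeled = resolve_data_stream(
    {},
    default_dataset="kubernetes.container_logs",
    default_namespace="default",
)

print(labeled)
print(unlabeled)
```

The two labels are resolved independently, so a container that sets only the dataset label keeps the default namespace.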


felixbarny · 2023-07-13T08:36:10Z

This is unblocked now that specifying local routing rules is supported in the package spec and in Fleet.

The minimum stack version has to be set to 8.10, though, as that's the version where Fleet supports that feature.

If available, the reroute processor uses the pod's dataset and namespace labels, falling back to the values configured in the agent policy. refs: elastic#6845

`service.name` should use the value from the label `app.kubernetes.io/name` first, and then fall back to `kubernetes.container.name` if the label is not present. I need to double-check whether I can use the container name as is or whether I need to parse it in some form. `service.version` uses the value from the label `app.kubernetes.io/version`, if present. refs: elastic#6845
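The fallback order for the service fields described above can be sketched as follows. This is a plain Python model under stated assumptions, not the actual ingest pipeline: `extract_service_fields` is a hypothetical helper, and the container name is used as is, since whether it needs parsing is still open in the comment above.

```python
NAME_LABEL = "app.kubernetes.io/name"
VERSION_LABEL = "app.kubernetes.io/version"


def extract_service_fields(labels, container_name=None):
    """Derive service.name and service.version from pod labels and the container name."""
    fields = {}

    # Prefer the well-known label; otherwise use the container name unchanged.
    name = labels.get(NAME_LABEL) or container_name
    if name:
        fields["service.name"] = name

    # service.version has no fallback: it is set only when the label exists.
    version = labels.get(VERSION_LABEL)
    if version:
        fields["service.version"] = version

    return fields


with_labels = extract_service_fields(
    {NAME_LABEL: "checkout", VERSION_LABEL: "1.4.2"},
    container_name="checkout-container",
)
without_labels = extract_service_fields({}, container_name="checkout-container")

print(with_labels)
print(without_labels)
```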

gsantoro self-assigned this Jul 6, 2023

gsantoro mentioned this issue Jul 6, 2023

Add default routing rules to sink-type integrations #5988

Closed

gsantoro changed the title ~~Allow users to add custom routing rules in an integration~~ Allow users to specify dataset and namespace for rerouting data collected by kubernetes.container_logs datastream Jul 7, 2023

gsantoro assigned zmoog Jul 10, 2023

felixbarny mentioned this issue Jul 13, 2023

[Discuss] Add service.name to service based integrations #6295

Open

zmoog mentioned this issue Jul 24, 2023

[Kubernetes] Reroute container logs based on pod annotations #7118

Merged

14 tasks

felixbarny linked a pull request Jul 28, 2023 that will close this issue

[Kubernetes] Reroute container logs based on pod annotations #7118

Merged

14 tasks

zmoog added a commit to zmoog/integrations that referenced this issue Sep 5, 2023

Add ingest pipeline for rerouting

4637d62

zmoog closed this as completed in #7118 Sep 13, 2023
