[connector/routing] When matching multiple conditions, build a new consumer each time. #29882
Comments
Pinging code owners for connector/routing: @jpkrohling @mwear. See Adding Labels via Comments if you do not have permissions to add labels yourself.
@mwear, are you interested in taking a look at this one?
This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments.
I've been working on the routing connector lately and, revisiting this issue, I believe it is a very difficult one to solve performantly. When using I'm a bit skeptical that this is really worth it but I'd like to hear from others. Does anyone actually rely on

For what it's worth, I believe it's technically possible to achieve the

routing:
  match_once: false
  table:
    - condition: attributes["env"] == "prod"
      pipelines: [ logs/prod ]
    - condition: attributes["env"] == "dev"
      pipelines: [ logs/dev ]
    - condition: attributes["region"] == "east"
      pipelines: [ logs/east ]
    - condition: attributes["region"] == "west"
      pipelines: [ logs/west ]

If there is no default case that needs to be handled, then the user can route the logs to two separate routers:

routing/env:
  match_once: true
  table:
    - condition: attributes["env"] == "prod"
      pipelines: [ logs/prod ]
    - condition: attributes["env"] == "dev"
      pipelines: [ logs/dev ]
routing/region:
  match_once: true
  table:
    - condition: attributes["region"] == "east"
      pipelines: [ logs/east ]
    - condition: attributes["region"] == "west"
      pipelines: [ logs/west ]

If there is a default case, then it is a bit trickier, but the user can enumerate the combinations of cases:

routing:
  match_once: true
  table:
    - condition: attributes["env"] == "prod" and attributes["region"] == "east"
      pipelines: [ logs/prod, logs/east ]
    - condition: attributes["env"] == "prod" and attributes["region"] == "west"
      pipelines: [ logs/prod, logs/west ]
    - condition: attributes["env"] == "dev" and attributes["region"] == "east"
      pipelines: [ logs/dev, logs/east ]
    - condition: attributes["env"] == "dev" and attributes["region"] == "west"
      pipelines: [ logs/dev, logs/west ]
  default_pipelines: [ logs/default ]

This may not be scalable if there are many dimensions, but another way to solve it would be to place a "default router" before the handoff to the one-dimension routers. This "default router" would have almost exactly the same config as the original

routing:
  match_once: true
  table:
    - condition: attributes["env"] == "prod"
      pipelines: [ logs/env, logs/region ] # forward to both "routing/env" and "routing/region"
    - condition: attributes["env"] == "dev"
      pipelines: [ logs/env, logs/region ]
    - condition: attributes["region"] == "east"
      pipelines: [ logs/env, logs/region ]
    - condition: attributes["region"] == "west"
      pipelines: [ logs/env, logs/region ]
  default_pipelines: [ logs/default ] # handle logs that matched no dimensions
routing/env:
  match_once: true
  table:
    - condition: attributes["env"] == "prod"
      pipelines: [ logs/prod ]
    - condition: attributes["env"] == "dev"
      pipelines: [ logs/dev ]
routing/region:
  match_once: true
  table:
    - condition: attributes["region"] == "east"
      pipelines: [ logs/east ]
    - condition: attributes["region"] == "west"
      pipelines: [ logs/west ]

Let's hear from others first, but if there are no objections I propose we should deprecate the option by approximating a feature gate:
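To make the comparison between the configurations above concrete, here is a minimal Python model of how a routing table might be evaluated. It is only a sketch: the `route` function, the lambda conditions, and the way the default is applied are illustrative assumptions, not the connector's actual Go implementation. It checks a single record that carries both `env` and `region`; records that match only one dimension are not covered by this check.

```python
from typing import Callable, Dict, List, Set, Tuple

Attrs = Dict[str, str]
Route = Tuple[Callable[[Attrs], bool], List[str]]


def route(attrs: Attrs, table: List[Route], default: List[str], match_once: bool) -> Set[str]:
    """Return the pipelines a record is sent to under this simplified model."""
    selected: Set[str] = set()
    for condition, pipelines in table:
        if condition(attrs):
            selected.update(pipelines)
            if match_once:
                break  # the first matching row wins
    # Fall back to the default pipelines only when no row matched.
    return selected or set(default)


# Four independent rows, evaluated with match_once: false.
multi_match: List[Route] = [
    (lambda a: a.get("env") == "prod", ["logs/prod"]),
    (lambda a: a.get("env") == "dev", ["logs/dev"]),
    (lambda a: a.get("region") == "east", ["logs/east"]),
    (lambda a: a.get("region") == "west", ["logs/west"]),
]

# Every env/region combination enumerated, evaluated with match_once: true.
enumerated: List[Route] = [
    (lambda a, e=e, r=r: a.get("env") == e and a.get("region") == r,
     [f"logs/{e}", f"logs/{r}"])
    for e in ("prod", "dev")
    for r in ("east", "west")
]

record = {"env": "prod", "region": "east"}
multi = route(record, multi_match, [], match_once=False)
single = route(record, enumerated, ["logs/default"], match_once=True)
print(sorted(multi), sorted(single))
```

For this record both tables select the same two pipelines, while a record with neither attribute falls through to `logs/default` in the enumerated table.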
My team relies on |
Thanks @sirianni. What do you think about the alternatives that I proposed above? If you want to share an example configuration, I would be interested in trying to convert it to produce the same result as |
It's doable, but, as you said, forces the user to enumerate all of the possible combinations explicitly which increases configuration complexity. As you pointed out, the alternative would be to have the code itself build a |
It's worse than that actually, because we also need to create a mechanism for determining which of these compound consumers needs to be used. |
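One way to picture the lookup mechanism mentioned here is a cache keyed by the set of matched routes. The sketch below is a hypothetical Python model, not code from the connector: `ROUTES`, `compound_consumer`, and `max_compound_consumers` are made-up names, and a tuple of pipeline names stands in for a real compound consumer.

```python
from typing import Dict, FrozenSet, List, Tuple

# Hypothetical route table: route index -> pipelines for that route.
ROUTES: List[List[str]] = [
    ["logs/prod"],
    ["logs/dev"],
    ["logs/east"],
    ["logs/west"],
]

# Cache of compound consumers, keyed by the set of matched route indices.
_cache: Dict[FrozenSet[int], Tuple[str, ...]] = {}


def compound_consumer(matched: FrozenSet[int]) -> Tuple[str, ...]:
    """Return the de-duplicated pipelines for a set of matched routes, built lazily."""
    if matched not in _cache:
        pipelines: List[str] = []
        for idx in sorted(matched):
            for name in ROUTES[idx]:
                if name not in pipelines:
                    pipelines.append(name)
        _cache[matched] = tuple(pipelines)
    return _cache[matched]


def max_compound_consumers(num_routes: int) -> int:
    """Upper bound on distinct non-empty match sets for num_routes routes."""
    return 2 ** num_routes - 1


print(compound_consumer(frozenset({0, 2})))
print(max_compound_consumers(len(ROUTES)))
```

The upper bound grows exponentially with the number of routes, which is one way to see why building these consumers ahead of time is costly, even though many combinations can never occur in practice.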
For reference, this was added as part of this PR and does NOT exist as part of the processor: |
As much as I don't want to break anyone and would like to support the At a minimum, I think we should proceed with deprecation and changing the default to Thoughts? |
I agree that the best course of action is to deprecate/remove that feature, at least for the moment. |
This PR deprecates the `match_once` parameter. It defines a multi-step process which hopefully gives users plenty of time to make necessary changes. It also provides several detailed examples of how to migrate a configuration. Resolves #29882
@djaglowski thanks for your work on this component. I'm thinking through this and going around in circles a bit 🙂. Is there a practical difference between

routing/region:
  match_once: true
  table:
    - condition: attributes["region"] == "east"
      pipelines: [ logs/east ]
    - condition: attributes["region"] == "west"
      pipelines: [ logs/west ]

vs.

  ...
    exporters: [ forward/logs ]
  logs_east:
    receivers: [ forward/logs ]
    processors: [ filter/logs_east ]
    exporters: [ ... ]
  logs_west:
    receivers: [ forward/logs ]
    processors: [ filter/logs_west ]
    exporters: [ ... ]
…36824) This PR deprecates the `match_once` parameter. It defines a multi-step process which hopefully gives users plenty of time to make necessary changes. It also provides several detailed examples of how to migrate a configuration. Resolves open-telemetry#29882
@sirianni, thanks for understanding the need here. There is a slight difference between using routing vs forward/filter, in that forward/filter will cause data to be copied. (N-1 copies will be made, where N is the number of pipelines you are forwarding to.) This may be less of a concern if you are migrating from |
🤔 I thought it didn't make copies which is this bug...? |
I guess I'm poking a bit on this part from your README

service:
  pipelines:
    logs/in::exporters: [routing/env, routing/region]

Why would this happen given that the
I'm also confused about this. The forwardconnector also sets MutatesData: false so how does it ensure that each destination pipeline gets its own copy of the data? |
Related question which can be a separate issue. Is the other (match multiple routes) approach implemented correctly? I am wondering if we are not properly fanning out the data.
For example, if we match routes 1 and 2, we cannot send the same copy of the data to both consumers. We need to use a fanout. In other words, we should build a consumer which contains all the pipelines in both 1 and 2, and send to that.
Originally posted by @djaglowski in #28888 (comment)
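The concern about sending the same copy of the data to two consumers can be illustrated with a small Python model. This is a simplified sketch, not the collector's fanout consumer: `fanout`, `redact`, and `store` are hypothetical names, and the copy policy (copy for every consumer except the last, giving N-1 copies for N consumers) is an assumption chosen to match the copy count mentioned earlier in the thread.

```python
import copy
from typing import Callable, Dict, List

Record = Dict[str, str]
Consumer = Callable[[Record], None]


def fanout(record: Record, consumers: List[Consumer]) -> int:
    """Deliver the record to every consumer and return the number of copies made."""
    copies = 0
    for i, consume in enumerate(consumers):
        if i == len(consumers) - 1:
            consume(record)  # the last consumer may take the original
        else:
            consume(copy.deepcopy(record))  # earlier consumers get their own copy
            copies += 1
    return copies


received: Dict[str, Record] = {}


def redact(record: Record) -> None:
    """Stand-in for a pipeline that mutates the data it receives."""
    record["body"] = "REDACTED"
    received["logs/prod"] = record


def store(record: Record) -> None:
    """Stand-in for a pipeline that only reads the data."""
    received["logs/east"] = record


copies = fanout({"body": "hello"}, [redact, store])
print(copies, received["logs/prod"]["body"], received["logs/east"]["body"])
```

Because `redact` works on its own copy, the mutation does not leak into what `store` receives. Passing the same object to both consumers would let the redaction show up in the second pipeline as well.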