Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How to declare optional parts of the config? #10266

Open
yurishkuro opened this issue May 26, 2024 · 10 comments
Open

How to declare optional parts of the config? #10266

yurishkuro opened this issue May 26, 2024 · 10 comments

Comments

@yurishkuro
Copy link
Member

Slightly related to the discussion in open-telemetry/opentelemetry-collector-contrib#9478 (cc @mx-psi), but more about the conflict between a desire to have optional configs which nonetheless have defaults. Most obvious example is OTLP Receiver, which is often configured like this:

receivers:
  otlp:
    protocols:
      grpc:
      http:

Here the grpc/http subsections do not have any fields, in particular port numbers, because they are provided by createDefaultConfig(). However, it makes it look like one could simple remove e.g. http to disable creating an HTTP server, which is not actually true - because the default config still populates http even if it's absent in the config.

Now if there were no defaults, this would work fine - both http and grpc fields are declared as pointers and thus could be nil, indicating an absence. But this is completely circumvented by the createDefaultConfig() mechanism.

Motivation

  1. In case of OTLP exporter I might not want to support HTTP endpoint. Right now there's no way to disable it afaiu.
  2. In Jaeger components we often have a need to define alternate sub-sections of the config, e.g. remote sampling extension can be backed by a file or by an adaptive calculator. Each of the options requires its own configuration with defaults (e.g. file refresh interval).

Solution?

I don't have a clear one, but it would be nice to have more granular createDefault functions that would only be called once the config is read and a given section is seen as non-empty in the config. As I understand the first pass of the config parsing is by reading YAML into nested maps, which are then decoded into the objects.

@mx-psi
Copy link
Member

mx-psi commented May 27, 2024

However, it makes it look like one could simple remove e.g. http to disable creating an HTTP server, which is not actually true - because the default config still populates http even if it's absent in the config.

The way we work around this on the OTLP receiver is

if !conf.IsSet(protoGRPC) {
cfg.GRPC = nil
}
if !conf.IsSet(protoHTTP) {
cfg.HTTP = nil

I don't love that approach, but it has worked so far

@yurishkuro
Copy link
Member Author

Thanks for the pointer. Indeed, I was wrong in the issue description - OTLP receiver does disable http if the key is removed from the config. We will try to use the same approach in the Jaeger components.

If there are no plans or possibilities to have a better solution, feel free to close this issue.

@evan-bradley
Copy link
Contributor

I would prefer to solve this with a declarative solution instead of requiring manual Unmarshal implementations to do this. I see two possible approaches here:

  1. Implement a confmap.Optional type that implements its own Unmarshal method that checks the presence of the field. This would work similar to optional types in other languages and would provide methods for checking both field presence and getting the value of the underlying field.
  2. Put a struct field tag on optional fields and have the unmarshaler check for these when unmarshaling the struct.

I think both of these have trade-offs and I'm not sure which one is the better approach.

@yurishkuro
Copy link
Member Author

@evan-bradley re (2), how would that be different than just having a pointer? I think we always need a structural distinction between required and optional sub-configs, which is achieved via Optional type in (1).

(1) sounds like a good path. I supposed the components would still need to define the complete default configuration, but the confmap.Optional[SubConfig] points would erase those if not set in the file.

@evan-bradley
Copy link
Contributor

re (2), how would that be different than just having a pointer?

The problem is that we can also have pointers to config structs that are intended to be references and not just optional values. Combined with default values, this makes it ambiguous to the unmarshaler whether the value is required but has a default, or is truly optional and the pointer should be set to nil when the key isn't present. We would need to find a way to describe that extra bit of information.

Overall I like (1) more since the handling is more explicit, but it may come with trade-offs that make (2) a better overall solution.

@mx-psi
Copy link
Member

mx-psi commented May 29, 2024

@yurishkuro If you are willing to own this, I think we would be happy to review a PR that adds a wrapper like this to opentelemetry-collector (the concrete package where this would live is to be decided I guess, but we can discuss this on the PR).

@evan-bradley
Copy link
Contributor

evan-bradley commented May 29, 2024

Not sure if it helps, but OTTL has an Optional type we use, just as an example of how something like this may work. You can see it here.

@yurishkuro
Copy link
Member Author

I fiddled with this a bit in #10260

BTW, don't know why I opened the issue in contrib, we should move it to collector.

The issue I bumped into is that I could not find a good place to execute an override to erase default values if the input config omits the corresponding config section. The main confmap code uses decoder hooks, which fire at two relevant occasions

  1. first at the overall struct, e.g. otlpreceiver.Protocols which may have a field HTTP: Optional[HTTPConfig]
  2. they they fire for each field. This is where the custom Optional.Unmarshal is called, which is needed to delegate parsing to the underlying struct

The issue with (2) is that it only fires when there is a corresponding entry in the input config, so cannot be used to unset the default values already in the struct.

(1) is equivalent to the custom unmarshaler #10266. This could be the hook that we need, but it needs to:

  1. loop through all fields of a struct via reflection (including flattened/embedded fields)
  2. find those that are Optional
  3. find the corresponding key in the map (which can come from field tags)
  4. if the key is not set then erase the Optional

All of that is of course doable, but manually / from scratch (especially step 3), as all the same functionality done by mapstructure is private.

I also looked into using Metadata that can be populated by mapstructure.Decoder. It provides the name of the fields that were "unset" in the config. However, those keys are still the YAML names, so it's not easy to map them back to the corresponding field in the struct.

@evan-bradley evan-bradley transferred this issue from open-telemetry/opentelemetry-collector-contrib May 30, 2024
@mx-psi
Copy link
Member

mx-psi commented Jun 5, 2024

The issue with (2) is that it only fires when there is a corresponding entry in the input config, so cannot be used to unset the default values already in the struct.

One option is to put it on the configuration consumer to check if the value is none. I think this is what we currently do on the fields where we use 'pointer to T'.

Another option is to make a type that stores the default value (so, the same as your Optional but with a default value if unset). I don't have a good name for that (WithDefault[T]? WithFallback[T]?), but it would be simple to implement.

yurishkuro added a commit to yurishkuro/opentelemetry-collector that referenced this issue Jul 14, 2024
@yurishkuro
Copy link
Member Author

@mx-psi WithDefault[T]() was a good idea, but it didn't work either. When an item is defined in YAML but without any value, i.e. item: with no content, then mapstructure treats the value as nil, and does not perform any processing. As a result, the custom Unmarshal() is still not being called.

See test https://github.com/open-telemetry/opentelemetry-collector/pull/10260/files#diff-24c5961a7871021bd9801008aa5f1fad72efd43f8783a00c7d09dc52c4c64e0dR81

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants