Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mdatagen: add wildcard name matching for configs #10065

Closed
wants to merge 8 commits into from

Conversation

braydonk
Copy link
Contributor

@braydonk braydonk commented May 1, 2024

Description

This PR adds support for wildcard name matching.

Generated MetricsConfig and ResourceAttributesConfig will now support providing names not just as full names, but also using * wildcards and multimatching with {x,y,etc}. This allows you to apply configs to groups of metrics and resource attributes, simplifying configs.

Link to tracking issue

Fixes #10074

Testing

I go installed mdatagen from this PR and ran it on a receiver in opentelemetry-collector-contrib and was able to use wildcard matching to enable/disable groups of metrics as expected.

Documentation

I have written a specification in a gist: https://gist.github.com/braydonk/ccb6775331fdd5dca91a650330b9839f

I am not sure where this documentation should live.

@braydonk braydonk requested review from a team and Aneurysm9 May 1, 2024 19:26
@braydonk braydonk force-pushed the wildcard_names branch 2 times, most recently from 026ad18 to 2ed7ec4 Compare May 1, 2024 20:07
Copy link

codecov bot commented May 1, 2024

Codecov Report

Attention: Patch coverage is 33.33333% with 8 lines in your changes missing coverage. Please review.

Project coverage is 92.43%. Comparing base (c7c3401) to head (d82df38).
Report is 119 commits behind head on main.

Current head d82df38 differs from pull request most recent head a522523

Please upload reports for the commit a522523 to get more accurate results.

Files Patch % Lines
cmd/mdatagen/main.go 33.33% 4 Missing and 4 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #10065      +/-   ##
==========================================
- Coverage   92.59%   92.43%   -0.16%     
==========================================
  Files         387      387              
  Lines       18198    18264      +66     
==========================================
+ Hits        16850    16883      +33     
- Misses       1007     1031      +24     
- Partials      341      350       +9     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@braydonk
Copy link
Contributor Author

braydonk commented May 1, 2024

I think that code coverage change is fine here because none of the return err cases for template file generation are covered anyway.

@braydonk
Copy link
Contributor Author

braydonk commented May 2, 2024

The changelog issue can only be resolved once the issue is moved from contrib to here.

cmd/mdatagen/templates/match.go.tmpl Outdated Show resolved Hide resolved
cmd/mdatagen/templates/match.go.tmpl Show resolved Hide resolved
cmd/mdatagen/templates/match.go.tmpl Outdated Show resolved Hide resolved
Copy link
Contributor

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions github-actions bot added the Stale label May 19, 2024
@dmitryax dmitryax removed the Stale label May 21, 2024
@dmitryax
Copy link
Member

@braydonk, are you still working on this? Just checking. Let me know if you have any concerns with my comments

@braydonk
Copy link
Contributor Author

Hi @dmitryax yes I am. I realized that I forgot to mention that I am writing a full spec to explain how it works, and while I was doing that found a number of edge cases that prompted me to rewrite the parser. Taking longer than expected. I am still working on this though and will keep it unstale.

braydonk added 4 commits May 28, 2024 15:59
This PR adds support for wildcard name matching.

Generated MetricsConfig and ResourceAttributesConfig will now support
providing names not just as full names, but also using `*` wildcards and
multimatching with `{x,y,etc}`. This allows you to apply configs to
groups of metrics and resource attributes, simplifying configs.

Signed-off-by: braydonk <braydonk@google.com>
@braydonk
Copy link
Contributor Author

braydonk commented May 28, 2024

What the actual errors look like from collector output is not super obvious from the PR, so here are some examples. I used the hostmetrics process scraper in my example.

Not matching any names with the patterns

hostmetrics:
  collection_interval: 10s
  scrapers:
    process:
      metrics:
        "nothing.*":
          enabled: false
$ ./otelcontribcol --config config.yaml 
Error: failed to get config: cannot unmarshal the configuration: 1 error(s) decoding:

* error decoding 'receivers': error reading configuration for "hostmetrics": error reading settings for scraper type "process": 1 error(s) decoding:

* error decoding 'metrics': specified patterns don't match any names
2024/05/28 16:06:44 collector server run finished with error: failed to get config: cannot unmarshal the configuration: 1 error(s) decoding:

* error decoding 'receivers': error reading configuration for "hostmetrics": error reading settings for scraper type "process": 1 error(s) decoding:

* error decoding 'metrics': specified patterns don't match any names

Parsing errors in a pattern

hostmetrics:
  collection_interval: 10s
  scrapers:
    process:
      metrics:
        "a_.*":
          enabled: false
$ ./otelcontribcol --config config.yaml
Error: failed to get config: cannot unmarshal the configuration: 1 error(s) decoding:

* error decoding 'receivers': error reading configuration for "hostmetrics": error reading settings for scraper type "process": 1 error(s) decoding:

* error decoding 'metrics': pattern parsing error:
a_.*
  ^
invalid pattern: unexpected token
2024/05/28 16:06:52 collector server run finished with error: failed to get config: cannot unmarshal the configuration: 1 error(s) decoding:

* error decoding 'receivers': error reading configuration for "hostmetrics": error reading settings for scraper type "process": 1 error(s) decoding:

* error decoding 'metrics': pattern parsing error:
a_.*
  ^
invalid pattern: unexpected token

@braydonk
Copy link
Contributor Author

I'm gonna wait to address the conflicts, cause this is just going to keep coming up continuously throughout the review.

Copy link
Contributor

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions github-actions bot added the Stale label Jun 12, 2024
@braydonk
Copy link
Contributor Author

Not stale

@dmitryax dmitryax removed the Stale label Jun 13, 2024
@@ -10,6 +10,12 @@ import (
)
{{ if .Metrics -}}

var MetricNames = []string{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need to expose this variable if it's not intended to be used outside. It can be rendered even right in expandPatternMap

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch thanks, this was an accident

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in a758241

I couldn't render directly in expandPatternMap because there could be either metrics or resource attributes passed in. I render both arrays more locally to their usage though instead of as globals.

// or end of input are valid.
if !p.isFinished() {
curr, _ := p.current()
if curr != '.' {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we allow a group as a whole to be either an identifier, a group, or a wildcard, can we use strings.Split(pattern, '.') and just read each part instead of having this loop? I think it should simplify the code.

I like the approach you took. However, even if we do the parsing ourselves, I'd like to keep the code simpler if possible.

Copy link
Contributor Author

@braydonk braydonk Jun 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll give that a shot. When I first tried that I was having trouble making nice error messages; currently the error messages can track the exact token that is causing a problem. I think I can come up with a way to preserve that though, will give it a try soon

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried this but I found it was becoming equally as complex. The complexity saved by this approach is:

  • Shared scanner object, so I'm not needing to pass around indexes for context to different spots
  • I can share identifier parsing code between an identifier as a root group, and an identifier as part of a multimatch

How strongly do you feel on this? I can try to make it work if you think it's the way to go.

@@ -124,6 +124,23 @@ func run(ymlPath string) error {
toGenerate[filepath.Join(tmplDir, "resource.go.tmpl")] = filepath.Join(codeDir, "generated_resource.go")
toGenerate[filepath.Join(tmplDir, "resource_test.go.tmpl")] = filepath.Join(codeDir, "generated_resource_test.go")
}
if err = generateFile(filepath.Join(tmplDir, "match.go.tmpl"),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will be generated for every component in contrib. We need to generate these files only for scrapers len(md.Metrics) > 0.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in f416166

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correction in a522523

Copy link
Contributor

github-actions bot commented Jul 9, 2024

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions github-actions bot added the Stale label Jul 9, 2024
Copy link
Contributor

Closed as inactive. Feel free to reopen if this PR is still being worked on.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[cmd/mdatagen] Ability to disable/enable all metrics
2 participants