Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add scrapeFailureLogLevel config to smartagent prometheus receivers #3260

Merged
merged 7 commits into from
Jun 16, 2023

Conversation

jvoravong
Copy link
Contributor

  • Adds a new configuration to the smartagent prometheus receivers to log scrape failures at different levels.
  • Prometheus by default logs scrape failures at the debug level (see Debug logs for failed scrapes prometheus/prometheus#2820). The Splunk Otel Collector by default logs scrape failures at the error level.
  • By adding this new configuration we can reduce unnecessary logging noise.

…o support logging scrape failures at different levels
@jvoravong jvoravong requested review from a team as code owners June 13, 2023 15:31
Copy link
Contributor

@crobert-1 crobert-1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approval based on failing tests being fixed.

// Control the log level to use if a scrape failure occurs when scraping
// a target. Modifying this configuration is useful for less stable
// targets. Only the debug, info, warn, and error log levels are supported.
ScrapeFailureLogLevel string `yaml:"scrapeFailureLogLevel" default:"error"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any README with config explanations and examples? I know this is a smartagent receiver so we may not be as open with usage, but if we're adding a config option it makes sense to me at least document it somewhere as well. I'm up for discussion though, I don't have as much context for this as others.

It looks like it may be internal/signafx-agent/pkg/monitors/prometheusexporter/metadata.yaml

Copy link
Contributor Author

@jvoravong jvoravong Jun 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was wondering the same about updating documentation and examples. Still looking around. I don't think the documentation from the SA repo has been migrated to this repo.

Copy link
Contributor

@rmfitzpatrick rmfitzpatrick Jun 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make docs and the docs directory weren't ported from the agent project since the go dependencies were the initial concern. Taking some of their doc generation scripts and embedding them as generate directives for <monitor-pkg>/README.md creation* would be my vote (in unrelated changes).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would support the make docs option, I can create a gh ticket and supporting materials for this if it is not already captured in another effort.

Would support a similar approach where possible in the chart repo for docs automation.

}

type fetcher func() (io.ReadCloser, expfmt.Format, error)

// Configure the monitor and kick off volume metric syncing
func (m *Monitor) Configure(conf *Config) error {
m.logger = logrus.WithFields(logrus.Fields{"monitorType": m.monitorName, "monitorID": conf.MonitorID})
m.loggerFailureLevel, _ = logrus.ParseLevel(conf.ScrapeFailureLogLevel)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense for this be moved to Validate() and have the set there to not discard its validation behavior?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, done

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, Instead of calling twice I think you could add an unexpected logLevel logrus level field that's set in Validate()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated this, let me know if you wanted something else.

@rmfitzpatrick
Copy link
Contributor

Please add unit tests

CHANGELOG.md Outdated Show resolved Hide resolved
jvoravong and others added 2 commits June 16, 2023 07:44
Co-authored-by: Ryan Fitzpatrick <10867373+rmfitzpatrick@users.noreply.github.com>
@jvoravong jvoravong merged commit 9650f3e into signalfx:main Jun 16, 2023
@jvoravong jvoravong deleted the OTL-Prometheus-Update branch June 16, 2023 19:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants