Add scrapeFailureLogLevel config to smartagent prometheus receivers #3260

Merged
merged 7 commits into from
Jun 16, 2023
Changes from 5 commits
4 changes: 4 additions & 0 deletions CHANGELOG.md
@@ -11,6 +11,10 @@
that the translation in the prometheus receiver is a subject to possible future changes.
([#23229](https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/23229))

### 💡 Enhancements 💡

- (Splunk) `receiver/smartagent`: Add scrapeFailureLogLevel config to smartagent prometheus receivers to support logging scrape failures at different levels ([#3260](https://github.com/signalfx/splunk-otel-collector/pull/3260))
jvoravong marked this conversation as resolved.

## v0.78.1

### 🧰 Bug fixes 🧰
@@ -45,13 +45,26 @@ type Config struct {
 	// (the default).
 	MetricPath string `yaml:"metricPath" default:"/metrics"`

+	// Control the log level to use if a scrape failure occurs when scraping
+	// a target. Modifying this configuration is useful for less stable
+	// targets. Only the debug, info, warn, and error log levels are supported.
+	ScrapeFailureLogLevel string `yaml:"scrapeFailureLogLevel" default:"error"`
Contributor

Is there any README with config explanations and examples? I know this is a smartagent receiver so we may not be as open with usage, but if we're adding a config option it makes sense to me to at least document it somewhere as well. I'm up for discussion though, I don't have as much context for this as others.

It looks like it may be internal/signalfx-agent/pkg/monitors/prometheusexporter/metadata.yaml

Contributor Author

@jvoravong jvoravong Jun 13, 2023

I was wondering the same about updating documentation and examples. Still looking around. I don't think the documentation from the SA repo has been migrated to this repo.

Contributor

@rmfitzpatrick rmfitzpatrick Jun 13, 2023

make docs and the docs directory weren't ported from the agent project since the Go dependencies were the initial concern. Taking some of their doc generation scripts and embedding them as generate directives for <monitor-pkg>/README.md creation would be my vote (in unrelated changes).

Contributor Author

I would support the make docs option. I can create a GitHub ticket and supporting materials for this if it is not already captured in another effort.

I would also support a similar approach where possible in the chart repo for docs automation.


 	// Send all the metrics that come out of the Prometheus exporter without
 	// any filtering. This option has no effect when using the prometheus
 	// exporter monitor directly since there is no built-in filtering, only
 	// when embedding it in other monitors.
 	SendAllMetrics bool `yaml:"sendAllMetrics"`
 }

+func (c *Config) Validate() error {
+	if _, err := logrus.ParseLevel(c.ScrapeFailureLogLevel); err != nil {
+		return err
+	} else {
+		return nil
+	}
+}

 func (c *Config) GetExtraMetrics() []string {
 	// Maintain backwards compatibility with the config flag that existed
 	// prior to the new filtering mechanism.
@@ -74,16 +87,18 @@ type Monitor struct {
 	// If true, IncludedMetrics is ignored and everything is sent.
 	SendAll bool

-	monitorName string
-	logger      logrus.FieldLogger
-	cancel      func()
+	monitorName        string
+	logger             logrus.FieldLogger
+	loggerFailureLevel logrus.Level
+	cancel             func()
 }

 type fetcher func() (io.ReadCloser, expfmt.Format, error)

 // Configure the monitor and kick off volume metric syncing
 func (m *Monitor) Configure(conf *Config) error {
 	m.logger = logrus.WithFields(logrus.Fields{"monitorType": m.monitorName, "monitorID": conf.MonitorID})
+	m.loggerFailureLevel, _ = logrus.ParseLevel(conf.ScrapeFailureLogLevel)
Contributor

Does it make sense for this to be moved to Validate() and set there, so that its validation behavior isn't discarded?

Contributor Author

Good point, done

Contributor

nit: instead of calling ParseLevel twice, I think you could add an unexported logLevel field of type logrus.Level that's set in Validate()

Contributor Author

Updated this, let me know if you wanted something else.
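
For readers following this thread, here is a minimal sketch of the suggestion: parse the level once in Validate() and keep the result on the config so the parse error is never discarded. This is not the code that was merged. To stay runnable with only the standard library, a local parseLevel and Level stand in for logrus.ParseLevel and logrus.Level, and the unexported scrapeFailureLevel field name is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// Level is a stand-in for logrus.Level (assumption: used only so this
// sketch needs no third-party imports).
type Level int

const (
	ErrorLevel Level = iota
	WarnLevel
	InfoLevel
	DebugLevel
)

// parseLevel is a hypothetical stand-in for logrus.ParseLevel.
func parseLevel(s string) (Level, error) {
	switch strings.ToLower(s) {
	case "error":
		return ErrorLevel, nil
	case "warn":
		return WarnLevel, nil
	case "info":
		return InfoLevel, nil
	case "debug":
		return DebugLevel, nil
	}
	return 0, fmt.Errorf("not a valid level: %q", s)
}

// Config mirrors only the field discussed in this thread.
type Config struct {
	ScrapeFailureLogLevel string

	// scrapeFailureLevel is a hypothetical unexported field that caches
	// the parsed level.
	scrapeFailureLevel Level
}

// Validate parses the configured level once and stores it, so callers
// never need to parse again or ignore the error.
func (c *Config) Validate() error {
	level, err := parseLevel(c.ScrapeFailureLogLevel)
	if err != nil {
		return err
	}
	c.scrapeFailureLevel = level
	return nil
}

func main() {
	cfg := &Config{ScrapeFailureLogLevel: "warn"}
	if err := cfg.Validate(); err != nil {
		fmt.Println("invalid config:", err)
		return
	}
	fmt.Println("parsed level:", cfg.scrapeFailureLevel)
}
```

With this shape, Configure() could read the cached field instead of calling the parser a second time and dropping its error.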


 	var bearerToken string

@@ -136,7 +151,8 @@ func (m *Monitor) Configure(conf *Config) error {
 	utils.RunOnInterval(ctx, func() {
 		dps, err := fetchPrometheusMetrics(fetch)
 		if err != nil {
-			m.logger.WithError(err).Error("Could not get prometheus metrics")
+			// The default log level is error, users can configure which level to use
+			m.logger.WithError(err).Log(m.loggerFailureLevel, "Could not get prometheus metrics")
 			return
 		}

@@ -0,0 +1,60 @@
package prometheusexporter

import (
	"errors"
	"testing"
)

func TestConfigValidate(t *testing.T) {
	tests := []struct {
		name               string
		scrapeFailureLevel string
		expectedError      error
	}{
		{
			name:               "Valid log level: debug",
			scrapeFailureLevel: "debug",
			expectedError:      nil,
		},
		{
			name:               "Valid log level: info",
			scrapeFailureLevel: "info",
			expectedError:      nil,
		},
		{
			name:               "Valid log level: warn",
			scrapeFailureLevel: "warn",
			expectedError:      nil,
		},
		{
			name:               "Valid log level: error",
			scrapeFailureLevel: "error",
			expectedError:      nil,
		},
		{
			name:               "Invalid log level",
			scrapeFailureLevel: "badValue",
			expectedError:      errors.New("not a valid logrus Level: \"badValue\""),
		},
	}

	for _, test := range tests {
		t.Run(test.name, func(t *testing.T) {
			config := &Config{
				ScrapeFailureLogLevel: test.scrapeFailureLevel,
			}

			err := config.Validate()

			if test.expectedError != nil {
				if err == nil {
					t.Errorf("Expected error '%s', but got nil", test.expectedError.Error())
				} else if err.Error() != test.expectedError.Error() {
					t.Errorf("Expected error '%s', but got '%s'", test.expectedError.Error(), err.Error())
				}
			} else if err != nil {
				t.Errorf("Expected no error, but got '%s'", err.Error())
			}
		})
	}
}
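
The test above covers the four levels named in the config comment plus one invalid value. Note that logrus.ParseLevel also accepts trace, fatal, and panic, so Validate() as written in this diff would let those through even though the comment says only debug, info, warn, and error are supported. A minimal sketch of a stricter check follows, assuming an allow-list is wanted; the supportedLevels map and the validateScrapeFailureLogLevel helper are hypothetical names, not part of this PR.

```go
package main

import (
	"fmt"
	"strings"
)

// supportedLevels lists the levels named in the ScrapeFailureLogLevel
// comment (hypothetical allow-list, not part of the PR).
var supportedLevels = map[string]bool{
	"debug": true,
	"info":  true,
	"warn":  true,
	"error": true,
}

// validateScrapeFailureLogLevel is a hypothetical helper that rejects any
// level outside the documented set, including levels a general-purpose
// parser would otherwise accept.
func validateScrapeFailureLogLevel(level string) error {
	if !supportedLevels[strings.ToLower(level)] {
		return fmt.Errorf("unsupported scrapeFailureLogLevel %q: expected one of debug, info, warn, error", level)
	}
	return nil
}

func main() {
	for _, level := range []string{"debug", "warn", "fatal", "badValue"} {
		if err := validateScrapeFailureLogLevel(level); err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Println("accepted:", level)
	}
}
```

If this were adopted, a table-driven case for fatal or panic could sit alongside the badValue case in TestConfigValidate.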
14 changes: 8 additions & 6 deletions pkg/receiver/smartagentreceiver/config_test.go
@@ -132,9 +132,10 @@ func TestLoadConfig(t *testing.T) {
 			HTTPConfig: httpclient.HTTPConfig{
 				HTTPTimeout: timeutil.Duration(10 * time.Second),
 			},
-			Host:       "localhost",
-			Port:       5309,
-			MetricPath: "/metrics",
+			Host:                  "localhost",
+			Port:                  5309,
+			MetricPath:            "/metrics",
+			ScrapeFailureLogLevel: "error",
 		},
 		acceptsEndpoints: true,
 	}, etcdCfg)
@@ -328,9 +329,10 @@ func TestLoadConfigWithEndpoints(t *testing.T) {
 			HTTPConfig: httpclient.HTTPConfig{
 				HTTPTimeout: timeutil.Duration(10 * time.Second),
 			},
-			Host:       "localhost",
-			Port:       5555,
-			MetricPath: "/metrics",
+			Host:                  "localhost",
+			Port:                  5555,
+			MetricPath:            "/metrics",
+			ScrapeFailureLogLevel: "error",
 		},
 		acceptsEndpoints: true,
 	}, etcdCfg)
@@ -4,12 +4,12 @@ receivers:
     intervalSeconds: 1
     host: "localhost"
     port: 8000
+    scrapeFailureLogLevel: debug
 exporters:
   otlp:
     endpoint: "${OTLP_ENDPOINT}"
     tls:
       insecure: true
-
 service:
   pipelines:
     metrics:
@@ -6,13 +6,12 @@ receivers:
     port: 8889
     extraDimensions:
       foo: bar
-
+    scrapeFailureLogLevel: error
 exporters:
   otlp:
     endpoint: "${OTLP_ENDPOINT}"
     tls:
       insecure: true
-
 service:
   telemetry:
     metrics: