Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[processor/memory_limiter] Negative limit values are not rejected #9060

Closed
dmitryax opened this issue Dec 11, 2023 · 4 comments · Fixed by #9169
Closed

[processor/memory_limiter] Negative limit values are not rejected #9060

dmitryax opened this issue Dec 11, 2023 · 4 comments · Fixed by #9169
Labels
bug Something isn't working help wanted Good issue for contributors to OpenTelemetry Service to pick up processor/memorylimiter

Comments

@dmitryax
Copy link
Member

Describe the bug
Negative limit values in the configuration are accepted and converted into 0xffffffff - val

Steps to reproduce
Use config

  memory_limiter:
    limit_mib: -1
    spike_limit_mib: -2
    check_interval: 5s

What did you expect to see?
Collector refuse to start with an error message.

What did you see instead?
Collector is started with huge limits

2023-12-10T17:17:16.225-0800	info	service@v0.90.1/telemetry.go:86	Setting up own telemetry...
2023-12-10T17:17:16.225-0800	info	service@v0.90.1/telemetry.go:203	Serving Prometheus metrics	{"address": ":8888", "level": "Basic"}
2023-12-10T17:17:16.225-0800	info	exporter@v0.90.1/exporter.go:275	Development component. May change in the future.	{"kind": "exporter", "data_type": "metrics", "name": "debug"}
2023-12-10T17:17:16.226-0800	info	memorylimiterprocessor@v0.90.1/memorylimiter.go:102	Memory limiter configured	{"kind": "processor", "name": "memory_limiter", "pipeline": "metrics", "limit_mib": 4294967295, "spike_limit_mib": 4294967294, "check_interval": 5}

What version did you use?
Version: v0.91.1

@dmitryax dmitryax added bug Something isn't working processor/memorylimiter help wanted Good issue for contributors to OpenTelemetry Service to pick up labels Dec 11, 2023
@crobert-1
Copy link
Member

crobert-1 commented Dec 19, 2023

Investigation
I was able to look into this a bit. The conversion instead of failure is happening in the mapstructure library because we're setting WeaklyTypedInput to true. The description of the impact of WeaklyTypedInput can be found here.

I was able to confirm setting WeaklyTypedInput:false caused a failure in the config decoding:

2023/12/18 15:53:13 collector server run finished with error: failed to get config: cannot unmarshal the configuration: 1 error(s) decoding:

* error decoding 'processors': error reading configuration for "memory_limiter": 1 error(s) decoding:

* cannot parse 'limit_mib', -100 overflows uint

This unfortunately puts us in a bit of a bind. The data is still largely unstructured and untyped before the call to decode, so it would be pretty complicated to add validation before decoding. However, once data is decoded it's impossible to know if the value was originally a large uint value, or just negative.

We could possibly disable WeaklyTypedInput, but I would be much more concerned about breaking existing configurations with that change, as well as disallowing configurations we want to accept.

Impact
Here's the full list of uint* configuration options in the core and contrib distributions
Core:

.//extension/ballastextension/config.go:	SizeMiB uint64 `mapstructure:"size_mib"`
.//extension/ballastextension/config.go:	SizeInPercentage uint64 `mapstructure:"size_in_percentage"`
.//processor/memorylimiterprocessor/config.go:	MemoryLimitMiB uint32 `mapstructure:"limit_mib"`
.//processor/memorylimiterprocessor/config.go:	MemorySpikeLimitMiB uint32 `mapstructure:"spike_limit_mib"`
.//processor/memorylimiterprocessor/config.go:	MemoryLimitPercentage uint32 `mapstructure:"limit_percentage"`
.//processor/memorylimiterprocessor/config.go:	MemorySpikePercentage uint32 `mapstructure:"spike_limit_percentage"`
.//processor/batchprocessor/config.go:	SendBatchSize uint32 `mapstructure:"send_batch_size"`
.//processor/batchprocessor/config.go:	SendBatchMaxSize uint32 `mapstructure:"send_batch_max_size"`
.//processor/batchprocessor/config.go:	MetadataCardinalityLimit uint32 `mapstructure:"metadata_cardinality_limit"`

Contrib:

../opentelemetry-collector-contrib//processor/tailsamplingprocessor/config.go:	NumTraces uint64 `mapstructure:"num_traces"`
../opentelemetry-collector-contrib//processor/tailsamplingprocessor/config.go:	ExpectedNewTracesPerSec uint64 `mapstructure:"expected_new_traces_per_sec"`
../opentelemetry-collector-contrib//processor/probabilisticsamplerprocessor/config.go:	HashSeed uint32 `mapstructure:"hash_seed"`
../opentelemetry-collector-contrib//exporter/splunkhecexporter/config.go:	MaxContentLengthLogs uint `mapstructure:"max_content_length_logs"`
../opentelemetry-collector-contrib//exporter/splunkhecexporter/config.go:	MaxContentLengthMetrics uint `mapstructure:"max_content_length_metrics"`
../opentelemetry-collector-contrib//exporter/splunkhecexporter/config.go:	MaxContentLengthTraces uint `mapstructure:"max_content_length_traces"`
../opentelemetry-collector-contrib//exporter/splunkhecexporter/config.go:	MaxEventSize uint `mapstructure:"max_event_size"`
../opentelemetry-collector-contrib//exporter/pulsarexporter/config.go:	MaxReconnectToBroker            *uint            `mapstructure:"max_reconnect_broker"`
../opentelemetry-collector-contrib//exporter/pulsarexporter/config.go:	BatchingMaxMessages             uint             `mapstructure:"batching_max_messages"`
../opentelemetry-collector-contrib//exporter/pulsarexporter/config.go:	BatchingMaxSize                 uint             `mapstructure:"batching_max_size"`
../opentelemetry-collector-contrib//exporter/sapmexporter/config.go:	NumWorkers uint `mapstructure:"num_workers"`
../opentelemetry-collector-contrib//exporter/sapmexporter/config.go:	MaxConnections uint `mapstructure:"max_connections"`
../opentelemetry-collector-contrib//exporter/clickhouseexporter/config.go:	TTLDays uint `mapstructure:"ttl_days"`
../opentelemetry-collector-contrib//exporter/datadogexporter/config.go:	issueNumber uint
../opentelemetry-collector-contrib//pkg/stanza/adapter/config.go:	maxBatchSize  uint
../opentelemetry-collector-contrib//pkg/stanza/operator/input/namedpipe/config.go:	Permissions uint32       `mapstructure:"mode"`

I'm not sure the impact of this bug justifies changing WeaklyTypedInput to false. I'll do some more digging to see if any valid configurations would be considered invalid by this change, but even then, I'm not sure it's worth a breaking change and potentially breaking valid configurations to handle this case. Input is welcome though, I don't have much experience here.

@crobert-1
Copy link
Member

A potential workaround is to simply move all configuration uint values to int or int64. This would allow us to validate the values properly to make sure there aren't negatives. However, this would be pretty poor best-practices for a lot of types (using signed types for unsigned information), and I don't think it's a great idea to change variable types as a workaround for a third party dependency's functionality, but I'll defer to others.

@dmitryax
Copy link
Member Author

We have a few decode hooks

DecodeHook: mapstructure.ComposeDecodeHookFunc(
. I think we can add another hook for this validation and avoid flipping WeaklyTypedInput

@crobert-1
Copy link
Member

We have a few decode hooks

Thanks for calling that out, that's a much better way to handle this. I missed that in my review.

mx-psi pushed a commit that referenced this issue Apr 19, 2024
**Description:**
This adds a decode hook for unmarshalling negative integers into uint
types. This will now return an error instead of converting negative
values into large uint values.

**Link to tracking Issue:** <Issue number if applicable>
Fixes #9060 

**Testing:**
Added unit tests for confmap functionality, functional tests in memory
limiter processor (the original component this issue was filed against)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working help wanted Good issue for contributors to OpenTelemetry Service to pick up processor/memorylimiter
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants