metric_version = 1 memory leak #9821
Please help to add the following code to the v1 collector.go Add function. You can reference the v2 collector's Add function:

```go
// Expire metrics; doing this on Add ensures metrics are removed even if no
// one is querying the data.
c.Expire(time.Now(), c.ExpirationInterval)
```
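For reference, here is a minimal self-contained sketch of the pattern being requested; the store type, field names, and functions below are hypothetical stand-ins, not the real telegraf collector. Calling expire on every Add bounds memory even when nothing ever scrapes the endpoint:

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// expiringStore is a simplified stand-in for the prometheus_client v1
// collector: last value per series name, plus the time it was added.
type expiringStore struct {
	mu                 sync.Mutex
	expirationInterval time.Duration
	values             map[string]float64
	addedAt            map[string]time.Time
}

// expire drops entries older than the expiration interval.
// The caller must hold s.mu.
func (s *expiringStore) expire(now time.Time) {
	for name, t := range s.addedAt {
		if now.Sub(t) >= s.expirationInterval {
			delete(s.values, name)
			delete(s.addedAt, name)
		}
	}
}

// Add stores a sample and expires stale entries on every write, so
// memory stays bounded even if Gather is never called.
func (s *expiringStore) Add(name string, value float64) {
	s.mu.Lock()
	defer s.mu.Unlock()
	now := time.Now()
	s.values[name] = value
	s.addedAt[name] = now
	s.expire(now) // without this call, unscraped series accumulate forever
}

// Gather snapshots the live entries, expiring first; in v1 today this
// is the only place expiration happens.
func (s *expiringStore) Gather() map[string]float64 {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.expire(time.Now())
	out := make(map[string]float64, len(s.values))
	for k, v := range s.values {
		out[k] = v
	}
	return out
}

func main() {
	s := &expiringStore{
		expirationInterval: 20 * time.Second,
		values:             make(map[string]float64),
		addedAt:            make(map[string]time.Time),
	}
	s.Add("cpu_usage", 0.42)
	fmt.Println(s.Gather())
}
```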
Currently, we are only expiring data if someone is querying the data. This means that if data is continuously pushed but never gathered, memory usage can grow and grow. This change forces expiration during Add, similar to how v2 already handles this. fixes: influxdata#9821
Hi, Expire is currently only called during collection. I have put up #12160 with a fix to add the Expire call during Add as well. In 20-30 minutes, there will be some test artifacts attached to that PR. Could someone please download an artifact and confirm the lower memory usage? If you do run into any issues, can you provide me with an example config plus the metrics pushed to the prometheus input? Thanks!
Relevant telegraf.conf:
```toml
[agent]
  interval = "1s"
  round_interval = false
  metric_batch_size = 800
  metric_buffer_limit = 1000
  collection_jitter = "0s"
  flush_interval = "1s"
  flush_jitter = "0s"
  hostname = ""
  omit_hostname = true
  debug = true
  quiet = false
  logfile = "logger"

[[outputs.prometheus_client]]
  listen = ":9276"
  path = "/metrics"
  expiration_interval = "20s"
  export_timestamp = false
  metric_version = 1

[[inputs.socket_listener]]
  service_address = "udp://:8094"
  read_buffer_size = "8MB"
  data_format = "prometheus"
```
System info:
telegraf-1.19.3_windows_amd64
Windows Server 2019 Datacenter
Steps to reproduce:
The input was about 800 metrics per second sent to the UDP socket_listener input; the prometheus_client output exposed them and they were forwarded, but the telegraf process memory exhibited linear growth, reaching 4 GB after running for an hour. A load-generator sketch is shown below.
Setting metric_version to 2 fixed this: the process memory remained at about 50 MB while running for a couple of hours.
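To reproduce the load described above, a sketch like the following can push roughly 800 prometheus-format samples per second at the UDP socket_listener; the address and the test_metric name are illustrative placeholders, not from the original report:

```go
package main

import (
	"fmt"
	"net"
	"time"
)

func main() {
	// Assumes telegraf's socket_listener is on udp://:8094 as configured above.
	conn, err := net.Dial("udp", "127.0.0.1:8094")
	if err != nil {
		panic(err)
	}
	defer conn.Close()

	for i := 0; ; i++ {
		// 800 distinct series per second, one Prometheus exposition-format
		// line per UDP datagram.
		for j := 0; j < 800; j++ {
			fmt.Fprintf(conn, "test_metric{series=\"%d\"} %d\n", j, i)
		}
		time.Sleep(time.Second)
	}
}
```

Watching the telegraf process RSS while this runs should show the reported linear growth with metric_version = 1 and flat usage with metric_version = 2.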