hostmetrics receiver duplicates filesystem metrics on GKE #34512
Comments
Pinging code owners: See Adding Labels via Comments if you do not have permissions to add labels yourself.
By way of further debugging, I checked /proc/1/mountinfo (which is read by the imported shirou/gopsutil library), looking for one of the duplicated mountpoints.
The mounts are paths that are mounted to the same location but (I think) under different namespaces, presumably the same data mounted into two different pods. I think it would be valid to export the metrics only once per unique path (they should all have the same filesystem-level metrics). Equally, though, it's not obvious that metrics for these mounts are useful at all. I'm working around the issue locally by dropping metrics for these paths (there's a good chance I'd drop them anyway, as they aren't terribly useful), but fixing the duplication in the hostmetrics receiver seems fair.
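For anyone wanting to repeat this check, here is a rough sketch of how repeated mount points could be counted from mountinfo. It assumes the standard /proc/<pid>/mountinfo layout, where the mount point is the fifth whitespace-separated field. The function name `mountPointCounts` is made up for illustration; this is not how gopsutil or the receiver does it.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

// mountPointCounts parses text in the /proc/<pid>/mountinfo format and
// returns how many entries share each mount point. The mount point is
// assumed to be the fifth whitespace-separated field of every line.
func mountPointCounts(mountinfo string) map[string]int {
	counts := make(map[string]int)
	sc := bufio.NewScanner(strings.NewReader(mountinfo))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) < 5 {
			continue // skip blank or malformed lines
		}
		counts[fields[4]]++
	}
	return counts
}

func main() {
	// Same file as inspected in the comment above; reading it may need
	// elevated privileges depending on the environment.
	data, err := os.ReadFile("/proc/1/mountinfo")
	if err != nil {
		fmt.Fprintln(os.Stderr, "read mountinfo:", err)
		os.Exit(1)
	}
	// Report only the mount points that are listed more than once.
	for mountPoint, n := range mountPointCounts(string(data)) {
		if n > 1 {
			fmt.Printf("%s appears %d times\n", mountPoint, n)
		}
	}
}
```

Any mount point printed by this would be a candidate for the duplicated metrics described in the issue.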
Mountpoints can be reported multiple times for each mount into a namespace. This causes duplicate metrics which causes issues with some exporters. Each instance of the mountpoint will have identical metrics, so it is safe to ignore repeated mountpoints. Closes open-telemetry#34512
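The idea of ignoring repeated mountpoints can be sketched roughly as below. This is only an illustration, not the actual patch: the `partition` type is a hypothetical stand-in with a few fields similar to those of gopsutil's `disk.PartitionStat`, and `dedupeByMountpoint` is an invented helper name.

```go
package main

import "fmt"

// partition is a hypothetical stand-in for a mounted filesystem entry.
// It is not the type used by the receiver.
type partition struct {
	Device     string
	Mountpoint string
	Fstype     string
}

// dedupeByMountpoint keeps the first entry seen for each mount point and
// drops later entries that repeat it, preserving the input order.
func dedupeByMountpoint(parts []partition) []partition {
	seen := make(map[string]struct{}, len(parts))
	out := make([]partition, 0, len(parts))
	for _, p := range parts {
		if _, dup := seen[p.Mountpoint]; dup {
			continue // already recorded this mount point
		}
		seen[p.Mountpoint] = struct{}{}
		out = append(out, p)
	}
	return out
}

func main() {
	// Illustrative input only; the device names are made up.
	parts := []partition{
		{Device: "/dev/sda1", Mountpoint: "/var/lib/kubelet", Fstype: "ext4"},
		{Device: "/dev/sda1", Mountpoint: "/var/lib/kubelet", Fstype: "ext4"},
		{Device: "tmpfs", Mountpoint: "/run", Fstype: "tmpfs"},
	}
	for _, p := range dedupeByMountpoint(parts) {
		fmt.Println(p.Mountpoint, p.Device, p.Fstype)
	}
}
```

Keeping the first occurrence relies on the point made above that every instance of a mountpoint carries identical metrics, so which copy survives should not matter.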
…elemetry#34635) Mountpoints can be reported multiple times for each mount into a namespace. This causes duplicate metrics which causes issues with some exporters. Each instance of the mountpoint will have identical metrics, so it is safe to ignore repeated mountpoints. Closes open-telemetry#34512
Component(s)
receiver/hostmetrics
What happened?
Description
When running in GKE, system.filesystem.inodes.usage and system.filesystem.usage duplicate metrics for mountpoint=/home/kubernetes/containerized_mounter/rootfs/var/lib/kubelet, along with other pod-specific mountpoints under:
/home/kubernetes/containerized_mounter/rootfs/
/var/lib/kubelet/pods/
/var/lib/kubelet/plugins/
Not all pods have the duplicated data; it appears to be more prevalent on pods that are using CSI plugins.
Steps to Reproduce
Expected Result
Metrics should be collected without duplicates.
Actual Result
One of the detected mountpoints appears twice in the metrics. This then causes issues when metrics are passed to external metrics providers like Google.
When coupled with the googlemanagedprometheus exporter, we get the following error:
Collector version
otelcol-contrib version 0.105.0
Environment information
Environment
OS: google container OS
Compiler: official docker container image ghcr.io/open-telemetry/opentelemetry-collector-releases/opentelemetry-collector-contrib@sha256:3ff721e65733a9c2d94e81cfb350e76f1cd218964d5608848e2e73293ea88114
OpenTelemetry Collector configuration
Log output
See "what happened"
Additional context
No response