Load-balancing exporter k8s resolver continuously invokes the OnUpdate() command in the handler #35658
Tarmander added the bug (Something isn't working) and needs triage (New item requiring triage) labels on Oct 7, 2024
jpkrohling pushed a commit that referenced this issue on Nov 27, 2024:

…e update events (#36505)

#### Description

The load balancing exporter's k8sresolver was not handling update events properly. The `callback` function was being executed after cleanup of old endpoints and also after adding new endpoints. This causes exporter churn when an update event's old and new endpoint lists share elements. See the [documentation](https://pkg.go.dev/k8s.io/client-go/tools/cache#ResourceEventHandler) for examples where the state might change but the IP addresses would not, including the regular re-list events that might have zero changes.

#### Link to tracking issue

Fixes #35658

May be related to #35810 as well.

#### Testing

Added tests for no-change onChange call.
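To illustrate the idea behind the fix, here is a minimal Go sketch of an update handler that diffs the old and new address lists and invokes the callback at most once, and only when the tracked set actually changes. This is not the code from the commit: `endpointTracker`, `newEndpointTracker`, `update`, and the sample IP addresses are hypothetical names chosen for illustration, and the sketch uses only the standard library rather than the client-go types.

```go
package main

import (
	"fmt"
	"sort"
)

// endpointTracker is a hypothetical stand-in for the resolver's handler state.
type endpointTracker struct {
	endpoints map[string]struct{}
	onChange  func(current []string) // invoked only when the set changes
}

func newEndpointTracker(onChange func([]string)) *endpointTracker {
	return &endpointTracker{endpoints: map[string]struct{}{}, onChange: onChange}
}

// update applies one update event carrying the old and new address lists.
// It reports whether the tracked set changed.
func (t *endpointTracker) update(oldAddrs, newAddrs []string) bool {
	next := make(map[string]struct{}, len(newAddrs))
	for _, a := range newAddrs {
		next[a] = struct{}{}
	}

	changed := false

	// Remove only the addresses that are gone from the new list.
	for _, a := range oldAddrs {
		if _, keep := next[a]; keep {
			continue
		}
		if _, tracked := t.endpoints[a]; tracked {
			delete(t.endpoints, a)
			changed = true
		}
	}

	// Add only the addresses that are not tracked yet.
	for a := range next {
		if _, tracked := t.endpoints[a]; !tracked {
			t.endpoints[a] = struct{}{}
			changed = true
		}
	}

	// A single callback per event, skipped entirely for no-op updates.
	if changed {
		t.onChange(t.sorted())
	}
	return changed
}

// sorted returns the tracked addresses in a stable order.
func (t *endpointTracker) sorted() []string {
	out := make([]string, 0, len(t.endpoints))
	for a := range t.endpoints {
		out = append(out, a)
	}
	sort.Strings(out)
	return out
}

func main() {
	calls := 0
	tracker := newEndpointTracker(func(cur []string) {
		calls++
		fmt.Println("endpoints changed:", cur)
	})

	addrs := []string{"10.0.0.1", "10.0.0.2"}
	tracker.update(nil, addrs)                                // initial add
	tracker.update(addrs, addrs)                              // re-list with no change
	tracker.update(addrs, []string{"10.0.0.2", "10.0.0.3"})   // partial overlap

	fmt.Println("callback invocations:", calls) // prints "callback invocations: 2"
}
```

In this sketch the middle call models a re-list event whose old and new lists are identical, so the callback is skipped and no exporters would be rebuilt. The third call shares one address between the lists and still triggers only a single callback.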
Component(s)
exporter/loadbalancingexporter
What happened?
Description
When configuring our load-balancing collector to target our backend collectors via the k8s resolver, we noticed that while the DNS resolution worked fine and the collectors received evenly distributed traffic, the load-balancer would consistently recycle the endpoints at a set cadence (around every 3 minutes). The endpoints would be unchanged.
We added some log statements to the k8s resolver/handler, and they revealed that the `OnUpdate()` function in the handler was being invoked. This would imply that some event was triggering the update, but `k get endpoints opentelemetry-global-gateway-collector --watch --output-watch-events=true` returned no events for several hours when run manually.
The net result was no actual changes to the service endpoints, but the exporter would consistently dispose of its existing exporters and construct new ones.
Steps to Reproduce
Configure the k8s resolver to point to a service representing
Expected Result
The `OnUpdate()` call in the k8s handler only runs when updates occur in the service endpoints pointed to by the k8s resolver.
Actual Result
`OnUpdate()` is invoked at a recurring frequency of around every 3 minutes, regardless of changes to the service it points to.
Collector version
v0.105.0
Environment information
Environment
OS: Ubuntu 22.04
Compiler: go1.22.6
OpenTelemetry Collector configuration
Log output
Sample Log Output:
Additional context
No response