[opentelemetry-collector] Remove `memory_ballast` extension from default template and replace by `GOMEMLIMIT` #891
Comments
@mx-psi @dmitryax I think we should implement this via a feature-gate-like config. The first PR can add the option to use the Go setting instead of the extension, but disabled by default. After some time, we can enable it by default, but still allow users who want to use the extension to disable it. Then we'd wait until the extension is deprecated to remove the option. |
Have we done anything similar before on the Collector Helm chart? The plan sounds reasonable to me |
@TylerHelmuth should this be a usual configuration option or is there an alternative mechanism in the Helm chart for temporary toggles? |
We don't have anything like feature gates in the chart; we normally do it via a configuration option (or two). |
Question: what are we planning to do when request limits are not set? ATM we default the memory ballast to (https://github.com/open-telemetry/opentelemetry-helm-charts/blob/main/charts/opentelemetry-collector/templates/_config.tpl#L25C1-L27C1):
and requests / limits to empty:
I think the only thing we can do is not set the This would be a different behaviour from current one. |
I am personally fine with that, if you explicitly unset resource limits then you should be on your own to configure this |
I have started work on this |
How long should we wait before we enable the `GOMEMLIMIT` option by default? Is something like one month enough? |
@mx-psi is this helm chart feature critical for the decision on whether or not to deprecate the extension or has the decision already been made to deprecate it? |
I would want to validate the approach with the Helm chart before committing to deprecate the extension. I don't feel like I personally have enough knowledge about the Go garbage collector to ensure this replacement will be satisfactory for all use cases (it probably will, that's why I pushed for this, but I want to be sure before deprecating the extension). If anybody else feels like this is clear cut, then we don't have to wait for this, but personally I feel like this is safer. |
In that case I think we need to leave it around for as long as it takes to make a decision. We won't move forward with turning it always on until we're sure we are deprecating the extension. I can enable it on our internal use of the collector, but we'll need other examples as well. |
@JaredTan95 based on #912 it seems like you are testing this out. Would you mind sharing any findings about the setting (memory and CPU usage patterns before/after)? I also posted a call-out for more testing on a few channels on the CNCF Slack. Once we have at least two reports, I think we can proceed with the deprecation and enable this by default. |
We have been running Collectors using the Helm chart with `useGOMEMLIMIT` enabled for a few weeks on Datadog's infrastructure, with CPU usage similar to that with the memory ballast and lower memory consumption (because the memory ballast is no longer allocated). |
Can you share some stats/numbers on the improvements? |
We see similar stats at Honeycomb |
We switched to this (`useGOMEMLIMIT`) configuration because of an issue with So, I support it. |
@mx-psi I'd still like to see the extension officially deprecated in Core before we make this a default value in the helm chart. |
@JaredTan95 can you please share more details about memory ballast's compatibility issues with your arm architecture? It would be nice to post them in open-telemetry/opentelemetry-collector#8343 as well |
Based on the discussion on open-telemetry/opentelemetry-collector#8343 and this issue, I filed open-telemetry/opentelemetry-collector#8803 to deprecate the memory ballast extension. |
I'd prefer the next step before deprecating the |
I am fine with that, although I think @TylerHelmuth wanted to go the other way around 😄 Up to the Helm chart maintainers I guess :) |
Ya, I was imagining that after the initial testing the helm chart would follow the guidance from Core. My thought being that I didn't want to disrupt users until the component was actually deprecated. But we could enable it by default and use helm chart users as testers (unless they switch it back to false). |
That's what I thought. I wanted to gather as much feedback as possible before the deprecation. I'm thinking of |
I agree! 👏 |
The compatibility problem we encountered was only tested on a Chinese operating system (Kylin OS), so I think it has little reference value for users of international operating systems, and I did not post it :-P |
**Description:** Based on user reports on open-telemetry/opentelemetry-helm-charts/issues/891 and the discussion on #8343, we can deprecate the memory ballast extension in favor of using `GOMEMLIMIT`. This PR:

- Deprecates the memory ballast extension in the README
- Removes references to the memory ballast extension on docs
- Updates k8s example to use `GOMEMLIMIT` with the same approach as in the Helm chart (80% of memory limit)
- Deprecates the memory ballast extension Go module

Once this PR is accepted, open-telemetry/opentelemetry-helm-charts/issues/891 can move ahead with enabling `useGOMEMLIMIT` by default on the Helm chart. Other issues will be opened for opentelemetry.io, the OpenTelemetry Operator and other parts of the OpenTelemetry project to remove references to this extension once the PR is merged. No explicit timeline is given for removal of the extension.

**Link to tracking Issue:** Updates #8343

---------

Co-authored-by: Bogdan Drutu <bogdandrutu@gmail.com>
open-telemetry/opentelemetry-collector/pull/8803 got merged, we can revert before Jan 8th if we find issues |
We want to remove the `memory_ballast` extension in favor of the Go 1.19+ `GOMEMLIMIT` environment variable, as stated in open-telemetry/opentelemetry-collector/issues/8343. The first step to verify that we can effectively remove the extension is to change this on the OpenTelemetry Collector Helm chart.

We can replace the extension with a `GOMEMLIMIT` value that is something like 80-90% of the total memory limit for the Collector container. Once we release this change and wait some time to gather feedback, we can move on with deprecation of the memory ballast extension.
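The arithmetic behind "80-90% of the memory limit" can be sketched in Go as follows. This is an illustration under stated assumptions, not the Helm chart's template code: the helper `goMemLimitValue` and its parameters are hypothetical, and rounding down to whole MiB is a choice made here for readability. When no memory limit is configured, the helper reports that `GOMEMLIMIT` should be left unset, matching the behaviour discussed in the thread.

```go
package main

import "fmt"

// goMemLimitValue is a hypothetical helper, not taken from the Helm chart.
// It takes the container memory limit in bytes and a percentage (1-100) and
// returns a string usable as the GOMEMLIMIT value, rounded down to whole MiB.
// ok is false when no usable limit is configured, in which case GOMEMLIMIT
// should simply be left unset.
func goMemLimitValue(limitBytes, percent int64) (value string, ok bool) {
	if limitBytes <= 0 || percent <= 0 || percent > 100 {
		return "", false
	}
	const mib = 1 << 20
	softLimit := limitBytes * percent / 100
	if softLimit < mib {
		// Too small to express in whole MiB; treat as "do not set".
		return "", false
	}
	return fmt.Sprintf("%dMiB", softLimit/mib), true
}

func main() {
	limit := int64(2) << 30 // example input: a 2 GiB container memory limit
	for _, pct := range []int64{80, 90} {
		if v, ok := goMemLimitValue(limit, pct); ok {
			fmt.Printf("%d%% of limit: GOMEMLIMIT=%s\n", pct, v)
		}
	}
}
```

Leaving some headroom below the container limit matters because `GOMEMLIMIT` is a soft limit on memory managed by the Go runtime, so the process can still briefly exceed it.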