
CrashLoopBackOff in cartservice when fs.inotify.max_user_instances is set to a relatively low number #1310

Closed · aknot242 opened this issue Dec 21, 2023 · 7 comments · Fixed by #1312
Labels: bug (Something isn't working)

aknot242 commented Dec 21, 2023

Bug Report

Which version of the demo are you using? (Please provide either a specific commit hash or a specific release.)

Helm chart version: 0.26.0

Symptom

When deploying the opentelemetry-demo to Kubernetes using the Helm chart, the cartservice container starts a file watcher (the .NET host's reload-on-change configuration watcher, which uses inotify on Linux) so it can reload configuration when files change. On host systems where fs.inotify.max_user_instances is set to a relatively low value, the container will crash loop with a log message similar to the following:

```
Unhandled exception. System.IO.IOException: The configured user limit (128) on the number of inotify instances has been reached, or the per-process limit on the number of open file
```
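For reference, one way to surface this state on a cluster (the pod name below is illustrative, not from the original report):

```shell
# List pods and spot the crash-looping cartservice
kubectl get pods | grep cartservice

# Inspect the logs of the previously crashed container instance
kubectl logs <cartservice-pod-name> --previous
```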

This issue was originally posted to the Helm chart repo, and the reviewers asked that the issue be created here instead.

What is the expected behavior?

The cartservice container is running

What is the actual behavior?

The cartservice is failing with a state of CrashLoopBackOff.

Reproduce

1. Provision a Linux VM whose default value for fs.inotify.max_user_instances is relatively low (such as 128), or explicitly set it to a low number using something like:

   ```shell
   echo fs.inotify.max_user_instances=128 | sudo tee -a /etc/sysctl.conf && sudo sysctl -p
   ```

2. Install a version of Kubernetes that is supported by the OTEL Demo Helm chart.
3. Deploy the OTEL Demo using its Helm chart (an example install is sketched below).
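For completeness, a sketch of the last two steps using the published chart (the release name my-otel-demo is an assumption; the repo URL and chart name are from the open-telemetry Helm charts project):

```shell
# Add the OpenTelemetry Helm repository and deploy the demo
helm repo add open-telemetry https://open-telemetry.github.io/opentelemetry-helm-charts
helm install my-otel-demo open-telemetry/opentelemetry-demo

# Watch the cartservice pod enter CrashLoopBackOff on the low-inotify host
kubectl get pods --watch
```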

aknot242 (Author) commented

Potential fix/workaround:

Since containers deployed via Helm are typically not used for development, there may not be a need to watch for changing files at all. According to this, the watching can be disabled altogether via an environment variable:

```
DOTNET_HOSTBUILDER__RELOADCONFIGONCHANGE=false
```
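As a stopgap for an already-deployed release, without rebuilding the image, the variable could also be injected into the running Deployment. A hedged sketch; the Deployment name below is a guess based on default chart naming, not taken from this thread:

```shell
# Hypothetical Deployment name -- confirm with `kubectl get deployments`
kubectl set env deployment/my-otel-demo-cartservice \
  DOTNET_HOSTBUILDER__RELOADCONFIGONCHANGE=false
```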

julianocosta89 (Member) commented

I could reproduce the error, and the suggestion fixed it.
@aknot242 would you like to send the PR with the fix?

puckpuck (Contributor) commented

Should we build this directly into the Dockerfile or as part of docker-compose.yml? I'm leaning towards doing this in the Dockerfile itself.

aknot242 (Author) commented

If the fix is only implemented in docker-compose.yml, users of the Helm chart won't benefit from it. I would think it should be in the Dockerfile.

julianocosta89 (Member) commented

Setting it in the Dockerfile or directly in the code also sounds like a better approach to me.
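A minimal sketch of what the Dockerfile variant could look like (this is not the exact diff from #1312; the file path in the comment is an assumption):

```dockerfile
# src/cartservice/src/Dockerfile (runtime stage) -- illustrative path
# Disable the .NET host's reload-on-change configuration watcher so the
# service never creates an inotify instance at startup.
ENV DOTNET_HOSTBUILDER__RELOADCONFIGONCHANGE=false
```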

puckpuck (Contributor) commented

@aknot242 I don't have a great way to test this. Please build from my branch in #1312 and test on your cluster to validate that the fix works.

aknot242 (Author) commented Dec 22, 2023

@puckpuck fix validated. I built the container from your branch and deployed it to a cluster whose host had fs.inotify.max_user_instances = 128.

Deployed successfully; no CrashLoopBackOff.

Thanks!
