Kong 3.7.1 Exponential memory growth over time. #13553
Comments
Hi @muh-kamran!
Sharing this data will allow us to create a similar environment to yours and potentially replicate this issue.
We are using Kong both as an ingress controller and as a gateway. For the gateway, we push declarative configuration through the Admin API. The issue occurs on both, but it is much more aggressive on the Kong ingress, so let me share the details of the ingress. Helm chart: https://github.com/Kong/charts/tree/main/charts/kong
One thing to note is that this configuration has an impact on memory:
With enable_reverse_sync: true and PROXY_SYNC_SECONDS: 30, the proxy gets OOM killed within a few hours. With enable_reverse_sync: false and PROXY_SYNC_SECONDS: 600 or so, the memory grows slowly and the proxy gets OOM killed after days. This could be a memory leak triggered by configuration pushes to the Admin API. Plugins enabled:
services: 70
We found a similar issue in Kong 3.0, triggered just by frequent GET requests to the Admin API, but it was fixed in 3.2 (#10782).
Hi @muh-kamran, we need some detailed information to help us identify where the memory consumption is happening.
Or any other information that you think would be helpful for troubleshooting.
The growth here doesn't look exponential to me, but quite linear. Also, at the end it seems to drop (is that the OOM kill?). Am I correct? The OOM kill seems to happen pretty close to 1.1GB, whereas on 3.3 you got somewhere around ~1GB? Or do you mean that the hikes in the graph appear to double each time?
I'll provide the requested details. However, before doing so, I was testing it on a local kind cluster and observed the same pattern with a single ingress object. Here are the steps to reproduce:
curl localhost:8001/status
cat /proc/worker1/maps (there are two workers)
pmap -d 1347
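To make a pattern like this easier to compare across runs, the resident memory of each nginx worker can be sampled periodically. The following is a minimal sketch, not part of the original report: it assumes a Linux host where /proc/<pid>/status is readable, and the function names (parse_vm_rss_kib, read_rss_kib, sample_rss) are hypothetical helpers introduced here for illustration.

```python
import re
import time
from pathlib import Path
from typing import Iterable, Optional


def parse_vm_rss_kib(status_text: str) -> Optional[int]:
    """Extract the VmRSS value (in KiB) from /proc/<pid>/status text."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, flags=re.MULTILINE)
    return int(match.group(1)) if match else None


def read_rss_kib(pid: int) -> Optional[int]:
    """Read the resident set size of a process; None if it cannot be read."""
    try:
        return parse_vm_rss_kib(Path(f"/proc/{pid}/status").read_text())
    except OSError:
        # The process has exited or /proc is unavailable on this host.
        return None


def sample_rss(pids: Iterable[int], interval_s: float = 60.0, samples: int = 10) -> None:
    """Print one timestamped RSS line per PID for each sampling round."""
    pids = list(pids)
    for round_index in range(samples):
        stamp = time.strftime("%Y-%m-%dT%H:%M:%S")
        for pid in pids:
            print(f"{stamp} pid={pid} rss_kib={read_rss_kib(pid)}")
        if round_index < samples - 1:
            time.sleep(interval_s)
```

For example, sample_rss([1347], interval_s=60, samples=10) would log the worker inspected with pmap above once a minute for ten minutes; the second worker's PID can be added to the list.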
Hi @nowNick, I was wondering if you’ve identified the issue or if it’s still unknown.
Hi @muh-kamran, this is still under investigation.
We're seeing the same issue, albeit with smaller leaks of roughly 2-22MB for every one of our calls through the Kong gateway. We're on 3.6.1, using the grpc plugin and a custom grpc-request-transformer. If we let the system run for a long time, it gets OOMKilled, though we only allocate 2Gi of memory and 2 CPUs. @nowNick, if you need extra logs, just let us know; we see the request-transformer being called constantly before it dies, like this:
Internal tracking code: KAG-5409
Just to add more information: last night I set up my system on my desktop machine using minikube with 8 CPUs and 16GB of memory, and installed only Kong, our instrumentation system and one additional pod. I took one memory snapshot before I went to bed and another just now. The system was idle all night. Kong version 3.6.1
From the information you provided in this comment (#13553), I suspect there may be a memory leak inside the nginx core. The memory usage of the Lua VM appears to be quite small.
Could you dump this segment of memory as strings? For example, you could use gdb like this:
If you're fortunate enough that a specific string is leaking, the information in that string can help us further deduce which module is causing the leak.
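Once a suspicious segment has been written to a file (for instance with gdb's dump memory command, using the address range shown in the process maps), the repeated strings in it can be ranked to see whether one value dominates. This is a rough sketch under those assumptions and is not the maintainer's original gdb command, which is not preserved in this thread; the helper names and the file path in the usage note are hypothetical.

```python
import re
from collections import Counter
from typing import List, Tuple


def extract_strings(data: bytes, min_len: int = 8) -> List[str]:
    """Return runs of printable ASCII at least min_len bytes long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [chunk.decode("ascii") for chunk in pattern.findall(data)]


def top_repeated_strings(
    data: bytes, min_len: int = 8, top_n: int = 20
) -> List[Tuple[str, int]]:
    """Rank the most frequent printable strings in a memory dump."""
    return Counter(extract_strings(data, min_len)).most_common(top_n)
```

A dump could then be inspected with something like top_repeated_strings(open("/tmp/segment.bin", "rb").read()). A string that appears many thousands of times, such as a route name or a log message, would hint at which module keeps allocating it.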
Is there an existing issue for this?
Kong version ($ kong version)
3.7.1
Current Behavior
We recently upgraded from 3.3 to 3.7.1. Over the last 8 days, the memory of the proxy grew from 550MB to 1.03GB, and it eventually got OOM killed. We did not see this issue on 3.3.
Nothing changed on our end, just the version upgrade.
Please inform me if you need more details. Additionally, I would appreciate any recommendations for debugging steps, areas to investigate, or configuration changes that have successfully addressed similar issues for you.
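As a quick sanity check on the figures above, the average growth rate can be worked out from the two reported data points. The sketch below assumes 1.03GB is treated as 1030MB. Two points alone cannot tell linear growth from exponential growth, so both readings are computed; the function names are hypothetical.

```python
import math


def linear_rate_mb_per_day(start_mb: float, end_mb: float, days: float) -> float:
    """Average growth per day if memory grows by a constant amount."""
    return (end_mb - start_mb) / days


def doubling_time_days(start_mb: float, end_mb: float, days: float) -> float:
    """Doubling time if memory grows by a constant factor instead."""
    return days * math.log(2) / math.log(end_mb / start_mb)


rate = linear_rate_mb_per_day(550, 1030, 8)
doubling = doubling_time_days(550, 1030, 8)
print(f"linear reading: {rate:.1f} MB/day")
print(f"exponential reading: doubles every {doubling:.1f} days")
```

Under the linear reading this works out to about 60MB per day. Additional samples between the two endpoints would be needed to decide which model fits the graph.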
Expected Behavior
Kong 3.3 is much more stable:
Steps To Reproduce
No response
Anything else?
The only error message I see in the logs is:
[error] 1437#0: *582289 [lua] http.lua:962: request_uri(): closed, context: ngx.timer