File descriptor leak caused by clients causing system to get down #377

Closed
tusharmakkar08 opened this issue Feb 19, 2019 · 6 comments

Comments

@tusharmakkar08
Contributor

I am facing this issue with prometheus-client==0.5.0 and Python 3.5.2. I am using the start_http_server function to start the server.

To be precise, the number of connections in the CLOSE_WAIT state keeps increasing until my server stops serving requests.

x@ip-10-0-0-0:~$ lsof -i | grep 8002
python 10601 x 3u IPv4 78413488 0t0 TCP *:8002 (LISTEN)
python 10601 x 55u IPv4 82060563 0t0 TCP ip-10-0-0-0.ap-south-1.compute.internal:8002->ip-172-00-00-00.ap-south-1.compute.internal:45114 (CLOSE_WAIT)
python 10601 x 80u IPv4 82089551 0t0 TCP ip-10-0-0-0.ap-south-1.compute.internal:8002->ip-172-00-00-00.ap-south-1.compute.internal:45976 (CLOSE_WAIT)
python 10601 x 81u IPv4 82088475 0t0 TCP ip-10-0-0-0.ap-south-1.compute.internal:8002->ip-172-00-00-00.ap-south-1.compute.internal:46006 (CLOSE_WAIT)
python 10601 x 82u IPv4 82089875 0t0 TCP ip-10-0-0-0.ap-south-1.compute.internal:8002->ip-172-00-00-00.ap-south-1.compute.internal:46034 (CLOSE_WAIT)
python 10601 x 83u IPv4 82092200 0t0 TCP ip-10-0-0-0.ap-south-1.compute.internal:8002->ip-172-00-00-00.ap-south-1.compute.internal:46064 (CLOSE_WAIT)
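
To track how this grows over time, the listing above can be summarized per TCP state. The following is only a minimal sketch: the helper name `count_tcp_states` is hypothetical, and it assumes the text has the same column layout as the `lsof -i` output shown here, with ports printed as numbers.

```python
from collections import Counter


def count_tcp_states(lsof_output, port):
    """Count TCP states for sockets whose local endpoint uses `port`.

    Assumes text in the layout printed by `lsof -i`, where the last
    field is the state in parentheses and the field before it is the
    address ("local->remote" for connections, "*:port" for listeners).
    """
    states = Counter()
    for line in lsof_output.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        state = fields[-1]
        # Skip headers and rows without a "(STATE)" column, e.g. UDP sockets.
        if not (state.startswith("(") and state.endswith(")")):
            continue
        # Keep only the local side of "local->remote".
        local = fields[-2].split("->")[0]
        if local.rsplit(":", 1)[-1] == str(port):
            states[state.strip("()")] += 1
    return states
```

Feeding it the output of `lsof -i -n -P` (which prints numeric hosts and ports) at a regular interval and logging `count_tcp_states(output, 8002)["CLOSE_WAIT"]` would show whether the count climbs steadily or jumps at specific moments.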

This issue is similar to prometheus/jmx_exporter#327

Attaching an strace for the issue here: https://ideone.com/0VORDa

@brian-brazil
Contributor

I don't see anything there related to Prometheus. Why do you think the issue is with the python client?

That's also not a complete strace.

@tusharmakkar08
Contributor Author

I think the issue is caused by the start_http_server functionality in the client. When I serve the metrics through a third-party HTTP server (WSGI) instead, the issue does not occur.
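
For readers who want to try the WSGI route, here is a rough sketch of serving a WSGI app on a background thread. It makes several assumptions: `placeholder_metrics_app` and `serve_in_background` are hypothetical names, the placeholder stands in for the app that `prometheus_client.make_wsgi_app()` would return, and the standard-library `wsgiref` server is used only to keep the example self-contained. It is not the third-party server referred to above, which the thread does not name.

```python
import threading
from wsgiref.simple_server import WSGIRequestHandler, make_server


def placeholder_metrics_app(environ, start_response):
    # Hypothetical stand-in. With prometheus_client installed, the app
    # would instead come from prometheus_client.make_wsgi_app().
    body = b"# metrics would be rendered here\n"
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


class QuietHandler(WSGIRequestHandler):
    def log_message(self, format, *args):
        # Suppress the per-request log line written to stderr.
        pass


def serve_in_background(app, host="127.0.0.1", port=0):
    """Serve `app` on a daemon thread; port=0 lets the OS pick a free port."""
    server = make_server(host, port, app, handler_class=QuietHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server
```

The returned server exposes the chosen port as `server.server_port`, and `server.shutdown()` followed by `server.server_close()` stops it. A production setup would more likely hand the same WSGI app to a dedicated WSGI server.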

@brian-brazil
Contributor

Can you get a full strace of one of those TCP connections?

@tusharmakkar08
Contributor Author

The thing is, I am currently unable to reproduce the issue on demand; it happens about once every 2 days. I will capture an strace of those connections when the issue occurs again.

@brian-brazil
Contributor

Did you manage to get more debug information?

@brian-brazil
Contributor

Closing as stale. Given that there are no other reports like this, I presume the issue is elsewhere.
