Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

File descriptor leak when using h2 #22

Closed
reynir opened this issue Jan 3, 2024 · 3 comments · Fixed by #23
Closed

File descriptor leak when using h2 #22

reynir opened this issue Jan 3, 2024 · 3 comments · Fixed by #23
Assignees

Comments

@reynir
Copy link
Contributor

reynir commented Jan 3, 2024

I have an application that queries on a regular interval an opam repository. Two instances run, one querying opam.ocaml.org and the other opam.robur.coop. The former is leaking file descriptors while the latter doesn't.

I need to verify, but running curl -v https://opam.{ocaml.org,robur.coop} curl at least uses h2 for opam.ocaml.org and http/1.1 for opam.robur.coop, and I suspect it is the same for http-lwt-client.

@reynir reynir self-assigned this Jan 3, 2024
@reynir
Copy link
Contributor Author

reynir commented Jan 3, 2024

Looking at the code I don't see where the file descriptor is closed. So it may just be opam.robur.coop being more eager to close client connections. I will look more into it tomorrow.

@reynir
Copy link
Contributor Author

reynir commented Jan 4, 2024

The socket is closed in Http_lwt_unix once the reader and writer loops signal they've exited.

Lwt.async (fun () ->
Lwt.join [read_loop_exited; write_loop_exited] >>= fun () ->
match socket with
| `Plain socket ->
if Lwt_unix.state socket <> Lwt_unix.Closed then
Lwt.catch
(fun () -> Lwt_unix.close socket)
(fun _exn -> Lwt.return_unit)
else
Lwt.return_unit
| `Tls t -> Tls_lwt.Unix.close t);

I wrote a crude test (that depends on brittle assumptions): https://github.com/reynir/http-lwt-client/tree/test-fd-leak

What I discovered is that

  • this seems to only occur with h2, and not https/1.1 from testing about half a dozen websites,
  • the read loop and write loop never signal they've exited for h2 (but they do for http/1.1)

@hannesm
Copy link
Contributor

hannesm commented Jan 5, 2024

In http-mirage-client, what we do is:

let single_request
    ~ctx ~alpn_protocol ?config cfg ~meth ~headers ?body uri f f_init =
  ...
  Mimic.resolve ctx >>? fun flow ->
  (match .. with
   | `HTTP2 -> single_h2_request
   | _ -> single_http_1_1_request) >>= fun r ->
  Mimic.close flow >>= fun () ->

And so, the last line does an explicit close of the flow.

With the same reasoning in mind, in http-lwt-client, we do a connect ... >>= fun fd -> <request>, and could after the request explicitly close the fd. In my brain, the best resource management is "the code allocating a resource (file descriptor) is responsible for clearing it up". Certainly there is the danger of double closes (also see https://discuss.ocaml.org/t/detect-use-after-close-of-unix-file-descriptors as related discussion). There's not much we can do to avoid these double close (apart from relying on http/af and h2 to properly close fd's (or our Http_lwt_unix module) -- but it looks like this is not the case :/).

@reynir reynir closed this as completed in #23 Jan 6, 2024
mseri pushed a commit to ocaml/opam-repository that referenced this issue Jan 6, 2024
CHANGES:

* Call shutdown in h2 to properly close http2 connections. This fixes a file
  descriptor leak reported in robur-coop/http-lwt-client#22 (robur-coop/http-lwt-client#23, @reynir @hannesm)
nberth pushed a commit to nberth/opam-repository that referenced this issue Jun 18, 2024
CHANGES:

* Call shutdown in h2 to properly close http2 connections. This fixes a file
  descriptor leak reported in robur-coop/http-lwt-client#22 (robur-coop/http-lwt-client#23, @reynir @hannesm)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging a pull request may close this issue.

2 participants