IPv6 support #5

Closed
gilou opened this issue Jan 19, 2022 · 14 comments

Comments

@gilou

gilou commented Jan 19, 2022

As mentioned in savonet/liquidsoap#1425, ocaml-cry seems to misbehave when it comes to IPv6.

With an explicit IPv6 address, this happens, as if it were trying to resolve it… great.

2022/01/18 23:29:49 [ice:3] Connecting mount test for source@2001:41d0:d:2147::106...                              
2022/01/18 23:29:49 [ice:2] Connection failed: Not_found                                                               
2022/01/18 23:29:49 [ice:3] Will try again in 3.00 sec.

If the name resolves to both IPv4 and IPv6, it ignores the IPv6 address, which is … not what is supposed to happen.

I can provide an IPv6 reachable client/server if it helps.

I'm vaguely guessing something in connect or in unix_transport is fishy, but I'm reaaaally not good at OCaml :P

@gilou
Author

gilou commented Jan 19, 2022

@jcourreges is helping patch this. It seems the whole use of sockaddr through gethostbyname is unfortunate, and ocaml-ssl might also be responsible for bad behaviour here.

I incorporated his work so far on https://github.com/gilou/ocaml-cry/tree/ipv6 (diff @ master...gilou:ipv6)

@toots
Member

toots commented Jan 20, 2022

Thanks! This seems to be going in the right direction. Keep us posted, I'll try to get to it when I have time.

@toots
Member

toots commented Jan 22, 2022

Fixed in #6!

@toots toots closed this as completed Jan 22, 2022
@gilou
Author

gilou commented Jan 6, 2024

OK, I know we've closed that, but … As I was playing around, I tested liq 2.2 / ocaml-cry 1.0.2 … so:

  • using a hostname, resolving to IPv4 only, or directly an IPv4, fine
  • using a hostname, resolving to IPv6 only, or directly an IPv6, fine
  • using a hostname offering A+AAAA, works, but selects IPv4 by default, when it shouldn't.

That is, take a dual-stack host (totally at random, e.g. stream.wolface.net) that has both A and AAAA records: output.icecast in liq connects over IPv4 from a dual-stack client. This can be tested easily using /etc/hosts, on Linux at least.

This is not blocking, but it's not exactly nice either, as on Linux, it should conform to getaddrinfo(3) settings in /etc/gai.conf, which should prioritize IPv6 by default…

@smimram smimram reopened this Jan 7, 2024
@jcourreges
Contributor

jcourreges commented Jan 7, 2024

Looks like a deliberate choice introduced in 6671b4a. I don't know the rationale behind it.

I agree that it's generally better for applications to honor gai.conf and similar mechanisms. Application-specific settings can then favor/require v4/v6.

The canonical pattern for using getaddrinfo(3) is a loop over the list of returned addresses, until one yields a valid socket. Why is it a problem for v4 to be preferred over v6, or conversely v6 over v4? Doesn't the code try the next addresses in case of failure? Or is it more about special cases, like IPv6 being broken (unusable, times out), or being cheaper and thus more interesting than IPv4 [1]?

Footnotes

  1. some hosting providers make you pay extra for IPv4
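
For reference, the loop described above can be sketched in OCaml roughly as follows. This is only an illustration using the standard Unix module, not the code in ocaml-cry; the function name connect_first is made up for the example, and there is no timeout handling.

```ocaml
(* Sketch of the getaddrinfo(3) loop. [connect_first] is a hypothetical
   name, not part of ocaml-cry. Requires the unix library. *)
let connect_first host service =
  (* Addresses come back in the order chosen by the system resolver
     (on Linux with glibc, this is where /etc/gai.conf applies). *)
  let addrs =
    Unix.getaddrinfo host service [Unix.AI_SOCKTYPE Unix.SOCK_STREAM]
  in
  let rec loop last_error = function
    | [] -> (
        (* Nothing worked: re-raise the last error, or Not_found if the
           resolver returned no address at all. *)
        match last_error with
        | Some e -> raise e
        | None -> raise Not_found)
    | (ai : Unix.addr_info) :: rest -> (
        let sock =
          Unix.socket ~cloexec:true ai.Unix.ai_family ai.Unix.ai_socktype
            ai.Unix.ai_protocol
        in
        try
          Unix.connect sock ai.Unix.ai_addr;
          sock
        with Unix.Unix_error _ as e ->
          (* This address failed (e.g. ECONNREFUSED): close the socket
             and move on to the next address. *)
          Unix.close sock;
          loop (Some e) rest)
  in
  loop None addrs
```

With a loop like this, a refused connection on the first address family is not fatal as long as another address accepts the connection.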

@toots
Member

toots commented Jan 7, 2024

Well, this is why I try to be super verbose in my commit comments these days, but I guess I missed the mark here and now I have only vague ideas why.

What I seem to remember is that, specifically for localhost, the resolution would pick the ipv6 localhost address and either hang or fail.

I cannot reproduce at the moment, however. I think I'll revert the changes and will add a configuration option.
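
A configuration option of that kind could look roughly like the sketch below. The type prefer, its constructors and the function order_addresses are assumptions made up for illustration; the actual option in ocaml-cry may be named and shaped differently.

```ocaml
(* Hypothetical preference setting; not the library's actual API.
   Requires the unix library. *)
type prefer = System_default | Ipv4 | Ipv6

(* Reorder the resolver's results according to [prefer]. List.partition
   keeps the relative order inside each group, so the system ordering is
   preserved within a family and the other family stays as a fallback. *)
let order_addresses ~prefer (addrs : Unix.addr_info list) =
  let is_family fam (a : Unix.addr_info) = a.Unix.ai_family = fam in
  match prefer with
  | System_default -> addrs
  | Ipv4 ->
      let v4, rest = List.partition (is_family Unix.PF_INET) addrs in
      v4 @ rest
  | Ipv6 ->
      let v6, rest = List.partition (is_family Unix.PF_INET6) addrs in
      v6 @ rest
```

The point of the design is that System_default leaves the list untouched, so system-level settings such as gai.conf decide unless the user explicitly asks otherwise.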

@toots toots closed this as completed in c41bb44 Jan 7, 2024
@toots
Member

toots commented Jan 9, 2024

Ok I think I have a live example of the ipv4 preference issue: savonet/liquidsoap#3610 (comment)

@gilou
Author

gilou commented Jan 10, 2024

Well… not sure about that, if their machine has IPv6 connectivity, but can't reach the IPv6 servers… I'd say they have something else to fix (or it could call for a fallback). It's not good to bury that kind of decision in the last piece of the chain ;)

$ host omfm.ru
omfm.ru has address 90.156.228.82
omfm.ru has IPv6 address 2a03:6f00:8::2a49

Could be a lot of issues there…

@toots
Member

toots commented Jan 10, 2024

Well, I'm in the business of making things work for most users out of the box and giving alternative options for others, so I think that, for now, we'll prefer ipv4 until there is a clear reason to change it. 🙂

@jcourreges
Contributor

> Well… not sure about that, if their machine has IPv6 connectivity, but can't reach the IPv6 servers… I'd say they have something else to fix (or it could call for a fallback). It's not good to bury that kind of decision in the last piece of the chain ;)

I second this.

> $ host omfm.ru
> omfm.ru has address 90.156.228.82
> omfm.ru has IPv6 address 2a03:6f00:8::2a49
>
> Could be a lot of issues there…

IPv4:

PORT     STATE  SERVICE
22/tcp   open   ssh
80/tcp   open   http
443/tcp  open   https
1935/tcp open   rtmp
8000/tcp open   http-alt
8443/tcp open   https-alt
9999/tcp closed abyss

IPv6:

PORT     STATE  SERVICE
22/tcp   open   ssh
80/tcp   open   http
443/tcp  open   https
1935/tcp closed rtmp
8000/tcp closed http-alt
8443/tcp closed https-alt
9999/tcp closed abyss

There's definitely a difference between IPv4 and IPv6 on the server side; it's not specific to the client, and it's not specific to liquidsoap. However, the getaddrinfo loop should survive an ECONNREFUSED error, so something is wrong in the current code and it should be fixed. This is not IPv6-specific either: IPv4 would show the same problem.

@toots
Member

toots commented Jan 10, 2024

Well, I think the question we have to answer is: given the existence of both ipv4 and ipv6 address resolutions, which one is the most likely to succeed? It doesn't take a lot of research to figure out that it is in fact ipv4.

I have two examples here and you can see a long list here related to when node changed their default address resolution.

Then, there's also docker whose support for ipv6 is still recent and does not work on windows. Docker is used by a lot of our users too.

Looking at the code it is in fact trying all addresses in order:

  let rec connect_any ?bind_address ?timeout (addrs : Unix.addr_info list) =
    match addrs with
      | [] -> raise Not_found
      | [addr] ->
          (* Let a possible error bubble up *)
          connect_sockaddr ?bind_address ?timeout addr.ai_addr
      | addr :: tail -> (
          try connect_sockaddr ?bind_address ?timeout addr.ai_addr
          with _ -> connect_any ?bind_address ?timeout tail)
  in
  connect_any ?bind_address ?timeout (resolve_host ~prefer host port)

The error is actually raised during a Unix.write later on in the application.

I'm not sure what's causing an ECONNREFUSED to be thrown during a write. Here's the code trying to connect:

let connect_sockaddr ?bind_address ?timeout sockaddr =
  let domain = Unix.domain_of_sockaddr sockaddr in
  let socket = Unix.socket ~cloexec:true domain Unix.SOCK_STREAM 0 in
  (try
     match bind_address with
       | None -> ()
       | Some s -> Unix.bind socket (sockaddr_of_address s)
   with exn ->
     let bt = Printexc.get_raw_backtrace () in
     begin
       try Unix.close socket with _ -> ()
     end;
     Printexc.raise_with_backtrace exn bt);
  let do_timeout = timeout <> None in
  let check_timeout () =
    match timeout with
      | Some timeout ->
          (* Block in a select call for [timeout] seconds. *)
          let _, w, _ = select [] [socket] [] timeout in
          if w = [] then raise Timeout;
          Unix.clear_nonblock socket;
          socket
      | None -> assert false
  in
  let finish () =
    try
      if do_timeout then Unix.set_nonblock socket;
      Unix.connect socket sockaddr;
      if do_timeout then Unix.clear_nonblock socket;
      socket
    with
      | Unix.Unix_error (Unix.EINPROGRESS, _, _) -> check_timeout ()
      | Unix.Unix_error (Unix.EWOULDBLOCK, _, _) when Sys.os_type = "Win32" ->
          check_timeout ()
  in
  try finish ()
  with e ->
    let bt = Printexc.get_raw_backtrace () in
    begin
      try Unix.close socket with _ -> ()
    end;
    Printexc.raise_with_backtrace e bt

@toots
Member

toots commented Jan 10, 2024

Ok. I think I found the issue: we were not properly checking for errors when connecting in non-blocking mode: 8afc835
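
For readers following along, the general technique is sketched below. This is not the content of commit 8afc835, just an illustration of the usual pattern: after select reports the socket as writable, the pending socket error (SO_ERROR) still has to be read, because "writable" also covers a failed connection attempt. The function name connect_with_timeout and the local Timeout exception are made up for the example.

```ocaml
(* Illustrative sketch only. Requires the unix library. *)
exception Timeout

(* Connect [socket] to [sockaddr], waiting at most [timeout] seconds. *)
let connect_with_timeout socket sockaddr timeout =
  Unix.set_nonblock socket;
  (try Unix.connect socket sockaddr
   with Unix.Unix_error ((Unix.EINPROGRESS | Unix.EWOULDBLOCK), _, _) -> (
     (* The attempt is still in progress: wait until the socket becomes
        writable, which signals that the attempt has finished. *)
     let _, writable, _ = Unix.select [] [socket] [] timeout in
     if writable = [] then raise Timeout;
     (* Finished does not mean connected: fetch and clear SO_ERROR. *)
     match Unix.getsockopt_error socket with
     | None -> ()
     | Some err -> raise (Unix.Unix_error (err, "connect", ""))));
  Unix.clear_nonblock socket
```

Without the getsockopt_error check, a refused connection can go unnoticed at connect time and only surface later, for instance on the first write, which matches the symptom described earlier in the thread.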

@jcourreges
Contributor

> Ok. I think I found the issue: we were not properly checking for errors when connecting in non-blocking mode: 8afc835

Heh, we arrived at the same conclusion and the same code! :)

FWIW I had trouble confirming that checking the socket error improved things, since on OpenBSD I hit the "if w = [] then raise Timeout" case instead of being handed an unusable socket.

I'm definitely not convinced by the "IPv4 works better, let's favor it" argument; it certainly depends on the actual context. On some networks IPv4 is definitely worse than IPv6 (NAT that induces timeouts, encapsulation, etc.). Besides, it has a self-fulfilling quality, since forcefully neglecting IPv6 means that IPv6-specific issues are fixed more slowly. Here, respecting the system defaults helped fix an issue in the code, and uncovered some probable misconfiguration at some radio provider. ;)

Anyway, I'll stop the rant here, I'm starting to sound like a zealot. You're thinking about your users first, and I totally respect that. Cheers,

@toots
Member

toots commented Jan 11, 2024

No, that makes sense. I'm glad we got to the bottom of it; @gilou was right, there was something underlying it. I have a build reverting to the system default with the fix above, and I'll ask the user with the error to test it.
