This repository has been archived by the owner on Nov 27, 2023. It is now read-only.

Bug: massive latency impact when using dnsname (workaround in comment 3) #55

Closed
dschier-wtd opened this issue Feb 5, 2021 · 13 comments

Comments

@dschier-wtd

dschier-wtd commented Feb 5, 2021

Hi,

Thanks for the very cool work and effort you are putting into podman. I have found some very odd behavior when using podman in combination with the dnsname plugin.

There seems to be a huge performance impact (responses roughly 150x slower) when container names are resolved through podman dnsname instead of using IPs or regular DNS servers.

Step by Step

  1. Create a test environment (in this case rootful)
$ sudo podman network create test01
/etc/cni/net.d/test01.conflist

$ sudo podman network inspect test01 | grep dns

                "domainName": "dns.podman",
                "type": "dnsname"

$ sudo podman container run -dt -P --name web01 --network test01 httpd

$ sudo podman container ls

CONTAINER ID  IMAGE                           COMMAND           CREATED         STATUS             PORTS                  NAMES
3261a7db67f6  docker.io/library/httpd:latest  httpd-foreground  13 seconds ago  Up 12 seconds ago  0.0.0.0:39323->80/tcp  web01
  2. Testing container -> host -> container communication via IP
$ sudo podman container run --rm --network test01 fedora:33 bash -c "time curl 192.168.178.106:39323"

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    45  100    45    0     0  45000      0 --:--:-- --:--:-- --:--:-- 45000
<html><body><h1>It works!</h1></body></html>

real	0m0.004s
user	0m0.001s
sys	0m0.002s
  3. Testing container -> container communication via IP
$ sudo podman inspect web01 | grep IPAddress
            "IPAddress": "",
                    "IPAddress": "10.89.0.11",

$ sudo podman container run --rm --network test01 fedora:33 bash -c "time curl 10.89.0.11"
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    45  100    45    0     0  45000      0 --:--:-- --:--:-- --:--:-- 45000
<html><body><h1>It works!</h1></body></html>

real	0m0.004s
user	0m0.000s
sys	0m0.004s
  4. Testing container -> host -> container communication via DNS
$ sudo podman container run fedora:33 bash -c "time curl nb01:39323"
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    45  100    45    0     0  22500      0 --:--:-- --:--:-- --:--:-- 22500
<html><body><h1>It works!</h1></body></html>

real	0m0.006s
user	0m0.001s
sys	0m0.004s
  5. Testing container -> container communication via dnsname
$ sudo podman container run --rm --network test01 fedora:33 bash -c "time curl web01"
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    45  100    45    0     0     54      0 --:--:-- --:--:-- --:--:--    54
<html><body><h1>It works!</h1></body></html>

real	0m0.835s
user	0m0.003s
sys	0m0.003s

As you can see, steps 1 to 4 look fine (real times of 4 to 6 ms), but step 5 takes 835 ms, which is 150x+ slower than the other examples. This is reproducible with all kinds of traffic as soon as dnsname name resolution is involved. Add this up across a chain like the one in a Nextcloud setup and you will see a huge impact:

user -> traefik -> nextcloud-web -> nextcloud-php  -> nextcloud-db
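
The "150x+" figure can be checked against the `real` times reported in the steps above. This is only a minimal sketch, assuming `awk` is available; `slowdown` is a hypothetical helper name and not part of podman or dnsname.

```shell
# Hypothetical helper: ratio of two wall-clock times (in seconds),
# rounded to the nearest integer.
slowdown() {
    # $1 = slow time, $2 = baseline time
    awk -v slow="$1" -v fast="$2" 'BEGIN { printf "%.0f\n", slow / fast }'
}

slowdown 0.835 0.006   # step 5 vs. step 4 (host name, host port)
slowdown 0.835 0.004   # step 5 vs. steps 2 and 3 (plain IP)
```

With the numbers from this report the ratio lands between roughly 139x and 209x, depending on which baseline is used.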

Additional information

$ podman --version
podman version 2.2.1

$ rpm -qa | grep podman
podman-2.2.1-1.fc33.x86_64
podman-docker-2.2.1-1.fc33.noarch
podman-plugins-2.2.1-1.fc33.x86_64

$ rpm -qa | grep dnsmasq
dnsmasq-2.83-1.fc33.x86_64

It would be awesome to get some insights here. Maybe I am doing it wrong? Are there additional parameters needed?

Please also feel free to reach out to me for any additional information.

@dschier-wtd
Author

Update: repeating the request several times inside the same container gives the same result:

$ sudo podman run -it --name client01 --network test01 fedora:33 bash
[root@534fd4a9ce91 /]# time curl web01
<html><body><h1>It works!</h1></body></html>

real	0m0.948s
user	0m0.003s
sys	0m0.011s
[root@534fd4a9ce91 /]# time curl web01
<html><body><h1>It works!</h1></body></html>

real	0m0.954s
user	0m0.007s
sys	0m0.004s
[root@534fd4a9ce91 /]# time curl web01
<html><body><h1>It works!</h1></body></html>

real	0m0.813s
user	0m0.006s
sys	0m0.005s

@dschier-wtd
Author

dschier-wtd commented Feb 6, 2021

Additional update / workaround:

Using the internal FQDN (so that no search domain is needed) avoids the issue:

[root@95b42f3e3572 /]# time curl web01
<html><body><h1>It works!</h1></body></html>

real	0m0.974s
user	0m0.006s
sys	0m0.006s

[root@95b42f3e3572 /]# time curl web01.dns.podman
<html><body><h1>It works!</h1></body></html>

real	0m0.008s
user	0m0.003s
sys	0m0.005s

For me, this is good enough, but it may be worth inspecting how dnsname/dnsmasq resolve search domains and prioritize them. Maybe resolution via the internet is tried first and times out. Not sure.
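
One way to narrow this down is to look at whether the container's `/etc/resolv.conf` carries a `search` entry for `dns.podman` at all. The snippet below is only a sketch: `has_search_domain` is a hypothetical helper, and the sample file content (including the nameserver address) is made up for illustration, not taken from a real container.

```shell
# Hypothetical helper: succeed if a resolv.conf-style file lists the
# given domain on a "search" line.
has_search_domain() {
    # $1 = path to the file, $2 = domain to look for
    awk -v d="$2" '
        $1 == "search" { for (i = 2; i <= NF; i++) if ($i == d) found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$1"
}

# Example with a made-up file that has no search line.
tmp=$(mktemp)
printf 'nameserver 10.89.0.1\n' > "$tmp"
if has_search_domain "$tmp" dns.podman; then
    echo "search domain present"
else
    echo "search domain missing"
fi
rm -f "$tmp"
```

To run the same check against a real container, the file can be dumped with `sudo podman container run --rm --network test01 fedora:33 cat /etc/resolv.conf` and saved to a local file first.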

@dschier-wtd changed the title from "Bug: massive latency impact when using dnsname" to "Bug: massive latency impact when using dnsname (workaround in comment 3)" on Feb 6, 2021
@dschier-wtd
Author

@baude Not sure whether this affects the docker-compose functionality of podman 3.0, but it may be worth a look. It is very common to define multiple networks in docker-compose and communicate via hostnames.

@Luap99
Member

Luap99 commented Feb 23, 2021

@daniel-wtd can you try adding --dns-search dns.podman to the podman run command?

@dschier-wtd
Author

Hi,

I started both of the containers with --dns-search dns.podman. Please find the results below. Looking good.

sudo podman container run --rm --network test01 --dns-search dns.podman fedora:33 bash -c "time curl web01"

real    0m0.005s
user    0m0.002s
sys     0m0.003s
sudo podman container run --rm --network test01 --dns-search dns.podman fedora:33 bash -c "time curl example.com"

real    0m0.227s
user    0m0.002s
sys     0m0.004s

@Luap99
Member

Luap99 commented Feb 24, 2021

OK, I think we should add this automatically when dnsname is used. To do so, dnsname has to add the DNS search domain to the CNI result, and podman has to read the search domain and add it to resolv.conf.
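
The resolv.conf side of that proposal can be illustrated with a small shell sketch. This is not how podman implements it; `add_search_domain` is a hypothetical helper that only shows the intended effect on a resolv.conf-style file, and the nameserver address in the example is made up.

```shell
# Hypothetical helper: make sure a resolv.conf-style file lists the given
# search domain, without adding it twice.
add_search_domain() {
    file=$1
    domain=$2
    if grep -q '^search[[:space:]]' "$file"; then
        # A search line exists: append the domain unless it is already listed.
        awk -v d="$domain" '
            $1 == "search" {
                present = 0
                for (i = 2; i <= NF; i++) if ($i == d) present = 1
                if (!present) $0 = $0 " " d
            }
            { print }
        ' "$file" > "$file.new" && mv "$file.new" "$file"
    else
        # No search line yet: add one.
        printf 'search %s\n' "$domain" >> "$file"
    fi
}

# Example with a made-up file.
tmp=$(mktemp)
printf 'nameserver 10.89.0.1\n' > "$tmp"
add_search_domain "$tmp" dns.podman
cat "$tmp"
rm -f "$tmp"
```

Until something like this happens automatically, passing `--dns-search dns.podman` to `podman run`, as tested above, is the manual way to get the search domain into the container.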

@dschier-wtd
Author

dschier-wtd commented Feb 24, 2021

Sounds like a plan. There may be situations like this:

networks

  • proxy-net
    dns=proxy.podman
  • app-net
    dns=app.podman
  • db-net
    dns=db.podman

containers

  • proxy01
    networks=proxy-net
  • app01
    networks=proxy-net, app-net
  • db01
    networks=app-net, db-net

And I am not sure whether the resolvers have limitations here (number of DNS search entries, number of DNS server entries).
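
For reference, the glibc resolv.conf(5) man page documents a limit of three `nameserver` entries, and a search list limit of six domains (256 characters in total) for glibc versions before 2.26; other resolvers may behave differently. Below is a rough sketch for counting what a container would actually see. `count_entries` is a hypothetical helper, and the file content is a made-up example of what app01 might get if both of its networks contributed an entry.

```shell
# Hypothetical helper: count search domains or nameserver lines in a
# resolv.conf-style file.
count_entries() {
    # $1 = path to the file, $2 = keyword ("search" or "nameserver")
    awk -v k="$2" '
        # For "search" only the last line counts, so overwrite instead of summing.
        $1 == k { if (k == "search") n = NF - 1; else n++ }
        END { print n + 0 }
    ' "$1"
}

# Made-up resolv.conf for app01 (attached to proxy-net and app-net).
tmp=$(mktemp)
printf 'search proxy.podman app.podman\nnameserver 10.89.0.1\nnameserver 10.89.1.1\n' > "$tmp"
echo "search domains: $(count_entries "$tmp" search)"
echo "nameservers:    $(count_entries "$tmp" nameserver)"
rm -f "$tmp"
```

With three networks per container at most, the example layout would stay within those documented limits, assuming one search domain and one nameserver per network.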

@Luap99
Member

Luap99 commented Feb 24, 2021

Note that dnsname currently only works for one attached network, see containers/podman#8399, containers/podman#9492 and #12

@Luap99
Member

Luap99 commented Feb 24, 2021

#57 and containers/podman#9501 should fix this

@rhatdan
Member

rhatdan commented Mar 8, 2021

@Luap99 can we close this issue now?

@Luap99
Member

Luap99 commented Mar 8, 2021

Yes

@rhatdan rhatdan closed this as completed Mar 9, 2021
@dschier-wtd
Author

Thanks a lot everybody :)

@akash0x53

Why does this issue occur? Is there any way to reproduce it?
