
Docker Memory Input Usage Not Consistent with Docker Engine's stats #10640

Closed
mbentley opened this issue Feb 13, 2022 · 3 comments
Labels
area/nvidia bug unexpected problem or unintended behavior

Comments


mbentley commented Feb 13, 2022

Relevant telegraf.conf

# all other defaults are set
[[inputs.docker]]                         
  endpoint = "unix:///var/run/docker.sock"
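
The two deprecation warnings in the test output below come from the 'perdevice' and 'total' defaults rather than from this bug; a minimal sketch of the replacement settings those warnings point to (the include lists are my assumption of the intended equivalents, not part of the original config):

[[inputs.docker]]
  endpoint = "unix:///var/run/docker.sock"
  ## replacements for the deprecated 'perdevice'/'total' settings
  perdevice = false
  perdevice_include = ["cpu", "blkio", "network"]
  total = true
  total_include = ["cpu", "blkio", "network"]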

Logs from Telegraf

# telegraf -config /etc/telegraf/telegraf.conf -config-directory /etc/telegraf/telegraf.d --test --input-filter docker | grep plex
2022-02-13T16:51:23Z I! Starting Telegraf 1.21.3
2022-02-13T16:51:23Z I! Loaded inputs: docker
2022-02-13T16:51:23Z I! Loaded aggregators:
2022-02-13T16:51:23Z I! Loaded processors:
2022-02-13T16:51:23Z W! Outputs are not used in testing mode!
2022-02-13T16:51:23Z I! Tags enabled: host=athena
2022-02-13T16:51:23Z W! [inputs.docker] 'perdevice' setting is set to 'true' so 'blkio' and 'network' metrics will be collected. Please set it to 'false' and use 'perdevice_include' instead to control this behaviour as 'perdevice' will be deprecated
2022-02-13T16:51:23Z W! [inputs.docker] 'total' setting is set to 'false' so 'blkio' and 'network' metrics will not be collected. Please set it to 'true' and use 'total_include' instead to control this behaviour as 'total' will be deprecated
> docker_container_status,container_image=plexinc/pms-docker,container_name=plex,container_status=running,container_version=unknown,engine_host=athena,host=athena,server_version=20.10.12 container_id="64c8abf723d9f3b136e1ffa5841a0b6fa254228e948ced8020e3ef91d3aeba2b",exitcode=0i,finished_at=1644506302938896168i,oomkilled=false,pid=3897902i,started_at=1644507836533416653i,uptime_ns=263249512760295i 1644771086000000000
> docker_container_mem,container_image=plexinc/pms-docker,container_name=plex,container_status=running,container_version=unknown,engine_host=athena,host=athena,server_version=20.10.12 active_anon=540672i,active_file=10407936i,container_id="64c8abf723d9f3b136e1ffa5841a0b6fa254228e948ced8020e3ef91d3aeba2b",inactive_anon=147820544i,inactive_file=8379084800i,limit=8589934592i,max_usage=0i,pgfault=1887028803i,pgmajfault=858i,unevictable=0i,usage=8587341824i,usage_percent=99.96981620788574 1644771086000000000
> docker_container_cpu,container_image=plexinc/pms-docker,container_name=plex,container_status=running,container_version=unknown,cpu=cpu-total,engine_host=athena,host=athena,server_version=20.10.12 container_id="64c8abf723d9f3b136e1ffa5841a0b6fa254228e948ced8020e3ef91d3aeba2b",throttling_periods=1311336i,throttling_throttled_periods=45i,throttling_throttled_time=3417204000i,usage_in_kernelmode=8200527599000i,usage_in_usermode=13830553385000i,usage_percent=4.256972151898735,usage_system=12326356720000000i,usage_total=22031080985000i 1644771086000000000
> docker_container_net,container_image=plexinc/pms-docker,container_name=plex,container_status=running,container_version=unknown,engine_host=athena,host=athena,network=eth0,server_version=20.10.12 container_id="64c8abf723d9f3b136e1ffa5841a0b6fa254228e948ced8020e3ef91d3aeba2b",rx_bytes=796952777i,rx_dropped=0i,rx_errors=0i,rx_packets=5113303i,tx_bytes=45161602957i,tx_dropped=0i,tx_errors=0i,tx_packets=2581821i 1644771086000000000
> docker_container_blkio,container_image=plexinc/pms-docker,container_name=plex,container_status=running,container_version=unknown,device=230:192,engine_host=athena,host=athena,server_version=20.10.12 container_id="64c8abf723d9f3b136e1ffa5841a0b6fa254228e948ced8020e3ef91d3aeba2b",io_service_bytes_recursive_read=11804672i,io_service_bytes_recursive_write=12212419072i 1644771086000000000
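
For reference, the usage_percent above is simply the raw cgroup usage divided by the limit; a quick arithmetic check with the values copied from the docker_container_mem line:

# awk 'BEGIN { printf "%.4f\n", 8587341824 / 8589934592 * 100 }'
99.9698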

System info

Telegraf 1.21.3, Debian 11, Docker 20.10.12

Docker

This isn't specific to this one container, but here is an example of how I am running the container:

docker run -d --restart=always --name plex \
  --runtime=nvidia \
  --gpus all,capabilities=video \
  --no-healthcheck \
  --hostname plex \
  --network macvlan1 \
  --ip 192.168.2.206 \
  -e TZ="US/Eastern" \
  -e PLEX_CLAIM="" \
  -e PLEX_UID=500 \
  -e PLEX_GID=1501 \
  -e CUDA_DRIVER_CAPABILITIES="compute,video,utility" \
  -e NVIDIA_DRIVER_CAPABILITIES="compute,video,utility" \
  -e NVIDIA_VISIBLE_DEVICES="all" \
  --mount type=bind,source=/zfs/apps/plex/database,destination=/config,readonly=false \
  --mount type=tmpfs,destination=/transcode,tmpfs-size=10G \
  --mount type=bind,source=/zfs/storage/media,destination=/data,readonly=false \
  --cpus 6 \
  --memory 8g \
  --memory-swap 9g \
  plexinc/pms-docker
# docker info
Client:
 Context:    default
 Debug Mode: false
 Plugins:
  app: Docker App (Docker Inc., v0.9.1-beta3)
  buildx: Docker Buildx (Docker Inc., v0.7.1-docker)

Server:
 Containers: 52
  Running: 51
  Paused: 0
  Stopped: 1
 Images: 72
 Server Version: 20.10.12
 Storage Driver: overlay2
  Backing Filesystem: xfs
  Supports d_type: true
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: local
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux nvidia runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 7b11cfaabd73bb80907dd23182b9347b4245eb5d
 runc version: v1.0.2-0-g52b36a2
 init version: de40ad0
 Security Options:
  apparmor
  seccomp
   Profile: default
  cgroupns
 Kernel Version: 5.10.0-11-amd64
 Operating System: Debian GNU/Linux 11 (bullseye)
 OSType: linux
 Architecture: x86_64
 CPUs: 16
 Total Memory: 125.5GiB
 Name: athena
 ID: BDEV:YXXO:FAGS:OHLJ:N64Q:LQTC:JJDK:TCKM:WN6T:P4HG:LKEQ:HHTE
 Docker Root Dir: /zfs/apps/docker
 Debug Mode: false
 Username: mbentley
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Registry Mirrors:
  https://registry-mirror.casa.mbentley.net/
 Live Restore Enabled: false

Steps to reproduce

Not sure I can specifically reproduce this or know how to trigger it.

  1. Run a container; the memory usage stats do not seem to match what Docker sees:

(screenshot: Grafana memory usage graph for the plex container)

The container does not seem to be having memory issues:

# docker exec -it plex top -bn1
top - 11:59:17 up 8 days, 23:37,  0 users,  load average: 0.95, 1.00, 1.04
Tasks:   9 total,   1 running,   8 sleeping,   0 stopped,   0 zombie
%Cpu(s): 20.0 us,  0.0 sy,  0.0 ni, 80.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem : 128494.8 total,  16400.5 free,  95074.0 used,  17020.2 buff/cache
MiB Swap:   4096.0 total,   4032.0 free,     64.0 used.  32140.6 avail Mem

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
      1 root      20   0     204      4      0 S   0.0   0.0   0:00.04 s6-svsc+
     39 root      20   0     204      4      0 S   0.0   0.0   0:00.00 s6-supe+
    235 root      20   0     204      4      0 S   0.0   0.0   0:00.00 s6-supe+
    238 plex      20   0 4759960 116800  37820 S   0.0   0.1  78:39.32 PMS Run+
    275 plex      35  15   91604  72204  10880 S   0.0   0.1   5:14.15 Plex Sc+
    316 plex      20   0   40200  10896   9308 S   0.0   0.0   3:50.92 Plex Tu+
  25446 plex      20   0    2564   1008    884 S   0.0   0.0   1:22.84 EasyAud+
  36575 plex      20   0   46264  17788   9292 S   0.0   0.0   8:27.50 Plex Tr+
  36679 root      20   0    7728   3356   2916 R   0.0   0.0   0:00.03 top

Influx query in Grafana:

SELECT mean("usage_percent") FROM "docker_container_mem" WHERE ("container_name" =~ /^$ContainerName$/) AND $timeFilter GROUP BY time($__interval), "container_name" fill(null)

Expected behavior

Stats would match what docker stats shows:

# docker stats plex --no-stream
CONTAINER ID   NAME      CPU %     MEM USAGE / LIMIT   MEM %     NET I/O          BLOCK I/O         PIDS
64c8abf723d9   plex      6.99%     198MiB / 8GiB       2.42%     799MB / 45.4GB   12.2MB / 12.4GB   89
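
The gap comes from what each side counts: on a cgroup v2 host (Cgroup Version: 2 in docker info above), the Docker CLI subtracts the page cache (inactive_file) from the raw usage before computing MEM USAGE and MEM %, whereas Telegraf 1.21.3's usage_percent is based on the raw usage. Plugging the values from the docker_container_mem metric above into that formula reproduces both columns (a back-of-the-envelope check, not Telegraf code):

# awk 'BEGIN { used = 8587341824 - 8379084800; limit = 8589934592;
  printf "%.1fMiB / %.2f%%\n", used / 1048576, used / limit * 100 }'
198.6MiB / 2.42%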

Actual behavior

(screenshot: Grafana graph of the plex container's usage_percent, sitting near 100%)

Additional info

No response

mbentley added the bug label Feb 13, 2022
powersj (Contributor) commented Feb 14, 2022

I believe this is fixed by #10491. You could try one of our nightlies, or wait for our next bug fix release (this week, I believe) and try that out.

Thanks!

powersj closed this as completed Feb 14, 2022
mbentley (Author) commented Feb 14, 2022

Ah shoot, my search-fu failed me 🤦 Sorry and thanks for digging up the PR mentioned, will take a look.

powersj (Contributor) commented Feb 14, 2022

> Ah shoot, my search-fu failed me 🤦 Sorry and thanks for digging up the PR mentioned, will take a look.

No worries! If that doesn't seem to resolve the issue for you, please do let us know.
