Some containers are not stopped during service update #1393

gileri · 2018-05-21T15:15:15Z

Summary

During a service update, some containers that should be stopped (and that ECS reports as stopped) keep running.

Description

We noticed that certain containers are not stopped during regular ECS deployments (new task definitions containing image changes).
To narrow down what could fail, we updated the problematic service using:

aws ecs update-service --cluster <cluster> --service <service> --force-new-deployment

The service configuration should not allow concurrent containers anyway:

Minimum healthy percent = 0%
Maximum healthy percent = 100%
Desired count = 1
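
As a rough illustration of why this configuration should rule out concurrent tasks, the sketch below computes the task-count limits implied by the two percentages. It assumes the rounding described in the ECS rolling-update documentation (the minimum is rounded up, the maximum is rounded down). The function name `deployment_bounds` is hypothetical and is not part of any AWS SDK.

```python
def deployment_bounds(desired_count, min_healthy_percent, max_percent):
    """Return (lower, upper) limits on running tasks during a rolling update.

    Assumption: the lower limit is rounded up and the upper limit is
    rounded down. Integer arithmetic avoids floating-point surprises.
    """
    lower = (desired_count * min_healthy_percent + 99) // 100  # ceiling
    upper = (desired_count * max_percent) // 100               # floor
    return lower, upper


# Values from this issue: the old task may be stopped first (lower = 0),
# and no more than one task may run at any time (upper = 1).
print(deployment_bounds(1, 0, 100))
```

With these settings the scheduler has to stop the old task before starting the new one, so two containers from the same service running side by side is unexpected.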

Expected Behavior

Containers that are part of a stopped task should always be stopped.

Observed Behavior

Certain containers are sometimes not stopped (in roughly 1 in 5 service updates) and survive future service updates.
ECS nevertheless reports them as stopped:

"KnownExitCode": null,
"KnownStatus": "STOPPED",
"SentStatus": "STOPPED",

I went through the ECS and Docker logs but could not pinpoint the source of this issue.

Environment Details

AMI: Amazon ECS-Optimized Amazon Linux AMI 2018.03.l
ECS agent version 1.17.3
Cluster of one EC2 instance, but the issue also occurs on multi-instance clusters.

The ECS instance was rebooted, and docker system prune --all was run minutes before the occurrence described here.

Docker info:

Containers: 131
 Running: 25
 Paused: 0
 Stopped: 106
Images: 13
Server Version: 17.12.1-ce
Storage Driver: devicemapper
 Pool Name: docker-docker--pool
 Pool Blocksize: 524.3kB
 Base Device Size: 10.74GB
 Backing Filesystem: ext4
 Udev Sync Supported: true
 Data Space Used: 9.06GB
 Data Space Total: 23.33GB
 Data Space Available: 14.27GB
 Metadata Space Used: 9.212MB
 Metadata Space Total: 33.55MB
 Metadata Space Available: 24.34MB
 Thin Pool Minimum Free Space: 2.333GB
 Deferred Removal Enabled: true
 Deferred Deletion Enabled: true
 Deferred Deleted Device Count: 0
 Library Version: 1.02.135-RHEL7 (2016-11-16)
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 9b55aab90508bd389d7654c4baf173a981477d55
runc version: 9f9c96235cc97674e935002fc3d78361b696a69e
init version: 949e6fa
Security Options:
 seccomp
  Profile: default
Kernel Version: 4.14.33-51.37.amzn1.x86_64
Operating System: Amazon Linux AMI 2018.03
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 7.792GiB
Name: ip-172-20-0-12
ID: UB62:W2YM:DYBG:BBT3:BDO7:VMEH:3XC5:REFC:FF7X:MV33:VLNH:MKIN
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false

Supporting Log Snippets

I've collected all logs using https://github.com/awslabs/ecs-logs-collector, but anonymizing them is proving time-consuming. Please don't hesitate to ask for more details or logs.

Only df56811ff7a should be running (docker ps --no-trunc output):

df56811ff7acd478f1f887a2b3e78b5e8c1f28bee9dc6d6347c04c169ac53907   <repo>.amazonaws.com/php:PR-1884-1-stage-cron   "/usr/local/bin/entrypoint.sh /usr/bin/php /var/www/example/symfony mails:task --env=stage"                        3 minutes ago       Up 3 minutes                  9000/tcp                                                                                     ecs-example-stage-cron-1-179-example-stage-mailstask-a6a1dcf1f3b681be8101
257e5551a32366def005dd45979d0f19cdb30f09f6302af19a1c7b29238e0dae   <repo>.amazonaws.com/php:PR-1884-1-stage-cron   "/usr/local/bin/entrypoint.sh /usr/bin/php /var/www/example/symfony mails:task --env=stage"                        4 minutes ago       Up 4 minutes                  9000/tcp                                                                                     ecs-example-stage-cron-1-179-example-stage-mailstask-dea2dccf9294c0bc1c00
f54fa2bbe2a84e3af73a8c917772f458f54ae7fefe5c6c095977faf7ca00837c   <repo>.amazonaws.com/php:PR-1884-1-stage-cron   "/usr/local/bin/entrypoint.sh /usr/bin/php /var/www/example/symfony mails:task --env=stage"                        5 minutes ago       Up 5 minutes                  9000/tcp                                                                                     ecs-example-stage-cron-1-179-example-stage-mailstask-9c9d93aee2a0b1dba201

ECS data for the task containing container 257e5551a323:

{
    "Arn": "arn:aws:ecs:eu-central-1:123456890123:task/da8ea8c0-a080-44cd-9a32-6f850b797da3",
    "Containers": [
        {
            "ApplyingError": null,
            "Command": [
                "/usr/bin/php",
                "/var/www/website/symfony",
                "mails:task",
                "--env=stage"
            ],
            "Cpu": 0,
            "EntryPoint": null,
            "Essential": true,
            "Image": "repo.amazonaws.com/php:PR-1884-1-stage-cron",
            "ImageID": "sha256:0255089b72be4ea548854ea22a742b9228abd7d1f64b23b6307df3f8feadd2bb",
            "IsInternal": "NORMAL",
            "KnownExitCode": null,
            "KnownPortBindings": null,
            "KnownStatus": "STOPPED",
            "Links": null,
            "LogsAuthStrategy": "",
            "Memory": 150,
            "Name": "env-stage-mails_task",
            "RunDependencies": null,
            "SentStatus": "STOPPED",
            "TransitionDependencySet": {
                "ContainerDependencies": null
            },
            "desiredStatus": "STOPPED",
            "dockerConfig": {
                "config": "{}",
                "hostConfig": "{\"DnsSearch\":[\"int.stage.example.com\"],\"LogConfig\":{\"Type\":\"awslogs\",\"Config\":{\"awslogs-group\":\"env-stage-workers\",\"awslogs-stream\":\"php/env-stage-mails_task/da8ea8c0-a080-44cd-9a32-6f850b797da3\",\"awslogs-region\":\"eu-central-1\"}},\"MemoryReservation\":157286400,\"CapAdd\":[],\"CapDrop\":[]}",
                "version": "1.21"
            },
            "environment": {
                "PHP_MEMORY_LIMIT": "128M",
                "TZ": "Europe/Paris"
            },
            "metadataFileUpdated": false,
            "mountPoints": [
                {
                    "containerPath": "/var/www/website/web/uploads",
                    "readOnly": false,
                    "sourceVolume": "example-uploads"
                }
            ],
            "overrides": {
                "command": null
            },
            "portMappings": [],
            "registryAuthentication": {
                "ecrAuthData": {
                    "endpointOverride": "",
                    "region": "eu-central-1",
                    "registryId": "123456890123",
                    "useExecutionRole": false
                },
                "type": "ecr"
            },
            "volumesFrom": []
        },
        {
            "ApplyingError": null,
            "Command": [
                "/usr/bin/php",
                "/var/www/website/symfony",
                "utils:purgeNotification",
                "--env=stage"
            ],
            "Cpu": 0,
            "EntryPoint": null,
            "Essential": true,
            "Image": "repo.amazonaws.com/php:PR-1884-1-stage-cron",
            "ImageID": "sha256:0255089b72be4ea548854ea22a742b9228abd7d1f64b23b6307df3f8feadd2bb",
            "IsInternal": "NORMAL",
            "KnownExitCode": 137,
            "KnownPortBindings": null,
            "KnownStatus": "STOPPED",
            "Links": null,
            "LogsAuthStrategy": "",
            "Memory": 150,
            "Name": "env-stage-utils_purgeNotification",
            "RunDependencies": null,
            "SentStatus": "STOPPED",
            "TransitionDependencySet": {
                "ContainerDependencies": null
            },
            "desiredStatus": "STOPPED",
            "dockerConfig": {
                "config": "{}",
                "hostConfig": "{\"DnsSearch\":[\"int.stage.example.com\"],\"LogConfig\":{\"Type\":\"awslogs\",\"Config\":{\"awslogs-group\":\"env-stage-workers\",\"awslogs-stream\":\"php/env-stage-utils_purgeNotification/da8ea8c0-a080-44cd-9a32-6f850b797da3\",\"awslogs-region\":\"eu-central-1\"}},\"MemoryReservation\":157286400,\"CapAdd\":[],\"CapDrop\":[]}",
                "version": "1.21"
            },
            "environment": {
                "PHP_MEMORY_LIMIT": "128M",
                "TZ": "Europe/Paris"
            },
            "metadataFileUpdated": false,
            "mountPoints": [
                {
                    "containerPath": "/var/www/website/web/uploads",
                    "readOnly": false,
                    "sourceVolume": "example-uploads"
                }
            ],
            "overrides": {
                "command": null
            },
            "portMappings": [],
            "registryAuthentication": {
                "ecrAuthData": {
                    "endpointOverride": "",
                    "region": "eu-central-1",
                    "registryId": "123456890123",
                    "useExecutionRole": false
                },
                "type": "ecr"
            },
            "volumesFrom": []
        },
        {
            "ApplyingError": null,
            "Command": [
                "/usr/bin/php",
                "/var/www/website/symfony",
                "mails:dailyWorkshopRecap",
                "--env=stage"
            ],
            "Cpu": 0,
            "EntryPoint": null,
            "Essential": true,
            "Image": "repo.amazonaws.com/php:PR-1884-1-stage-cron",
            "ImageID": "sha256:0255089b72be4ea548854ea22a742b9228abd7d1f64b23b6307df3f8feadd2bb",
            "IsInternal": "NORMAL",
            "KnownExitCode": null,
            "KnownPortBindings": null,
            "KnownStatus": "STOPPED",
            "Links": null,
            "LogsAuthStrategy": "",
            "Memory": 150,
            "Name": "env-stage-mails_dailyWorkshopRecap",
            "RunDependencies": null,
            "SentStatus": "STOPPED",
            "TransitionDependencySet": {
                "ContainerDependencies": null
            },
            "desiredStatus": "STOPPED",
            "dockerConfig": {
                "config": "{}",
                "hostConfig": "{\"DnsSearch\":[\"int.stage.example.com\"],\"LogConfig\":{\"Type\":\"awslogs\",\"Config\":{\"awslogs-group\":\"env-stage-workers\",\"awslogs-stream\":\"php/env-stage-mails_dailyWorkshopRecap/da8ea8c0-a080-44cd-9a32-6f850b797da3\",\"awslogs-region\":\"eu-central-1\"}},\"MemoryReservation\":157286400,\"CapAdd\":[],\"CapDrop\":[]}",
                "version": "1.21"
            },
            "environment": {
                "PHP_MEMORY_LIMIT": "128M",
                "TZ": "Europe/Paris"
            },
            "metadataFileUpdated": false,
            "mountPoints": [
                {
                    "containerPath": "/var/www/website/web/uploads",
                    "readOnly": false,
                    "sourceVolume": "example-uploads"
                }
            ],
            "overrides": {
                "command": null
            },
            "portMappings": [],
            "registryAuthentication": {
                "ecrAuthData": {
                    "endpointOverride": "",
                    "region": "eu-central-1",
                    "registryId": "123456890123",
                    "useExecutionRole": false
                },
                "type": "ecr"
            },
            "volumesFrom": []
        },
        {
            "ApplyingError": null,
            "Command": [
                "/usr/bin/php",
                "/var/www/website/symfony",
                "mails:weeklyTicketingRecap",
                "--env=stage"
            ],
            "Cpu": 0,
            "EntryPoint": null,
            "Essential": true,
            "Image": "repo.amazonaws.com/php:PR-1884-1-stage-cron",
            "ImageID": "sha256:0255089b72be4ea548854ea22a742b9228abd7d1f64b23b6307df3f8feadd2bb",
            "IsInternal": "NORMAL",
            "KnownExitCode": 137,
            "KnownPortBindings": null,
            "KnownStatus": "STOPPED",
            "Links": null,
            "LogsAuthStrategy": "",
            "Memory": 150,
            "Name": "env-stage-mails_weeklyTicketingRecap",
            "RunDependencies": null,
            "SentStatus": "STOPPED",
            "TransitionDependencySet": {
                "ContainerDependencies": null
            },
            "desiredStatus": "STOPPED",
            "dockerConfig": {
                "config": "{}",
                "hostConfig": "{\"DnsSearch\":[\"int.stage.example.com\"],\"LogConfig\":{\"Type\":\"awslogs\",\"Config\":{\"awslogs-group\":\"env-stage-workers\",\"awslogs-stream\":\"php/env-stage-mails_weeklyTicketingRecap/da8ea8c0-a080-44cd-9a32-6f850b797da3\",\"awslogs-region\":\"eu-central-1\"}},\"MemoryReservation\":157286400,\"CapAdd\":[],\"CapDrop\":[]}",
                "version": "1.21"
            },
            "environment": {
                "PHP_MEMORY_LIMIT": "128M",
                "TZ": "Europe/Paris"
            },
            "metadataFileUpdated": false,
            "mountPoints": [
                {
                    "containerPath": "/var/www/website/web/uploads",
                    "readOnly": false,
                    "sourceVolume": "example-uploads"
                }
            ],
            "overrides": {
                "command": null
            },
            "portMappings": [],
            "registryAuthentication": {
                "ecrAuthData": {
                    "endpointOverride": "",
                    "region": "eu-central-1",
                    "registryId": "123456890123",
                    "useExecutionRole": false
                },
                "type": "ecr"
            },
            "volumesFrom": []
        },
        {
            "ApplyingError": null,
            "Command": [
                "/usr/bin/php",
                "/var/www/website/symfony",
                "utils:manageAssoAccount",
                "--env=stage"
            ],
            "Cpu": 0,
            "EntryPoint": null,
            "Essential": true,
            "Image": "repo.amazonaws.com/php:PR-1884-1-stage-cron",
            "ImageID": "sha256:0255089b72be4ea548854ea22a742b9228abd7d1f64b23b6307df3f8feadd2bb",
            "IsInternal": "NORMAL",
            "KnownExitCode": null,
            "KnownPortBindings": null,
            "KnownStatus": "STOPPED",
            "Links": null,
            "LogsAuthStrategy": "",
            "Memory": 150,
            "Name": "env-stage-utils_manageAssoAccount",
            "RunDependencies": null,
            "SentStatus": "STOPPED",
            "TransitionDependencySet": {
                "ContainerDependencies": null
            },
            "desiredStatus": "STOPPED",
            "dockerConfig": {
                "config": "{}",
                "hostConfig": "{\"DnsSearch\":[\"int.stage.example.com\"],\"LogConfig\":{\"Type\":\"awslogs\",\"Config\":{\"awslogs-group\":\"env-stage-workers\",\"awslogs-stream\":\"php/env-stage-utils_manageAssoAccount/da8ea8c0-a080-44cd-9a32-6f850b797da3\",\"awslogs-region\":\"eu-central-1\"}},\"MemoryReservation\":157286400,\"CapAdd\":[],\"CapDrop\":[]}",
                "version": "1.21"
            },
            "environment": {
                "PHP_MEMORY_LIMIT": "128M",
                "TZ": "Europe/Paris"
            },
            "metadataFileUpdated": false,
            "mountPoints": [
                {
                    "containerPath": "/var/www/website/web/uploads",
                    "readOnly": false,
                    "sourceVolume": "example-uploads"
                }
            ],
            "overrides": {
                "command": null
            },
            "portMappings": [],
            "registryAuthentication": {
                "ecrAuthData": {
                    "endpointOverride": "",
                    "region": "eu-central-1",
                    "registryId": "123456890123",
                    "useExecutionRole": false
                },
                "type": "ecr"
            },
            "volumesFrom": []
        },
        {
            "ApplyingError": null,
            "Command": [
                "/usr/bin/php",
                "/var/www/website/symfony",
                "mails:profileNotification",
                "--env=stage"
            ],
            "Cpu": 0,
            "EntryPoint": null,
            "Essential": true,
            "Image": "repo.amazonaws.com/php:PR-1884-1-stage-cron",
            "ImageID": "sha256:0255089b72be4ea548854ea22a742b9228abd7d1f64b23b6307df3f8feadd2bb",
            "IsInternal": "NORMAL",
            "KnownExitCode": 0,
            "KnownPortBindings": null,
            "KnownStatus": "STOPPED",
            "Links": null,
            "LogsAuthStrategy": "",
            "Memory": 150,
            "Name": "env-stage-mails_profileNotification",
            "RunDependencies": null,
            "SentStatus": "STOPPED",
            "TransitionDependencySet": {
                "ContainerDependencies": null
            },
            "desiredStatus": "STOPPED",
            "dockerConfig": {
                "config": "{}",
                "hostConfig": "{\"DnsSearch\":[\"int.stage.example.com\"],\"LogConfig\":{\"Type\":\"awslogs\",\"Config\":{\"awslogs-group\":\"env-stage-workers\",\"awslogs-stream\":\"php/env-stage-mails_profileNotification/da8ea8c0-a080-44cd-9a32-6f850b797da3\",\"awslogs-region\":\"eu-central-1\"}},\"MemoryReservation\":157286400,\"CapAdd\":[],\"CapDrop\":[]}",
                "version": "1.21"
            },
            "environment": {
                "PHP_MEMORY_LIMIT": "128M",
                "TZ": "Europe/Paris"
            },
            "metadataFileUpdated": false,
            "mountPoints": [
                {
                    "containerPath": "/var/www/website/web/uploads",
                    "readOnly": false,
                    "sourceVolume": "example-uploads"
                }
            ],
            "overrides": {
                "command": null
            },
            "portMappings": [],
            "registryAuthentication": {
                "ecrAuthData": {
                    "endpointOverride": "",
                    "region": "eu-central-1",
                    "registryId": "123456890123",
                    "useExecutionRole": false
                },
                "type": "ecr"
            },
            "volumesFrom": []
        }
    ],
    "DesiredStatus": "STOPPED",
    "ENI": null,
    "ExecutionStoppedAt": "2018-05-21T13:26:15.969730897Z",
    "Family": "env-stage-cron-1",
    "KnownStatus": "STOPPED",
    "KnownTime": "2018-05-21T13:26:46.556684068Z",
    "MemoryCPULimitsEnabled": true,
    "PullStartedAt": "2018-05-21T13:26:13.842619731Z",
    "PullStoppedAt": "2018-05-21T13:26:14.100596311Z",
    "SentStatus": "STOPPED",
    "StartSequenceNumber": 1746,
    "StopSequenceNumber": 0,
    "Version": "179",
    "executionCredentialsID": "",
    "volumes": [
        {
            "host": {
                "sourcePath": "/var/example-uploads"
            },
            "name": "example-uploads"
        }
    ]
}

Container info for 257e5551a323:

[
    {
        "Id": "257e5551a32366def005dd45979d0f19cdb30f09f6302af19a1c7b29238e0dae",
        "Created": "2018-05-21T13:26:14.087419271Z",
        "Path": "/usr/local/bin/entrypoint.sh",
        "Args": [
            "/usr/bin/php",
            "/var/www/example/symfony",
            "mails:task",
            "--env=stage"
        ],
        "State": {
            "Status": "running",
            "Running": true,
            "Paused": false,
            "Restarting": false,
            "OOMKilled": false,
            "Dead": false,
            "Pid": 31682,
            "ExitCode": 0,
            "Error": "",
            "StartedAt": "2018-05-21T13:26:16.284080656Z",
            "FinishedAt": "0001-01-01T00:00:00Z"
        },
        "Image": "sha256:0255089b72be4ea548854ea22a742b9228abd7d1f64b23b6307df3f8feadd2bb",
        "ResolvConfPath": "/var/lib/docker/containers/257e5551a32366def005dd45979d0f19cdb30f09f6302af19a1c7b29238e0dae/resolv.conf",
        "HostnamePath": "/var/lib/docker/containers/257e5551a32366def005dd45979d0f19cdb30f09f6302af19a1c7b29238e0dae/hostname",
        "HostsPath": "/var/lib/docker/containers/257e5551a32366def005dd45979d0f19cdb30f09f6302af19a1c7b29238e0dae/hosts",
        "LogPath": "",
        "Name": "/ecs-example-stage-cron-1-179-example-stage-mailstask-dea2dccf9294c0bc1c00",
        "RestartCount": 0,
        "Driver": "devicemapper",
        "Platform": "linux",
        "MountLabel": "",
        "ProcessLabel": "",
        "AppArmorProfile": "",
        "ExecIDs": null,
        "HostConfig": {
            "Binds": [
                "/var/example-uploads:/var/www/example/web/uploads"
            ],
            "ContainerIDFile": "",
            "LogConfig": {
                "Type": "awslogs",
                "Config": {
                    "awslogs-group": "example-stage-workers",
                    "awslogs-region": "eu-central-1",
                    "awslogs-stream": "php/example-stage-mails_task/da8ea8c0-a080-44cd-9a32-6f850b797da3"
                }
            },
            "NetworkMode": "default",
            "PortBindings": null,
            "RestartPolicy": {
                "Name": "",
                "MaximumRetryCount": 0
            },
            "AutoRemove": false,
            "VolumeDriver": "",
            "VolumesFrom": null,
            "CapAdd": null,
            "CapDrop": null,
            "Dns": null,
            "DnsOptions": null,
            "DnsSearch": [
                "int.stage.example.com"
            ],
            "ExtraHosts": null,
            "GroupAdd": null,
            "IpcMode": "shareable",
            "Cgroup": "",
            "Links": null,
            "OomScoreAdj": 0,
            "PidMode": "",
            "Privileged": false,
            "PublishAllPorts": false,
            "ReadonlyRootfs": false,
            "SecurityOpt": null,
            "UTSMode": "",
            "UsernsMode": "",
            "ShmSize": 67108864,
            "Runtime": "runc",
            "ConsoleSize": [
                0,
                0
            ],
            "Isolation": "",
            "CpuShares": 2,
            "Memory": 157286400,
            "NanoCpus": 0,
            "CgroupParent": "/ecs/da8ea8c0-a080-44cd-9a32-6f850b797da3",
            "BlkioWeight": 0,
            "BlkioWeightDevice": null,
            "BlkioDeviceReadBps": null,
            "BlkioDeviceWriteBps": null,
            "BlkioDeviceReadIOps": null,
            "BlkioDeviceWriteIOps": null,
            "CpuPeriod": 0,
            "CpuQuota": 0,
            "CpuRealtimePeriod": 0,
            "CpuRealtimeRuntime": 0,
            "CpusetCpus": "",
            "CpusetMems": "",
            "Devices": null,
            "DeviceCgroupRules": null,
            "DiskQuota": 0,
            "KernelMemory": 0,
            "MemoryReservation": 157286400,
            "MemorySwap": 314572800,
            "MemorySwappiness": 0,
            "OomKillDisable": false,
            "PidsLimit": 0,
            "Ulimits": [
                {
                    "Name": "nofile",
                    "Hard": 4096,
                    "Soft": 1024
                }
            ],
            "CpuCount": 0,
            "CpuPercent": 0,
            "IOMaximumIOps": 0,
            "IOMaximumBandwidth": 0
        },
        "GraphDriver": {
            "Data": {
                "DeviceId": "6798",
                "DeviceName": "docker-202:1-263286-a55538d948074e8082593e29d08ea4d2d57bd81fe3efa500d2f2d4a54c080ad8",
                "DeviceSize": "10737418240"
            },
            "Name": "devicemapper"
        },
        "Mounts": [
            {
                "Type": "bind",
                "Source": "/var/example-uploads",
                "Destination": "/var/www/example/web/uploads",
                "Mode": "",
                "RW": true,
                "Propagation": "rprivate"
            },
            {
                "Type": "volume",
                "Name": "ef69e07b38ac24fb9d371db44261cb259f8627b99a0bcbb1976a039c1c31deb1",
                "Source": "/var/lib/docker/volumes/ef69e07b38ac24fb9d371db44261cb259f8627b99a0bcbb1976a039c1c31deb1/_data",
                "Destination": "/var/composer-cache",
                "Driver": "local",
                "Mode": "",
                "RW": true,
                "Propagation": ""
            }
        ],
        "Config": {
            "Hostname": "257e5551a323",
            "Domainname": "",
            "User": "",
            "AttachStdin": false,
            "AttachStdout": false,
            "AttachStderr": false,
            "ExposedPorts": {
                "9000/tcp": {}
            },
            "Tty": false,
            "OpenStdin": false,
            "StdinOnce": false,
            "Env": [
                "PHP_MEMORY_LIMIT=128M",
                "TZ=Europe/Paris",
                "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
            ],
            "Cmd": [
                "/usr/bin/php",
                "/var/www/example/symfony",
                "mails:task",
                "--env=stage"
            ],
            "Image": "124766415242.dkr.ecr.eu-central-1.amazonaws.com/php:PR-1884-1-stage-cron",
            "Volumes": {
                "/var/composer-cache": {},
                "/var/www/example/web/uploads": {}
            },
            "WorkingDir": "/var/www/example",
            "Entrypoint": [
                "/usr/local/bin/entrypoint.sh"
            ],
            "OnBuild": null,
            "Labels": {
                "com.amazonaws.ecs.cluster": "example-stage",
                "com.amazonaws.ecs.container-name": "example-stage-mails_task",
                "com.amazonaws.ecs.task-arn": "arn:aws:ecs:eu-central-1:124766415242:task/da8ea8c0-a080-44cd-9a32-6f850b797da3",
                "com.amazonaws.ecs.task-definition-family": "example-stage-cron-1",
                "com.amazonaws.ecs.task-definition-version": "179"
            }
        },
        "NetworkSettings": {
            "Bridge": "",
            "SandboxID": "c9a25205efbd78c301295f7d950e30468b4b80259c6e99703ec947299fe23612",
            "HairpinMode": false,
            "LinkLocalIPv6Address": "",
            "LinkLocalIPv6PrefixLen": 0,
            "Ports": {
                "9000/tcp": null
            },
            "SandboxKey": "/var/run/docker/netns/c9a25205efbd",
            "SecondaryIPAddresses": null,
            "SecondaryIPv6Addresses": null,
            "EndpointID": "ee47c3b433f27926a17b9c9c4c222f00dfa14a3614c337b71c9f1f3b18c704d1",
            "Gateway": "172.17.0.1",
            "GlobalIPv6Address": "",
            "GlobalIPv6PrefixLen": 0,
            "IPAddress": "172.17.0.7",
            "IPPrefixLen": 16,
            "IPv6Gateway": "",
            "MacAddress": "02:42:ac:11:00:07",
            "Networks": {
                "bridge": {
                    "IPAMConfig": null,
                    "Links": null,
                    "Aliases": null,
                    "NetworkID": "dc279972e1950c8f86f380f434f0eb73663c9ab5f40809db776d361833b95f20",
                    "EndpointID": "ee47c3b433f27926a17b9c9c4c222f00dfa14a3614c337b71c9f1f3b18c704d1",
                    "Gateway": "172.17.0.1",
                    "IPAddress": "172.17.0.7",
                    "IPPrefixLen": 16,
                    "IPv6Gateway": "",
                    "GlobalIPv6Address": "",
                    "GlobalIPv6PrefixLen": 0,
                    "MacAddress": "02:42:ac:11:00:07",
                    "DriverOpts": null
                }
            }
        }
    }
]
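
The mismatch visible in the two dumps above (agent state STOPPED, Docker state running) can be checked mechanically. The sketch below is a minimal example, assuming the agent task state and the `docker inspect` output have already been loaded as Python objects with the field names shown above. The function name `find_unstopped_containers` and the sample data are hypothetical; the anonymized dumps use inconsistent placeholder names, so the sample uses one consistent set.

```python
def find_unstopped_containers(agent_tasks, docker_inspect):
    """Return IDs of containers that Docker reports as running although
    the agent state for the same task and container says STOPPED."""
    stopped = set()
    for task in agent_tasks:
        for container in task.get("Containers") or []:
            if container.get("KnownStatus") == "STOPPED":
                stopped.add((task.get("Arn"), container.get("Name")))

    unstopped = []
    for info in docker_inspect:
        labels = (info.get("Config") or {}).get("Labels") or {}
        # The agent labels each container with its task ARN and name.
        key = (
            labels.get("com.amazonaws.ecs.task-arn"),
            labels.get("com.amazonaws.ecs.container-name"),
        )
        running = (info.get("State") or {}).get("Running", False)
        if running and key in stopped:
            unstopped.append(info["Id"])
    return unstopped


# Hypothetical sample data, trimmed to the fields the check needs.
TASK_ARN = "arn:aws:ecs:eu-central-1:123456890123:task/da8ea8c0-a080-44cd-9a32-6f850b797da3"
agent_tasks = [{
    "Arn": TASK_ARN,
    "Containers": [{"Name": "env-stage-mails_task", "KnownStatus": "STOPPED"}],
}]
docker_inspect = [{
    "Id": "257e5551a323",
    "State": {"Running": True},
    "Config": {"Labels": {
        "com.amazonaws.ecs.task-arn": TASK_ARN,
        "com.amazonaws.ecs.container-name": "env-stage-mails_task",
    }},
}]
print(find_unstopped_containers(agent_tasks, docker_inspect))
```

Matching on the two labels rather than on the container name keeps the check independent of the generated suffix in names such as ecs-example-stage-cron-1-179-example-stage-mailstask-dea2dccf9294c0bc1c00.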


richardpen · 2018-05-21T22:41:33Z

@gileri I tried to reproduce this with a service that has desired count 1, minimumHealthyPercent=50 and maximumHealthyPercent=100, but could not. Could you share the service name, its maximumHealthyPercent and minimumHealthyPercent, and the agent logs from an instance that experienced the issue? You can send them to me: penyin (at) amazon.com

Thanks,
Peng

gileri · 2018-05-22T09:11:30Z

Sorry, I formatted the code blocks incorrectly; they are now fixed and include:

Minimum healthy percent = 0%
Maximum healthy percent = 100%
Desired count = 1

I removed sensitive information from the ecs-agent logs, which can be downloaded here.

The service name is example-stage-cron-1.

haikuoliu · 2018-05-31T18:50:46Z

Hi @gileri,

Sorry for my late response.

I investigated the logs and found the root cause: the task is started and then stopped immediately. The Agent marks the container as stopped, and then Docker sends a change event to the Agent indicating that the container is running. The Agent is supposed to stop the container again in this case, which is handled by this goroutine. However, for some reason it only handles this once.
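
The intended behaviour described above can be modelled with a small toy example. This is only a sketch of the expected reaction to a late "running" event, not the ECS agent's actual code (the agent is written in Go); the class `ContainerReconciler` and its methods are hypothetical names chosen for illustration.

```python
class ContainerReconciler:
    """Toy model: once a stop has been requested, every later RUNNING
    event from Docker must trigger another stop, not just the first."""

    def __init__(self, stop_container):
        self.desired_status = "RUNNING"
        self.known_status = "NONE"
        self._stop_container = stop_container  # callable(container_id)

    def request_stop(self, container_id):
        # The task was stopped right after it started.
        self.desired_status = "STOPPED"
        self._stop_container(container_id)
        self.known_status = "STOPPED"

    def on_docker_event(self, container_id, status):
        # Docker may report RUNNING after the stop was already recorded.
        self.known_status = status
        if self.desired_status == "STOPPED" and status == "RUNNING":
            self._stop_container(container_id)


# Record stop calls instead of talking to a real Docker daemon.
stops = []
reconciler = ContainerReconciler(stops.append)
reconciler.request_stop("257e5551a323")
reconciler.on_docker_event("257e5551a323", "RUNNING")  # late event
reconciler.on_docker_event("257e5551a323", "RUNNING")  # and another one
print(len(stops))  # 3: the initial stop plus one per late event
```

In this model a handler that reacted to only the first late event would leave the container running after the second one, which matches the symptom reported in this issue.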

I will mark it as a bug.

Thanks,
Haikuo

gileri · 2018-05-31T20:41:07Z

Thank you @haikuoliu for the analysis! I'm not familiar with Go or the ECS code, but I can certainly provide additional debug logs or run tests.

haikuoliu · 2018-05-31T21:05:18Z

@gileri

I think the logs you provided are enough. The bug seems clear from them, and we will let you know when we fix it.

I saw from the logs that the containers in your task are stopped very shortly after they start, which triggers the bug where a container cannot be stopped. Avoiding that situation would be a mitigation.

Thanks for bringing this to our attention!

adnxn · 2018-08-07T21:24:02Z

closing issue, fix is included with latest release.

gileri changed the title ~~Containers are not stopped~~ Some containers are not stopped during service update May 21, 2018

richardpen added the more info needed label May 21, 2018

richardpen removed the more info needed label May 22, 2018

haikuoliu added the kind/bug label May 31, 2018

haikuoliu added the scope/ECS Agent label May 31, 2018

haikuoliu mentioned this issue Jul 11, 2018

engine: don't stop container when applied status is running #1446

Merged

8 tasks

haikuoliu added this to the 1.20.0 milestone Jul 25, 2018

adnxn closed this as completed Aug 7, 2018
