Some containers are not stopped during service update #1393

Closed
gileri opened this issue May 21, 2018 · 6 comments

@gileri

gileri commented May 21, 2018

Summary

During a service update, some containers that should be stopped (and that ECS reports as stopped) keep running.

Description

We noticed that certain containers are not stopped during regular ECS deployments (new task definitions containing image changes).
To narrow down what could fail, we reproduced the problem with a forced service update:

aws ecs update-service --cluster <cluster> --service <service> --force-new-deployment

This service should not allow concurrent containers anyway:

Minimum healthy percent = 0%
Maximum percent = 100%
Desired count = 1
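
As a quick sanity check of that claim, here is a minimal sketch of the arithmetic. It assumes ECS rounds the lower bound up and the upper bound down, as its documentation describes; `deployment_bounds` is a hypothetical helper written for this illustration, not part of any AWS SDK.

```python
def deployment_bounds(desired_count, minimum_healthy_percent, maximum_percent):
    """Return (min_running, max_running) task counts allowed during an update.

    Hypothetical helper: integer arithmetic is used so that the lower bound
    is rounded up and the upper bound is rounded down.
    """
    min_running = (desired_count * minimum_healthy_percent + 99) // 100  # ceil
    max_running = (desired_count * maximum_percent) // 100               # floor
    return min_running, max_running


# Settings from this issue: desired count 1, 0% minimum healthy, 100% maximum.
print(deployment_bounds(desired_count=1, minimum_healthy_percent=0, maximum_percent=100))
# prints (0, 1)
```

With these settings at most one task should be running at any time, so several long-lived containers from the same service on one instance are unexpected.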

Expected Behavior

Containers that are part of a stopped task should always be stopped.

Observed Behavior

Certain containers are sometimes not stopped (in roughly 1 in 5 service updates) and then survive later service updates.
ECS reports them as stopped:

"KnownExitCode": null,
"KnownStatus": "STOPPED",
"SentStatus": "STOPPED",

I went through the ECS and Docker logs but could not pinpoint the source of this issue.

Environment Details

AMI : Amazon ECS-Optimized Amazon Linux AMI 2018.03.l
ECS agent version 1.17.3
Cluster of one EC2 instance, but the issue also occurs on multi-instance clusters.

The ECS instance was rebooted, and docker system prune --all was run minutes before the occurrence described here.

Docker info :

Containers: 131
 Running: 25
 Paused: 0
 Stopped: 106
Images: 13
Server Version: 17.12.1-ce
Storage Driver: devicemapper
 Pool Name: docker-docker--pool
 Pool Blocksize: 524.3kB
 Base Device Size: 10.74GB
 Backing Filesystem: ext4
 Udev Sync Supported: true
 Data Space Used: 9.06GB
 Data Space Total: 23.33GB
 Data Space Available: 14.27GB
 Metadata Space Used: 9.212MB
 Metadata Space Total: 33.55MB
 Metadata Space Available: 24.34MB
 Thin Pool Minimum Free Space: 2.333GB
 Deferred Removal Enabled: true
 Deferred Deletion Enabled: true
 Deferred Deleted Device Count: 0
 Library Version: 1.02.135-RHEL7 (2016-11-16)
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 9b55aab90508bd389d7654c4baf173a981477d55
runc version: 9f9c96235cc97674e935002fc3d78361b696a69e
init version: 949e6fa
Security Options:
 seccomp
  Profile: default
Kernel Version: 4.14.33-51.37.amzn1.x86_64
Operating System: Amazon Linux AMI 2018.03
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 7.792GiB
Name: ip-172-20-0-12
ID: UB62:W2YM:DYBG:BBT3:BDO7:VMEH:3XC5:REFC:FF7X:MV33:VLNH:MKIN
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false

Supporting Log Snippets

I've collected all logs using https://github.com/awslabs/ecs-logs-collector, but anonymizing them is proving time-consuming. Please don't hesitate to ask for more details or logs.

Only df56811ff7a should be running (docker ps output):

df56811ff7acd478f1f887a2b3e78b5e8c1f28bee9dc6d6347c04c169ac53907   <repo>.amazonaws.com/php:PR-1884-1-stage-cron   "/usr/local/bin/entrypoint.sh /usr/bin/php /var/www/example/symfony mails:task --env=stage"                        3 minutes ago       Up 3 minutes                  9000/tcp                                                                                     ecs-example-stage-cron-1-179-example-stage-mailstask-a6a1dcf1f3b681be8101
257e5551a32366def005dd45979d0f19cdb30f09f6302af19a1c7b29238e0dae   <repo>.amazonaws.com/php:PR-1884-1-stage-cron   "/usr/local/bin/entrypoint.sh /usr/bin/php /var/www/example/symfony mails:task --env=stage"                        4 minutes ago       Up 4 minutes                  9000/tcp                                                                                     ecs-example-stage-cron-1-179-example-stage-mailstask-dea2dccf9294c0bc1c00
f54fa2bbe2a84e3af73a8c917772f458f54ae7fefe5c6c095977faf7ca00837c   <repo>.amazonaws.com/php:PR-1884-1-stage-cron   "/usr/local/bin/entrypoint.sh /usr/bin/php /var/www/example/symfony mails:task --env=stage"                        5 minutes ago       Up 5 minutes                  9000/tcp                                                                                     ecs-example-stage-cron-1-179-example-stage-mailstask-9c9d93aee2a0b1dba201

ECS data for the task containing container 257e5551a323:

{
    "Arn": "arn:aws:ecs:eu-central-1:123456890123:task/da8ea8c0-a080-44cd-9a32-6f850b797da3",
    "Containers": [
        {
            "ApplyingError": null,
            "Command": [
                "/usr/bin/php",
                "/var/www/website/symfony",
                "mails:task",
                "--env=stage"
            ],
            "Cpu": 0,
            "EntryPoint": null,
            "Essential": true,
            "Image": "repo.amazonaws.com/php:PR-1884-1-stage-cron",
            "ImageID": "sha256:0255089b72be4ea548854ea22a742b9228abd7d1f64b23b6307df3f8feadd2bb",
            "IsInternal": "NORMAL",
            "KnownExitCode": null,
            "KnownPortBindings": null,
            "KnownStatus": "STOPPED",
            "Links": null,
            "LogsAuthStrategy": "",
            "Memory": 150,
            "Name": "env-stage-mails_task",
            "RunDependencies": null,
            "SentStatus": "STOPPED",
            "TransitionDependencySet": {
                "ContainerDependencies": null
            },
            "desiredStatus": "STOPPED",
            "dockerConfig": {
                "config": "{}",
                "hostConfig": "{\"DnsSearch\":[\"int.stage.example.com\"],\"LogConfig\":{\"Type\":\"awslogs\",\"Config\":{\"awslogs-group\":\"env-stage-workers\",\"awslogs-stream\":\"php/env-stage-mails_task/da8ea8c0-a080-44cd-9a32-6f850b797da3\",\"awslogs-region\":\"eu-central-1\"}},\"MemoryReservation\":157286400,\"CapAdd\":[],\"CapDrop\":[]}",
                "version": "1.21"
            },
            "environment": {
                "PHP_MEMORY_LIMIT": "128M",
                "TZ": "Europe/Paris"
            },
            "metadataFileUpdated": false,
            "mountPoints": [
                {
                    "containerPath": "/var/www/website/web/uploads",
                    "readOnly": false,
                    "sourceVolume": "example-uploads"
                }
            ],
            "overrides": {
                "command": null
            },
            "portMappings": [],
            "registryAuthentication": {
                "ecrAuthData": {
                    "endpointOverride": "",
                    "region": "eu-central-1",
                    "registryId": "123456890123",
                    "useExecutionRole": false
                },
                "type": "ecr"
            },
            "volumesFrom": []
        },
        {
            "ApplyingError": null,
            "Command": [
                "/usr/bin/php",
                "/var/www/website/symfony",
                "utils:purgeNotification",
                "--env=stage"
            ],
            "Cpu": 0,
            "EntryPoint": null,
            "Essential": true,
            "Image": "repo.amazonaws.com/php:PR-1884-1-stage-cron",
            "ImageID": "sha256:0255089b72be4ea548854ea22a742b9228abd7d1f64b23b6307df3f8feadd2bb",
            "IsInternal": "NORMAL",
            "KnownExitCode": 137,
            "KnownPortBindings": null,
            "KnownStatus": "STOPPED",
            "Links": null,
            "LogsAuthStrategy": "",
            "Memory": 150,
            "Name": "env-stage-utils_purgeNotification",
            "RunDependencies": null,
            "SentStatus": "STOPPED",
            "TransitionDependencySet": {
                "ContainerDependencies": null
            },
            "desiredStatus": "STOPPED",
            "dockerConfig": {
                "config": "{}",
                "hostConfig": "{\"DnsSearch\":[\"int.stage.example.com\"],\"LogConfig\":{\"Type\":\"awslogs\",\"Config\":{\"awslogs-group\":\"env-stage-workers\",\"awslogs-stream\":\"php/env-stage-utils_purgeNotification/da8ea8c0-a080-44cd-9a32-6f850b797da3\",\"awslogs-region\":\"eu-central-1\"}},\"MemoryReservation\":157286400,\"CapAdd\":[],\"CapDrop\":[]}",
                "version": "1.21"
            },
            "environment": {
                "PHP_MEMORY_LIMIT": "128M",
                "TZ": "Europe/Paris"
            },
            "metadataFileUpdated": false,
            "mountPoints": [
                {
                    "containerPath": "/var/www/website/web/uploads",
                    "readOnly": false,
                    "sourceVolume": "example-uploads"
                }
            ],
            "overrides": {
                "command": null
            },
            "portMappings": [],
            "registryAuthentication": {
                "ecrAuthData": {
                    "endpointOverride": "",
                    "region": "eu-central-1",
                    "registryId": "123456890123",
                    "useExecutionRole": false
                },
                "type": "ecr"
            },
            "volumesFrom": []
        },
        {
            "ApplyingError": null,
            "Command": [
                "/usr/bin/php",
                "/var/www/website/symfony",
                "mails:dailyWorkshopRecap",
                "--env=stage"
            ],
            "Cpu": 0,
            "EntryPoint": null,
            "Essential": true,
            "Image": "repo.amazonaws.com/php:PR-1884-1-stage-cron",
            "ImageID": "sha256:0255089b72be4ea548854ea22a742b9228abd7d1f64b23b6307df3f8feadd2bb",
            "IsInternal": "NORMAL",
            "KnownExitCode": null,
            "KnownPortBindings": null,
            "KnownStatus": "STOPPED",
            "Links": null,
            "LogsAuthStrategy": "",
            "Memory": 150,
            "Name": "env-stage-mails_dailyWorkshopRecap",
            "RunDependencies": null,
            "SentStatus": "STOPPED",
            "TransitionDependencySet": {
                "ContainerDependencies": null
            },
            "desiredStatus": "STOPPED",
            "dockerConfig": {
                "config": "{}",
                "hostConfig": "{\"DnsSearch\":[\"int.stage.example.com\"],\"LogConfig\":{\"Type\":\"awslogs\",\"Config\":{\"awslogs-group\":\"env-stage-workers\",\"awslogs-stream\":\"php/env-stage-mails_dailyWorkshopRecap/da8ea8c0-a080-44cd-9a32-6f850b797da3\",\"awslogs-region\":\"eu-central-1\"}},\"MemoryReservation\":157286400,\"CapAdd\":[],\"CapDrop\":[]}",
                "version": "1.21"
            },
            "environment": {
                "PHP_MEMORY_LIMIT": "128M",
                "TZ": "Europe/Paris"
            },
            "metadataFileUpdated": false,
            "mountPoints": [
                {
                    "containerPath": "/var/www/website/web/uploads",
                    "readOnly": false,
                    "sourceVolume": "example-uploads"
                }
            ],
            "overrides": {
                "command": null
            },
            "portMappings": [],
            "registryAuthentication": {
                "ecrAuthData": {
                    "endpointOverride": "",
                    "region": "eu-central-1",
                    "registryId": "123456890123",
                    "useExecutionRole": false
                },
                "type": "ecr"
            },
            "volumesFrom": []
        },
        {
            "ApplyingError": null,
            "Command": [
                "/usr/bin/php",
                "/var/www/website/symfony",
                "mails:weeklyTicketingRecap",
                "--env=stage"
            ],
            "Cpu": 0,
            "EntryPoint": null,
            "Essential": true,
            "Image": "repo.amazonaws.com/php:PR-1884-1-stage-cron",
            "ImageID": "sha256:0255089b72be4ea548854ea22a742b9228abd7d1f64b23b6307df3f8feadd2bb",
            "IsInternal": "NORMAL",
            "KnownExitCode": 137,
            "KnownPortBindings": null,
            "KnownStatus": "STOPPED",
            "Links": null,
            "LogsAuthStrategy": "",
            "Memory": 150,
            "Name": "env-stage-mails_weeklyTicketingRecap",
            "RunDependencies": null,
            "SentStatus": "STOPPED",
            "TransitionDependencySet": {
                "ContainerDependencies": null
            },
            "desiredStatus": "STOPPED",
            "dockerConfig": {
                "config": "{}",
                "hostConfig": "{\"DnsSearch\":[\"int.stage.example.com\"],\"LogConfig\":{\"Type\":\"awslogs\",\"Config\":{\"awslogs-group\":\"env-stage-workers\",\"awslogs-stream\":\"php/env-stage-mails_weeklyTicketingRecap/da8ea8c0-a080-44cd-9a32-6f850b797da3\",\"awslogs-region\":\"eu-central-1\"}},\"MemoryReservation\":157286400,\"CapAdd\":[],\"CapDrop\":[]}",
                "version": "1.21"
            },
            "environment": {
                "PHP_MEMORY_LIMIT": "128M",
                "TZ": "Europe/Paris"
            },
            "metadataFileUpdated": false,
            "mountPoints": [
                {
                    "containerPath": "/var/www/website/web/uploads",
                    "readOnly": false,
                    "sourceVolume": "example-uploads"
                }
            ],
            "overrides": {
                "command": null
            },
            "portMappings": [],
            "registryAuthentication": {
                "ecrAuthData": {
                    "endpointOverride": "",
                    "region": "eu-central-1",
                    "registryId": "123456890123",
                    "useExecutionRole": false
                },
                "type": "ecr"
            },
            "volumesFrom": []
        },
        {
            "ApplyingError": null,
            "Command": [
                "/usr/bin/php",
                "/var/www/website/symfony",
                "utils:manageAssoAccount",
                "--env=stage"
            ],
            "Cpu": 0,
            "EntryPoint": null,
            "Essential": true,
            "Image": "repo.amazonaws.com/php:PR-1884-1-stage-cron",
            "ImageID": "sha256:0255089b72be4ea548854ea22a742b9228abd7d1f64b23b6307df3f8feadd2bb",
            "IsInternal": "NORMAL",
            "KnownExitCode": null,
            "KnownPortBindings": null,
            "KnownStatus": "STOPPED",
            "Links": null,
            "LogsAuthStrategy": "",
            "Memory": 150,
            "Name": "env-stage-utils_manageAssoAccount",
            "RunDependencies": null,
            "SentStatus": "STOPPED",
            "TransitionDependencySet": {
                "ContainerDependencies": null
            },
            "desiredStatus": "STOPPED",
            "dockerConfig": {
                "config": "{}",
                "hostConfig": "{\"DnsSearch\":[\"int.stage.example.com\"],\"LogConfig\":{\"Type\":\"awslogs\",\"Config\":{\"awslogs-group\":\"env-stage-workers\",\"awslogs-stream\":\"php/env-stage-utils_manageAssoAccount/da8ea8c0-a080-44cd-9a32-6f850b797da3\",\"awslogs-region\":\"eu-central-1\"}},\"MemoryReservation\":157286400,\"CapAdd\":[],\"CapDrop\":[]}",
                "version": "1.21"
            },
            "environment": {
                "PHP_MEMORY_LIMIT": "128M",
                "TZ": "Europe/Paris"
            },
            "metadataFileUpdated": false,
            "mountPoints": [
                {
                    "containerPath": "/var/www/website/web/uploads",
                    "readOnly": false,
                    "sourceVolume": "example-uploads"
                }
            ],
            "overrides": {
                "command": null
            },
            "portMappings": [],
            "registryAuthentication": {
                "ecrAuthData": {
                    "endpointOverride": "",
                    "region": "eu-central-1",
                    "registryId": "123456890123",
                    "useExecutionRole": false
                },
                "type": "ecr"
            },
            "volumesFrom": []
        },
        {
            "ApplyingError": null,
            "Command": [
                "/usr/bin/php",
                "/var/www/website/symfony",
                "mails:profileNotification",
                "--env=stage"
            ],
            "Cpu": 0,
            "EntryPoint": null,
            "Essential": true,
            "Image": "repo.amazonaws.com/php:PR-1884-1-stage-cron",
            "ImageID": "sha256:0255089b72be4ea548854ea22a742b9228abd7d1f64b23b6307df3f8feadd2bb",
            "IsInternal": "NORMAL",
            "KnownExitCode": 0,
            "KnownPortBindings": null,
            "KnownStatus": "STOPPED",
            "Links": null,
            "LogsAuthStrategy": "",
            "Memory": 150,
            "Name": "env-stage-mails_profileNotification",
            "RunDependencies": null,
            "SentStatus": "STOPPED",
            "TransitionDependencySet": {
                "ContainerDependencies": null
            },
            "desiredStatus": "STOPPED",
            "dockerConfig": {
                "config": "{}",
                "hostConfig": "{\"DnsSearch\":[\"int.stage.example.com\"],\"LogConfig\":{\"Type\":\"awslogs\",\"Config\":{\"awslogs-group\":\"env-stage-workers\",\"awslogs-stream\":\"php/env-stage-mails_profileNotification/da8ea8c0-a080-44cd-9a32-6f850b797da3\",\"awslogs-region\":\"eu-central-1\"}},\"MemoryReservation\":157286400,\"CapAdd\":[],\"CapDrop\":[]}",
                "version": "1.21"
            },
            "environment": {
                "PHP_MEMORY_LIMIT": "128M",
                "TZ": "Europe/Paris"
            },
            "metadataFileUpdated": false,
            "mountPoints": [
                {
                    "containerPath": "/var/www/website/web/uploads",
                    "readOnly": false,
                    "sourceVolume": "example-uploads"
                }
            ],
            "overrides": {
                "command": null
            },
            "portMappings": [],
            "registryAuthentication": {
                "ecrAuthData": {
                    "endpointOverride": "",
                    "region": "eu-central-1",
                    "registryId": "123456890123",
                    "useExecutionRole": false
                },
                "type": "ecr"
            },
            "volumesFrom": []
        }
    ],
    "DesiredStatus": "STOPPED",
    "ENI": null,
    "ExecutionStoppedAt": "2018-05-21T13:26:15.969730897Z",
    "Family": "env-stage-cron-1",
    "KnownStatus": "STOPPED",
    "KnownTime": "2018-05-21T13:26:46.556684068Z",
    "MemoryCPULimitsEnabled": true,
    "PullStartedAt": "2018-05-21T13:26:13.842619731Z",
    "PullStoppedAt": "2018-05-21T13:26:14.100596311Z",
    "SentStatus": "STOPPED",
    "StartSequenceNumber": 1746,
    "StopSequenceNumber": 0,
    "Version": "179",
    "executionCredentialsID": "",
    "volumes": [
        {
            "host": {
                "sourcePath": "/var/example-uploads"
            },
            "name": "example-uploads"
        }
    ]
}

Container info for 257e5551a323:

[
    {
        "Id": "257e5551a32366def005dd45979d0f19cdb30f09f6302af19a1c7b29238e0dae",
        "Created": "2018-05-21T13:26:14.087419271Z",
        "Path": "/usr/local/bin/entrypoint.sh",
        "Args": [
            "/usr/bin/php",
            "/var/www/example/symfony",
            "mails:task",
            "--env=stage"
        ],
        "State": {
            "Status": "running",
            "Running": true,
            "Paused": false,
            "Restarting": false,
            "OOMKilled": false,
            "Dead": false,
            "Pid": 31682,
            "ExitCode": 0,
            "Error": "",
            "StartedAt": "2018-05-21T13:26:16.284080656Z",
            "FinishedAt": "0001-01-01T00:00:00Z"
        },
        "Image": "sha256:0255089b72be4ea548854ea22a742b9228abd7d1f64b23b6307df3f8feadd2bb",
        "ResolvConfPath": "/var/lib/docker/containers/257e5551a32366def005dd45979d0f19cdb30f09f6302af19a1c7b29238e0dae/resolv.conf",
        "HostnamePath": "/var/lib/docker/containers/257e5551a32366def005dd45979d0f19cdb30f09f6302af19a1c7b29238e0dae/hostname",
        "HostsPath": "/var/lib/docker/containers/257e5551a32366def005dd45979d0f19cdb30f09f6302af19a1c7b29238e0dae/hosts",
        "LogPath": "",
        "Name": "/ecs-example-stage-cron-1-179-example-stage-mailstask-dea2dccf9294c0bc1c00",
        "RestartCount": 0,
        "Driver": "devicemapper",
        "Platform": "linux",
        "MountLabel": "",
        "ProcessLabel": "",
        "AppArmorProfile": "",
        "ExecIDs": null,
        "HostConfig": {
            "Binds": [
                "/var/example-uploads:/var/www/example/web/uploads"
            ],
            "ContainerIDFile": "",
            "LogConfig": {
                "Type": "awslogs",
                "Config": {
                    "awslogs-group": "example-stage-workers",
                    "awslogs-region": "eu-central-1",
                    "awslogs-stream": "php/example-stage-mails_task/da8ea8c0-a080-44cd-9a32-6f850b797da3"
                }
            },
            "NetworkMode": "default",
            "PortBindings": null,
            "RestartPolicy": {
                "Name": "",
                "MaximumRetryCount": 0
            },
            "AutoRemove": false,
            "VolumeDriver": "",
            "VolumesFrom": null,
            "CapAdd": null,
            "CapDrop": null,
            "Dns": null,
            "DnsOptions": null,
            "DnsSearch": [
                "int.stage.example.com"
            ],
            "ExtraHosts": null,
            "GroupAdd": null,
            "IpcMode": "shareable",
            "Cgroup": "",
            "Links": null,
            "OomScoreAdj": 0,
            "PidMode": "",
            "Privileged": false,
            "PublishAllPorts": false,
            "ReadonlyRootfs": false,
            "SecurityOpt": null,
            "UTSMode": "",
            "UsernsMode": "",
            "ShmSize": 67108864,
            "Runtime": "runc",
            "ConsoleSize": [
                0,
                0
            ],
            "Isolation": "",
            "CpuShares": 2,
            "Memory": 157286400,
            "NanoCpus": 0,
            "CgroupParent": "/ecs/da8ea8c0-a080-44cd-9a32-6f850b797da3",
            "BlkioWeight": 0,
            "BlkioWeightDevice": null,
            "BlkioDeviceReadBps": null,
            "BlkioDeviceWriteBps": null,
            "BlkioDeviceReadIOps": null,
            "BlkioDeviceWriteIOps": null,
            "CpuPeriod": 0,
            "CpuQuota": 0,
            "CpuRealtimePeriod": 0,
            "CpuRealtimeRuntime": 0,
            "CpusetCpus": "",
            "CpusetMems": "",
            "Devices": null,
            "DeviceCgroupRules": null,
            "DiskQuota": 0,
            "KernelMemory": 0,
            "MemoryReservation": 157286400,
            "MemorySwap": 314572800,
            "MemorySwappiness": 0,
            "OomKillDisable": false,
            "PidsLimit": 0,
            "Ulimits": [
                {
                    "Name": "nofile",
                    "Hard": 4096,
                    "Soft": 1024
                }
            ],
            "CpuCount": 0,
            "CpuPercent": 0,
            "IOMaximumIOps": 0,
            "IOMaximumBandwidth": 0
        },
        "GraphDriver": {
            "Data": {
                "DeviceId": "6798",
                "DeviceName": "docker-202:1-263286-a55538d948074e8082593e29d08ea4d2d57bd81fe3efa500d2f2d4a54c080ad8",
                "DeviceSize": "10737418240"
            },
            "Name": "devicemapper"
        },
        "Mounts": [
            {
                "Type": "bind",
                "Source": "/var/example-uploads",
                "Destination": "/var/www/example/web/uploads",
                "Mode": "",
                "RW": true,
                "Propagation": "rprivate"
            },
            {
                "Type": "volume",
                "Name": "ef69e07b38ac24fb9d371db44261cb259f8627b99a0bcbb1976a039c1c31deb1",
                "Source": "/var/lib/docker/volumes/ef69e07b38ac24fb9d371db44261cb259f8627b99a0bcbb1976a039c1c31deb1/_data",
                "Destination": "/var/composer-cache",
                "Driver": "local",
                "Mode": "",
                "RW": true,
                "Propagation": ""
            }
        ],
        "Config": {
            "Hostname": "257e5551a323",
            "Domainname": "",
            "User": "",
            "AttachStdin": false,
            "AttachStdout": false,
            "AttachStderr": false,
            "ExposedPorts": {
                "9000/tcp": {}
            },
            "Tty": false,
            "OpenStdin": false,
            "StdinOnce": false,
            "Env": [
                "PHP_MEMORY_LIMIT=128M",
                "TZ=Europe/Paris",
                "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
            ],
            "Cmd": [
                "/usr/bin/php",
                "/var/www/example/symfony",
                "mails:task",
                "--env=stage"
            ],
            "Image": "124766415242.dkr.ecr.eu-central-1.amazonaws.com/php:PR-1884-1-stage-cron",
            "Volumes": {
                "/var/composer-cache": {},
                "/var/www/example/web/uploads": {}
            },
            "WorkingDir": "/var/www/example",
            "Entrypoint": [
                "/usr/local/bin/entrypoint.sh"
            ],
            "OnBuild": null,
            "Labels": {
                "com.amazonaws.ecs.cluster": "example-stage",
                "com.amazonaws.ecs.container-name": "example-stage-mails_task",
                "com.amazonaws.ecs.task-arn": "arn:aws:ecs:eu-central-1:124766415242:task/da8ea8c0-a080-44cd-9a32-6f850b797da3",
                "com.amazonaws.ecs.task-definition-family": "example-stage-cron-1",
                "com.amazonaws.ecs.task-definition-version": "179"
            }
        },
        "NetworkSettings": {
            "Bridge": "",
            "SandboxID": "c9a25205efbd78c301295f7d950e30468b4b80259c6e99703ec947299fe23612",
            "HairpinMode": false,
            "LinkLocalIPv6Address": "",
            "LinkLocalIPv6PrefixLen": 0,
            "Ports": {
                "9000/tcp": null
            },
            "SandboxKey": "/var/run/docker/netns/c9a25205efbd",
            "SecondaryIPAddresses": null,
            "SecondaryIPv6Addresses": null,
            "EndpointID": "ee47c3b433f27926a17b9c9c4c222f00dfa14a3614c337b71c9f1f3b18c704d1",
            "Gateway": "172.17.0.1",
            "GlobalIPv6Address": "",
            "GlobalIPv6PrefixLen": 0,
            "IPAddress": "172.17.0.7",
            "IPPrefixLen": 16,
            "IPv6Gateway": "",
            "MacAddress": "02:42:ac:11:00:07",
            "Networks": {
                "bridge": {
                    "IPAMConfig": null,
                    "Links": null,
                    "Aliases": null,
                    "NetworkID": "dc279972e1950c8f86f380f434f0eb73663c9ab5f40809db776d361833b95f20",
                    "EndpointID": "ee47c3b433f27926a17b9c9c4c222f00dfa14a3614c337b71c9f1f3b18c704d1",
                    "Gateway": "172.17.0.1",
                    "IPAddress": "172.17.0.7",
                    "IPPrefixLen": 16,
                    "IPv6Gateway": "",
                    "GlobalIPv6Address": "",
                    "GlobalIPv6PrefixLen": 0,
                    "MacAddress": "02:42:ac:11:00:07",
                    "DriverOpts": null
                }
            }
        }
    }
]
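
The mismatch between the two dumps above can be checked mechanically. The following is a minimal sketch, not an official tool: `find_orphans` is a hypothetical helper, and the inline sample data is trimmed from the dumps in this issue. It compares only the task ID after the final slash of the ARN, because the account IDs in the anonymized dumps differ.

```python
def task_id(arn):
    """Return the part of a task ARN after the final slash."""
    return arn.rsplit("/", 1)[-1]


def find_orphans(inspected_containers, ecs_tasks):
    """List containers that Docker reports as running although the
    agent's state for their task says STOPPED (hypothetical helper)."""
    stopped = {task_id(t["Arn"]) for t in ecs_tasks
               if t.get("KnownStatus") == "STOPPED"}
    orphans = []
    for c in inspected_containers:
        labels = (c.get("Config") or {}).get("Labels") or {}
        arn = labels.get("com.amazonaws.ecs.task-arn")
        running = (c.get("State") or {}).get("Running", False)
        if arn and running and task_id(arn) in stopped:
            orphans.append((c["Id"][:12], task_id(arn)))
    return orphans


# Sample data trimmed from the docker inspect output and ECS task data above.
docker_inspect = [{
    "Id": "257e5551a32366def005dd45979d0f19cdb30f09f6302af19a1c7b29238e0dae",
    "State": {"Running": True},
    "Config": {"Labels": {
        "com.amazonaws.ecs.task-arn":
            "arn:aws:ecs:eu-central-1:124766415242:task/da8ea8c0-a080-44cd-9a32-6f850b797da3",
    }},
}]
ecs_tasks = [{
    "Arn": "arn:aws:ecs:eu-central-1:123456890123:task/da8ea8c0-a080-44cd-9a32-6f850b797da3",
    "KnownStatus": "STOPPED",
}]

print(find_orphans(docker_inspect, ecs_tasks))
# prints [('257e5551a323', 'da8ea8c0-a080-44cd-9a32-6f850b797da3')]
```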

@gileri changed the title from "Containers are not stopped" to "Some containers are not stopped during service update" on May 21, 2018

@richardpen

@gileri I tried to reproduce this with a service that has a desired count of 1, minimumHealthyPercent=50 and maximumPercent=100, but could not. Could you share the service name, its maximumPercent and minimumHealthyPercent values, and the agent logs from an instance that experienced the issue? You can send them to me: penyin (at) amazon.com

Thanks,
Peng

@gileri
Author

gileri commented May 22, 2018

Sorry, I had formatted the code blocks incorrectly; they are now fixed and include:

Minimum healthy percent = 0%
Maximum percent = 100%
Desired count = 1

I removed sensitive information from the ecs-agent logs, which can be downloaded here.

The service name is example-stage-cron-1.

@haikuoliu
Contributor

Hi @gileri,

Sorry for my late response.

I investigated the logs and found the root cause: the task is started and then stopped immediately. The Agent marks the container as stopped, and then Docker sends the Agent a container change event indicating that the container is running. The Agent is supposed to stop the container again in this case, which is handled by this goroutine. However, for some reason it only handles this once.
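
To illustrate the failure mode, here is a toy model in Python. It is only a sketch under stated assumptions: it does not reproduce the Agent's Go code or the exact event order in the logs, and `reconcile` and its `handle_only_once` flag are hypothetical names invented for this example. It contrasts a handler that reacts to a late "running" event only once with one that reacts every time.

```python
def reconcile(desired_status, docker_events, handle_only_once):
    """Replay Docker state-change events and return (stop_calls, final_state).

    Toy model: whenever the desired status is STOPPED but Docker reports
    RUNNING, a stop is issued and assumed to succeed.
    """
    stop_calls = 0
    state = "STOPPED"
    handled = False
    for event in docker_events:
        state = event
        if desired_status == "STOPPED" and state == "RUNNING":
            if handle_only_once and handled:
                continue  # one-shot handler: later RUNNING events are ignored
            stop_calls += 1
            handled = True
            state = "STOPPED"  # assumption: the stop call succeeds
    return stop_calls, state


# Assumed event sequence: Docker reports RUNNING twice after the stop request.
events = ["RUNNING", "RUNNING"]
print(reconcile("STOPPED", events, handle_only_once=True))   # prints (1, 'RUNNING')
print(reconcile("STOPPED", events, handle_only_once=False))  # prints (2, 'STOPPED')
```

In the one-shot case the container ends up running while the desired status stays STOPPED, which matches the symptom reported in this issue.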

I will mark it as a bug.

Thanks,
Haikuo

@gileri
Author

gileri commented May 31, 2018

Thank you @haikuoliu for the analysis! I'm not familiar with Go or the ECS code, but I can certainly provide additional debug logs or run tests.

@haikuoliu
Contributor

@gileri

I think the logs you provided are enough. The bug seems clear from them, and we will let you know when we fix it.

I saw from the logs that the containers in your task get stopped very soon after they start, which is what triggers this bug. Avoiding that situation, if you can, would be a mitigation.

Thanks for bringing this to our attention!

@adnxn
Contributor

adnxn commented Aug 7, 2018

Closing this issue; the fix is included in the latest release.

@adnxn closed this as completed on Aug 7, 2018