Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix the task status and the container KnownStatus when ECS_PULL_DEPENDENT_CONTAINERS_UPFRONT is enabled #2800

Merged
merged 1 commit into from
Feb 2, 2021
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
34 changes: 33 additions & 1 deletion agent/engine/docker_task_engine.go
Original file line number Diff line number Diff line change
Expand Up @@ -840,6 +840,13 @@ func (engine *DockerTaskEngine) pullContainer(task *apitask.Task, container *api

}

// No pull image is required, the cached image will be used.
// Add the container that uses the cached image to the pulled container state.
dockerContainer := &apicontainer.DockerContainer{
Container: container,
}
engine.state.AddPulledContainer(dockerContainer, task)

// No pull image is required, just update container reference and use cached image.
engine.updateContainerReference(false, container, task.Arn)
// Return the metadata without any error
Expand Down Expand Up @@ -958,12 +965,37 @@ func (engine *DockerTaskEngine) pullAndUpdateContainerReference(task *apitask.Ta
return metadata
}
pullSucceeded := metadata.Error == nil
if pullSucceeded {
findCachedImage := false
if !pullSucceeded {
// If Agent failed to pull an image when
// 1. DependentContainersPullUpfront is enabled
// 2. ImagePullBehavior is not set to always
// search the image in local cached images
if engine.cfg.DependentContainersPullUpfront.Enabled() && engine.cfg.ImagePullBehavior != config.ImagePullAlwaysBehavior {
if _, err := engine.client.InspectImage(container.Image); err != nil {
seelog.Errorf("Task engine [%s]: failed to find cached image %s for container %s",
task.Arn, container.Image, container.Name)
// Stop the task if the container is an essential container,
// and the image is not available in both remote and local caches
if container.IsEssential() {
task.SetDesiredStatus(apitaskstatus.TaskStopped)
engine.emitTaskEvent(task, fmt.Sprintf("%s: %s", metadata.Error.ErrorName(), metadata.Error.Error()))
}
return dockerapi.DockerContainerMetadata{Error: metadata.Error}
}
seelog.Infof("Task engine [%s]: found cached image %s, use it directly for container %s",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so if engine.client.InspectImage(container.Image) doesn't return an error we can be sure the image exists?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I tracing back with the function call, I find function InspectImage, defined in agent/dockerclient/dockerapi/docker_client.go, returns image data received from the client.ImageInspectWithRaw() call.

func (dg *dockerGoClient) InspectImage(image string) (*types.ImageInspect, error) {
	defer metrics.MetricsEngineGlobal.RecordDockerMetric("INSPECT_IMAGE")()
	client, err := dg.sdkDockerClient()
	if err != nil {
		return nil, err
	}
	imageData, _, err := client.ImageInspectWithRaw(dg.context, image)
	return &imageData, err
}

The ImageInspectWithRaws, defined in agent/vendor/github.com/docker/docker/client/image_inspect.go, returns low-level information about an image based on the given image name or ID. If the image cannot be found through docker engine api, 404 error response will be returned. Details can be found here.

// ImageInspectWithRaw returns the image information and its raw representation.
func (cli *Client) ImageInspectWithRaw(ctx context.Context, imageID string) (types.ImageInspect, []byte, error) {
	if imageID == "" {
		return types.ImageInspect{}, nil, objectNotFoundError{object: "image", id: imageID}
	}
	serverResp, err := cli.get(ctx, "/images/"+imageID+"/json", nil, nil)
	if err != nil {
		return types.ImageInspect{}, nil, wrapResponseError(err, serverResp, "image", imageID)
	}
	defer ensureReaderClosed(serverResp)

	body, err := ioutil.ReadAll(serverResp.body)
	if err != nil {
		return types.ImageInspect{}, nil, err
	}

	var response types.ImageInspect
	rdr := bytes.NewReader(body)
	err = json.NewDecoder(rdr).Decode(&response)
	return response, body, err
}

We have implemented this to verify whether the image exists or not in func imagePullRequired when ImagePullBehavior is prefer-cached.

func (engine *DockerTaskEngine) imagePullRequired(imagePullBehavior config.ImagePullBehaviorType,
	container *apicontainer.Container,
	taskArn string) bool {
	switch imagePullBehavior {
	case config.ImagePullOnceBehavior:
                  ...
	case config.ImagePullPreferCachedBehavior:
		// If the behavior is prefer cached, don't pull if we found cached image
		// by inspecting the image.
		_, err := engine.client.InspectImage(container.Image)
		if err != nil {
			return true
		}
		seelog.Infof("Task engine [%s]: found cached image %s, use it directly for container %s",
			taskArn, container.Image, container.Name)
		return false
	default:
		// Need to pull the image for always and default agent pull behavior
		return true
	}
}

Based on these info, I think it is ok to verify cached images through engine.client.InspectImage(container.Image). Please let me know if anything is still missing here. Thank you!

task.Arn, container.Image, container.Name)
findCachedImage = true
}
}

if pullSucceeded || findCachedImage {
dockerContainer := &apicontainer.DockerContainer{
Container: container,
}
engine.state.AddPulledContainer(dockerContainer, task)
}

engine.updateContainerReference(pullSucceeded, container, task.Arn)
return metadata
}
Expand Down
153 changes: 122 additions & 31 deletions agent/engine/docker_task_engine_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -33,6 +33,7 @@ import (
apicontainer "github.com/aws/amazon-ecs-agent/agent/api/container"
apicontainerstatus "github.com/aws/amazon-ecs-agent/agent/api/container/status"
apieni "github.com/aws/amazon-ecs-agent/agent/api/eni"
apierrors "github.com/aws/amazon-ecs-agent/agent/api/errors"
apitask "github.com/aws/amazon-ecs-agent/agent/api/task"
apitaskstatus "github.com/aws/amazon-ecs-agent/agent/api/task/status"
"github.com/aws/amazon-ecs-agent/agent/asm"
Expand Down Expand Up @@ -1685,41 +1686,131 @@ func TestUpdateContainerReference(t *testing.T) {
}

// TestPullAndUpdateContainerReference checks whether a container is added to task engine state when
// pullSucceeded and DependentContainersPullUpfront is enabled.
// Test # | Image availability | DependentContainersPullUpfront | ImagePullBehavior
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Although I think we removed the UTs for when DependentContainersPullUpfront is NOT enabled?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DependentContainersPullUpfrontDisabledWithRemoteImage is added to TestPullAndUpdateContainerReference. Thank you!

// -----------------------------------------------------------------------------------
// 1 | remote | enabled | default
// 2 | remote | disabled | default
// 3 | local | enabled | default
// 4 | local | enabled | once
// 5 | local | enabled | prefer-cached
// 6 | local | enabled | always
func TestPullAndUpdateContainerReference(t *testing.T) {
ctx, cancel := context.WithCancel(context.TODO())
defer cancel()
cfg := &config.Config{
DependentContainersPullUpfront: config.BooleanDefaultFalse{Value: config.ExplicitlyEnabled},
testcases := []struct {
Name string
ImagePullUpfront config.BooleanDefaultFalse
ImagePullBehavior config.ImagePullBehaviorType
ImageState *image.ImageState
ImageInspect *types.ImageInspect
InspectImage bool
NumOfPulledContainer int
PullImageErr apierrors.NamedError
}{
{
Name: "DependentContainersPullUpfrontEnabledWithRemoteImage",
ImagePullUpfront: config.BooleanDefaultFalse{Value: config.ExplicitlyEnabled},
ImagePullBehavior: config.ImagePullDefaultBehavior,
ImageState: &image.ImageState{
Image: &image.Image{ImageID: "id"},
},
InspectImage: false,
NumOfPulledContainer: 1,
PullImageErr: nil,
},
{
Name: "DependentContainersPullUpfrontDisabledWithRemoteImage",
ImagePullUpfront: config.BooleanDefaultFalse{Value: config.ExplicitlyDisabled},
ImagePullBehavior: config.ImagePullDefaultBehavior,
ImageState: &image.ImageState{
Image: &image.Image{ImageID: "id"},
},
InspectImage: false,
NumOfPulledContainer: 1,
PullImageErr: nil,
},
{
Name: "DependentContainersPullUpfrontEnabledWithCachedImage",
ImagePullUpfront: config.BooleanDefaultFalse{Value: config.ExplicitlyEnabled},
ImagePullBehavior: config.ImagePullDefaultBehavior,
ImageState: nil,
ImageInspect: nil,
InspectImage: true,
NumOfPulledContainer: 1,
PullImageErr: dockerapi.CannotPullContainerError{fmt.Errorf("error")},
},
{
Name: "DependentContainersPullUpfrontEnabledAndImagePullOnceBehavior",
ImagePullUpfront: config.BooleanDefaultFalse{Value: config.ExplicitlyEnabled},
ImagePullBehavior: config.ImagePullOnceBehavior,
ImageState: nil,
ImageInspect: nil,
InspectImage: true,
NumOfPulledContainer: 1,
PullImageErr: dockerapi.CannotPullContainerError{fmt.Errorf("error")},
},
{
Name: "DependentContainersPullUpfrontEnabledAndImagePullPreferCachedBehavior",
ImagePullUpfront: config.BooleanDefaultFalse{Value: config.ExplicitlyEnabled},
ImagePullBehavior: config.ImagePullPreferCachedBehavior,
ImageState: nil,
ImageInspect: nil,
InspectImage: true,
NumOfPulledContainer: 1,
PullImageErr: dockerapi.CannotPullContainerError{fmt.Errorf("error")},
},
{
Name: "DependentContainersPullUpfrontEnabledAndImagePullAlwaysBehavior",
ImagePullUpfront: config.BooleanDefaultFalse{Value: config.ExplicitlyEnabled},
ImagePullBehavior: config.ImagePullAlwaysBehavior,
ImageState: nil,
ImageInspect: nil,
InspectImage: false,
NumOfPulledContainer: 0,
PullImageErr: dockerapi.CannotPullContainerError{fmt.Errorf("error")},
},
}
ctrl, client, _, privateTaskEngine, _, imageManager, _ := mocks(t, ctx, cfg)
defer ctrl.Finish()

taskEngine, _ := privateTaskEngine.(*DockerTaskEngine)
taskEngine._time = nil
imageName := "image"
taskArn := "taskArn"
container := &apicontainer.Container{
Type: apicontainer.ContainerNormal,
Image: imageName,
}
task := &apitask.Task{
Arn: taskArn,
Containers: []*apicontainer.Container{container},
}
imageState := &image.ImageState{
Image: &image.Image{ImageID: "id"},
}
for _, tc := range testcases {
t.Run(tc.Name, func(t *testing.T) {
ctx, cancel := context.WithCancel(context.TODO())
defer cancel()
cfg := &config.Config{
DependentContainersPullUpfront: tc.ImagePullUpfront,
ImagePullBehavior: tc.ImagePullBehavior,
}
ctrl, client, _, privateTaskEngine, _, imageManager, _ := mocks(t, ctx, cfg)
defer ctrl.Finish()

client.EXPECT().PullImage(gomock.Any(), imageName, gomock.Any(), gomock.Any())
imageManager.EXPECT().RecordContainerReference(container)
imageManager.EXPECT().GetImageStateFromImageName(imageName).Return(imageState, true)
metadata := taskEngine.pullAndUpdateContainerReference(task, container)
pulledContainersMap, ok := taskEngine.State().PulledContainerMapByArn(taskArn)
require.True(t, ok, "no container found in the agent state")
require.Len(t, pulledContainersMap, 1)
assert.True(t, imageState.PullSucceeded, "PullSucceeded set to false")
assert.Equal(t, dockerapi.DockerContainerMetadata{}, metadata, "expected empty metadata")
taskEngine, _ := privateTaskEngine.(*DockerTaskEngine)
taskEngine._time = nil
imageName := "image"
taskArn := "taskArn"
container := &apicontainer.Container{
Type: apicontainer.ContainerNormal,
Image: imageName,
Essential: true,
}

task := &apitask.Task{
Arn: taskArn,
Containers: []*apicontainer.Container{container},
}

client.EXPECT().PullImage(gomock.Any(), imageName, nil, gomock.Any()).
Return(dockerapi.DockerContainerMetadata{Error: tc.PullImageErr})

if tc.InspectImage {
client.EXPECT().InspectImage(imageName).Return(tc.ImageInspect, nil)
}

imageManager.EXPECT().RecordContainerReference(container)
imageManager.EXPECT().GetImageStateFromImageName(imageName).Return(tc.ImageState, false)
metadata := taskEngine.pullAndUpdateContainerReference(task, container)
pulledContainersMap, _ := taskEngine.State().PulledContainerMapByArn(taskArn)
require.Len(t, pulledContainersMap, tc.NumOfPulledContainer)
assert.Equal(t, dockerapi.DockerContainerMetadata{Error: tc.PullImageErr},
metadata, "expected metadata with error")
})
}
}

// TestMetadataFileUpdatedAgentRestart checks whether metadataManager.Update(...) is
Expand Down