Use upstream ecs-agent types for deserializing API responses #75

isker · 2024-10-05T20:43:12Z

This project rolls its own types for deserializing ECS task metadata and container stats responses. Maintaining these types is tedious because the documentation of these API endpoints is underspecified: it generally lists the properties included in API responses, but not always their precise types (including whether a property is optional). If we roll our own types, we are stuck reverse-engineering the specifics of the ECS API responses, which is made more tedious by the fact that they can differ between EC2 and Fargate.

Good news: rolling our own types is not necessary. The ECS Agent is open source and written in Go. We can depend on it as a library, purely to get at the structs that it uses for API responses. We were already doing a similar thing for docker stats; this is just more comprehensive. (The ECS Agent project itself depends on the same docker library we were using for these stats, and includes them in its API responses.)

Switching to these ECS Agent types has revealed cases in which our types were incorrect and silently relied on implicit type conversions of JSON values in Go's JSON deserializer. One of these cases resulted in invalid data being served: ECS tasks on EC2 need not specify task-level resource limits at all, so these properties are optional in the JSON response. Our types for them were not pointers, so we incorrectly reported the derived metrics as zero when they should not exist at all.
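To illustrate the pointer issue described above, here is a minimal sketch. The struct names and the `decodeBoth` helper are hypothetical and heavily simplified; they are not the exporter's old types or the ECS Agent's actual structs, and the JSON body is an invented example rather than a recorded response. It only shows how `encoding/json` treats a missing property for value fields versus pointer fields.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// valueLimits is a hypothetical hand-rolled type: plain values cannot
// represent "absent", so a missing property decodes to zero.
type valueLimits struct {
	CPU    float64 `json:"CPU"`
	Memory int64   `json:"Memory"`
}

// pointerLimits is a hypothetical pointer-based counterpart: nil means
// the property was not present in the response.
type pointerLimits struct {
	CPU    *float64 `json:"CPU,omitempty"`
	Memory *int64   `json:"Memory,omitempty"`
}

type taskWithValues struct {
	Limits valueLimits `json:"Limits"`
}

type taskWithPointers struct {
	Limits *pointerLimits `json:"Limits,omitempty"`
}

// decodeBoth unmarshals the same body into both shapes so the results
// can be compared side by side.
func decodeBoth(body string) (taskWithValues, taskWithPointers, error) {
	var v taskWithValues
	var p taskWithPointers
	if err := json.Unmarshal([]byte(body), &v); err != nil {
		return v, p, err
	}
	err := json.Unmarshal([]byte(body), &p)
	return v, p, err
}

func main() {
	// An illustrative response body with no task-level limits.
	v, p, err := decodeBoth(`{"Family": "example"}`)
	if err != nil {
		panic(err)
	}
	// The value type cannot distinguish "no limit" from "limit of zero".
	fmt.Printf("value type memory limit: %d\n", v.Limits.Memory)
	// The pointer type can, so the derived metric can be omitted.
	if p.Limits == nil || p.Limits.Memory == nil {
		fmt.Println("pointer type: no memory limit, skip the metric")
	}
}
```

With the value-based shape, a collector has no way to tell that the limit was never set, which is how a zero can end up being exported. With the pointer-based shape, a nil check is enough to decide not to emit the metric.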

The downsides of doing this:

- The ECS Agent project has no detectable Go module version. I am not an expert on this, but I think it's related to a single repo containing multiple Go modules. I don't think this is a big deal, as the existing docker dependency already did not have a Go module version.
- We have to upgrade to Go 1.21, as the ECS Agent project declares that it requires it in its go.mod.
- Binary size grows, about 12MB -> 18MB on my laptop. I am surprised that simply switching to use these structs blew up binary size so much, but I don't think binary size is taken very seriously in the Go ecosystem anyway, so I don't think this is a big problem.

I think these downsides are worth it in that, going forward, we can more reliably develop metrics derived from ECS Agent API responses, because we're using types that are much more likely to be correct.

I plan to add more metrics, and improve existing ones, using these types in the future.

I have validated these changes by recording output on EC2 and Fargate before and after this change here:
https://github.com/isker/ecs-exporter-cdk/tree/master/experiments/use-official-types. The resulting diffs are as expected.

.circleci/config.yml

SuperQ

Great! I'm happy that there is a better, officially supported, way of parsing the responses/types.

A minor nit on the Go version changes.

.github/workflows/golangci-lint.yml

go.mod

Signed-off-by: Ian Kerins <git@isk.haus>

isker · 2024-10-12T08:09:05Z

Hmm, not sure what to do about that lint failure.

SuperQ · 2024-10-12T09:00:36Z

I'm going to ignore the lint error, since this will get fixed with the next sync of the lint config.

* [CHANGE] Use upstream ecs-agent types for deserializing API responses #75
* [CHANGE] Update exporter boilerplate #77
* [ENHANCEMENT] Add additional metrics #53

Signed-off-by: SuperQ <superq@gmail.com>

isker force-pushed the use-official-types branch 2 times, most recently from 6e72692 to 6c70c24, on October 5, 2024 20:44

isker mentioned this pull request Oct 7, 2024

Explanation of ecs_memory_limit_bytes #35

Closed

SuperQ reviewed Oct 11, 2024

SuperQ requested changes Oct 11, 2024

isker force-pushed the use-official-types branch from 6c70c24 to 1cbfbab on October 11, 2024 22:02

isker force-pushed the use-official-types branch from 1cbfbab to 1b6f039 on October 11, 2024 22:04

SuperQ approved these changes Oct 12, 2024

SuperQ merged commit cfe0be6 into prometheus-community:main Oct 12, 2024
3 of 4 checks passed

SuperQ mentioned this pull request Oct 13, 2024

Release v0.3.0 #79

Merged

isker deleted the use-official-types branch October 17, 2024 11:57
