feat: optimizes file copies to and from containers #2450
Signed-off-by: Adrian Cole <adrian@tetrate.io>
I think the failure is a flake?
Hey @codefromthecrypt, I've done a super quick and dirty benchmark for this improvement, and I'm sharing the results here. I'm not doing this to demonstrate anything against the PR, but to learn from the process myself, as I've been thinking about benchmarks more and more recently. So please take this as a learning exercise; I'd love to receive feedback if possible. Here I go!
Env
Code
I created a benchmark:
import (
	"context"
	"testing"
	"time"

	"github.com/testcontainers/testcontainers-go"
	"github.com/testcontainers/testcontainers-go/modules/k3s"
)

func BenchmarkLoadImages(b *testing.B) {
	// Give up to three minutes to run this test
	ctx, cancel := context.WithTimeout(context.Background(), 3*time.Minute)
	defer cancel()

	k3sContainer, err := k3s.RunContainer(ctx,
		testcontainers.WithImage("docker.io/rancher/k3s:v1.27.1-k3s1"),
	)
	if err != nil {
		b.Fatal(err)
	}

	// Clean up the container
	defer func() {
		if err := k3sContainer.Terminate(ctx); err != nil {
			b.Fatal(err)
		}
	}()

	provider, err := testcontainers.ProviderDocker.GetProvider()
	if err != nil {
		b.Fatal(err)
	}

	// Ensure the nginx image is available locally
	err = provider.PullImage(ctx, "nginx")
	if err != nil {
		b.Fatal(err)
	}

	b.ResetTimer() // Reset the benchmark timer

	b.Run("Old copy method", func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			ctx, cancel := context.WithTimeout(context.Background(), 3*time.Minute)
			err := k3sContainer.LoadImagesOld(ctx, "nginx")
			cancel() // release the context now; a defer inside the loop would pile up until the function returns
			if err != nil {
				b.Fatal(err)
			}
		}
	})

	b.Run("New copy method", func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			ctx, cancel := context.WithTimeout(context.Background(), 3*time.Minute)
			err := k3sContainer.LoadImages(ctx, "nginx")
			cancel() // release the context now; a defer inside the loop would pile up until the function returns
			if err != nil {
				b.Fatal(err)
			}
		}
	})
}
Benchmark execution
Run benchmarks 5 times, including memory profile (bytes and allocations per operation):
Benchmarks results
Results
With the above numbers, it seems clear that bytes per operation are much lower with the new method (from the 460s to the 270s). The other two values, ns and allocations per operation, seem more or less the same: the code is not much faster, nor does it produce fewer allocations, but it uses less memory.
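For readers unfamiliar with the three metrics compared above, here is a minimal, self-contained sketch of how Go reports them. It is not the benchmark from this thread: `copyBuffered`, `copyStreamed`, `measure`, and the 1 MiB payload are hypothetical stand-ins, chosen only to show a case where bytes per operation drop while the copy itself stays simple.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"testing"
)

// payload stands in for a file being copied; 1 MiB is an arbitrary size.
var payload = bytes.Repeat([]byte("x"), 1<<20)

// copyBuffered (hypothetical) reads the whole source into memory first.
func copyBuffered(dst io.Writer, src io.Reader) error {
	data, err := io.ReadAll(src)
	if err != nil {
		return err
	}
	_, err = dst.Write(data)
	return err
}

// copyStreamed (hypothetical) lets io.Copy move the data without
// collecting it into one large slice.
func copyStreamed(dst io.Writer, src io.Reader) error {
	_, err := io.Copy(dst, src)
	return err
}

// measure runs fn under testing.Benchmark, which works outside "go test".
func measure(fn func(io.Writer, io.Reader) error) testing.BenchmarkResult {
	return testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			if err := fn(io.Discard, bytes.NewReader(payload)); err != nil {
				b.Fatal(err)
			}
		}
	})
}

func main() {
	cases := []struct {
		name string
		fn   func(io.Writer, io.Reader) error
	}{
		{"buffered", copyBuffered},
		{"streamed", copyStreamed},
	}
	for _, c := range cases {
		r := measure(c.fn)
		fmt.Printf("%s: %d ns/op, %d B/op, %d allocs/op\n",
			c.name, r.NsPerOp(), r.AllocedBytesPerOp(), r.AllocsPerOp())
	}
}
```

The absolute numbers depend on the machine; only the relative difference in B/op between the two variants is the point of the sketch.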
// In Go 1.22 os.File is always an io.WriterTo. However, testcontainers
// currently allows Go 1.21, so we need to trick the compiler a little.
We test against both versions of the language, although we always develop in the lowest one. It could be interesting to work the other way around: always develop in the latest release, and run the tests for both (current and current - 1). Thoughts?
To me, I think we should test at least 2 versions of Go anyway, and usually things like this don't come up too often.
In some projects I tend to develop on the latest version and test against the floor version, just because a lot of devs always use latest first.
LGTM, thanks!
Thanks for sharing the results @mdelapenya, and also for merging! I'm not too surprised by the benchmarks at the moment, as most of the potential is only implemented on Linux right now. It could be neat to run the same benchmark in a Linux container to see how much difference it makes where the Go side is optimized, but that's not required on my side. Cheers!
What does this PR do?
This changes the code that copies files to and from the container to use optimized functions. Doing so reduces buffering and gives Go a chance to use 1.22's optimized paths on Linux.
Why is it important?
I'm using rather large images in k3s. For example, nodejs images easily exceed a GB each. I found this copy logic accounts for the majority of test fixture setup; in our case, sometimes over a minute is spent here even when images are available locally.
Related issues
Originally added in #347
How to test this PR
You can use k3s and its LoadImages function, which will hit all the paths here.