Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use the pg_isready command to asses whether PostgreSQL is ready or not #1093

Closed
wants to merge 4 commits into from

Conversation

0xced
Copy link
Contributor

@0xced 0xced commented Jan 22, 2024

What does this PR do?

This pull request uses the builtin pg_isready command instead of searching the logs for database system is ready to accept connections in order to determine whether the PostgreSQL database is ready or not.

Why is it important?

Under some circumstances (described in #1092), starting a PostgreSqlContainer might hang forever.

Related issues

Closes #1092

How to test this PR

A new test (Testcontainers.PostgreSql.PostgreSqlContainerTest.StopAndStartMultipleTimes) was added to demonstrates how the hang might happen and how using pg_isready instead addresses this issue.

Copy link

netlify bot commented Jan 22, 2024

Deploy Preview for testcontainers-dotnet ready!

Name Link
🔨 Latest commit 1f03c41
🔍 Latest deploy log https://app.netlify.com/sites/testcontainers-dotnet/deploys/65bfec0e54c4e10008cef473
😎 Deploy Preview https://deploy-preview-1093--testcontainers-dotnet.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

0xced added 2 commits January 28, 2024 10:39
This demonstrates that the wait strategy that counts the occurrences of "database system is ready to accept connections" in the logs does not work reliably.

This issue becomes more likely to happen with the recently introduced reuse feature.
.Concat(stdout.Split(LineEndings, StringSplitOptions.RemoveEmptyEntries))
.Concat(stderr.Split(LineEndings, StringSplitOptions.RemoveEmptyEntries))
.Count(line => line.Contains("database system is ready to accept connections")));
return ((PostgreSqlContainer)container).IsReadyAsync();
}
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we please follow the pattern we are already using in other moduls (e.g. MySQL 1, 2)? I prefer a consistent approach; it is easier to navigate and refactor the code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Absolutely, I just addressed this in bf08a0e.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added .ConfigureAwait(false) and force-pushed so that's now commit cf0fdf1.

Comment on lines +48 to +58
public async Task StopAndStartMultipleTimes()
{
// Given
var timeoutSource = new CancellationTokenSource(TimeSpan.FromSeconds(60));

// When
var exception = await RestartAsync(timeoutSource.Token);

// Then
Assert.Null(exception);
}
Copy link
Collaborator

@HofmeisterAn HofmeisterAn Feb 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While this will work for PostgreSQL, it won't work for other modules, specifically for those who use log messages to indicate readiness. I was thinking about a more general approach to utilize the existing module tests. What do you think about the following proposal? We can then update the module's tests over time.

namespace DotNet.Testcontainers.Commons;

public class SharedContainerInstance<TContainer> : IAsyncLifetime
    where TContainer : IContainer
{
    public SharedContainerInstance(TContainer container)
    {
        Container = container;
    }

    public TContainer Container { get; }

    public Task InitializeAsync()
    {
        return Task.CompletedTask;
    }

    public Task DisposeAsync()
    {
        return Container.DisposeAsync().AsTask();
    }
}
diff --git a/tests/Testcontainers.PostgreSql.Tests/PostgreSqlContainerTest.cs b/tests/Testcontainers.PostgreSql.Tests/PostgreSqlContainerTest.cs
index 5433e8c..a966ca9 100644
--- a/tests/Testcontainers.PostgreSql.Tests/PostgreSqlContainerTest.cs
+++ b/tests/Testcontainers.PostgreSql.Tests/PostgreSqlContainerTest.cs
@@ -1,17 +1,22 @@
 namespace Testcontainers.PostgreSql;
 
-public sealed class PostgreSqlContainerTest : IAsyncLifetime
+public sealed class PostgreSqlContainerTest : IClassFixture<PostgreSqlContainerTest.SharedContainerInstance>, IAsyncLifetime
 {
-    private readonly PostgreSqlContainer _postgreSqlContainer = new PostgreSqlBuilder().Build();
+    private readonly PostgreSqlContainerTest.SharedContainerInstance _sharedContainerInstance;
+
+    public PostgreSqlContainerTest(PostgreSqlContainerTest.SharedContainerInstance sharedContainerInstance)
+    {
+        _sharedContainerInstance = sharedContainerInstance;
+    }
 
     public Task InitializeAsync()
     {
-        return _postgreSqlContainer.StartAsync();
+        return _sharedContainerInstance.Container.StartAsync();
     }
 
     public Task DisposeAsync()
     {
-        return _postgreSqlContainer.DisposeAsync().AsTask();
+        return _sharedContainerInstance.Container.StopAsync();
     }
 
     [Fact]
@@ -19,7 +24,7 @@ public sealed class PostgreSqlContainerTest : IAsyncLifetime
     public void ConnectionStateReturnsOpen()
     {
         // Given
-        using DbConnection connection = new NpgsqlConnection(_postgreSqlContainer.GetConnectionString());
+        using DbConnection connection = new NpgsqlConnection(_sharedContainerInstance.Container.GetConnectionString());
 
         // When
         connection.Open();
@@ -36,10 +41,19 @@ public sealed class PostgreSqlContainerTest : IAsyncLifetime
         const string scriptContent = "SELECT 1;";
 
         // When
-        var execResult = await _postgreSqlContainer.ExecScriptAsync(scriptContent)
+        var execResult = await _sharedContainerInstance.Container.ExecScriptAsync(scriptContent)
             .ConfigureAwait(true);
 
         // When
         Assert.True(0L.Equals(execResult.ExitCode), execResult.Stderr);
     }
+
+    [UsedImplicitly]
+    public sealed class SharedContainerInstance : SharedContainerInstance<PostgreSqlContainer>
+    {
+        public SharedContainerInstance()
+            : base(new PostgreSqlBuilder().Build())
+        {
+        }
+    }
 }
\ No newline at end of file

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While this will work for PostgreSQL, it won't work for other modules, specifically for those who use log messages to indicate readiness.

I think we are not on the same page on this topic. The purpose of the StopAndStartMultipleTimes that was added for the PostgreSQL module was in my mind a way to prove that the log-based wait strategy was incorrect, in test driven development style.

  1. Run the StopAndStartMultipleTimes test
  2. ❌ Test does not pass (times out after one minute)
  3. Fix the wait strategy by using pg_isready instead of reading logs
  4. Run the StopAndStartMultipleTimes test again
  5. ✅ The test now passes indicating that the pg_isready strategy implementation is correct

If I understand your intention correctly, you'd like to introduce a way to start/stop containers before/after each tests. I think that could be a good addition but should probably be handled in a separate pull request.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I understand your intention. I think we are talking about the same thing; my approach ensures that the service running inside the container is ready too.

If the container indicates readiness after a stop, that does not necessarily mean that the service running inside the container is ready. Consider log message wait strategies, for example. If we simply check for "log message contains x", this may indicate readiness too early, because the message already exists from a previous start.

My intention is to ensure that not only the wait strategy passes but also that the service running inside the container is ready from the user's perspective, utilizing our existing tests.

If we stop and start the container in between, and the second test fails or hangs, we now know the wait strategy is broken. I agree; your suggestion is more explicit but lacks real readiness.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, we are indeed on the same page! And going a bit further to ensure that the service is actually running is a good idea. 👍

I see you already started working on it in the feature/set-created-started-stopped-container-timestamp-pass-wait-strategy branch. I was hoping to be able to use ContainerFixture from my Testcontainers.Xunit branch which serves the exact same purpose as SharedContainerInstance. But that branches depends on #1100.

I'm still convinced that a dedicated Testcontainers.Xunit NuGet package that could be used both from inside Testcontainers test projects and for consumer integration tests is a great idea. 😉

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was hoping to be able to use ContainerFixture from my Testcontainers.Xunit branch which serves the exact same purpose as SharedContainerInstance. But that branches depends on #1100.

The PR does not contain the Xunit project, right? I think it is a good idea! A lot of developers will benefit from it. I have not taken a close look into the PR that involves the changes to the logging implementation yet. I took a quick look, and it is quite big. IIRC, I proposed to split it into two steps: one that provides an internal WithLogger(ILogger) builder method, and another that makes the necessary changes regarding logging the container runtime information and making the method public. After that, we can add the dedicated Xunit, package. WDYT? Splitting it makes reviewing much easier.

I'm still convinced that a dedicated Testcontainers.Xunit NuGet package that could be used both from inside Testcontainers test projects and for consumer integration tests is a great idea. 😉

BTW, did you notice the changes I made in the mentioned branch above to the PostgreSQL's wait strategy? Through the issue and the referred GitHub discussion, I noticed that PostgreSQL logs the interfaces to which it is listening. Maybe we do not need to invoke pg_isready.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll reply in the comments of #1100 for all the ILogger related stuff.

BTW, did you notice the changes I made in the mentioned branch above to the PostgreSQL's wait strategy? Through the issue and the referred GitHub discussion, I noticed that PostgreSQL logs the interfaces to which it is listening. Maybe we do not need to invoke pg_isready.

I've seen you made some changes about the strategy. I still think that leveraging pg_isready is the right thing to do. I never liked making decisions based on reading logs of another program. Those are just logs and — although unlikely — could change in the future. The whole purpose of the pg_isready command line tool is to test whether a PostgreSQL database is ready and that is pretty much guaranteed not to break in future versions. From an engineering standpoint it feels much better to me. Also, it's very close to the MySQL / MariaDb strategies.

Copy link
Collaborator

@HofmeisterAn HofmeisterAn Feb 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. Logs are always a bit tricky, especially across different versions. As long as pg_isready is reliable, I am fine. Did you check other PostgreSQL images (versions) as well (I guess they contain pg_isready too)? Let's merge develop and complete this pull request (after #1110 if you do not mind).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point about older versions. The pg_isready command was introduced in PostgreSQL 9.3 so I will revisit this PR once #1110 is merged. I will test with PostgreSQL 8 (which is the oldest image available on Docker Hub) and see what happens.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess if versions 14, 15, and 16 support it, it is good enough. Developers can still override the wait strategy or use the generic builder if an older version is really necessary.

@0xced
Copy link
Contributor Author

0xced commented Feb 10, 2024

Superseded by #1111.

@0xced 0xced closed this Feb 10, 2024
@0xced 0xced deleted the PostgresIsReady branch February 10, 2024 22:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Bug]: The PostgreSqlContainer wait strategy is not reliable (non default configuration)
2 participants