Possible memory leak? #1456

Closed
mohsalsaleem opened this issue Dec 29, 2022 · 6 comments · Fixed by #1614
Comments

@mohsalsaleem

pgx version: v5.2.0
PostgreSQL version: 13.9

I have an application that repeatedly queries a table in PostgreSQL. I noticed the amount of RAM consumed by the application increasing linearly over time.

Here is pprof's top output:

(pprof) top10
Showing nodes accounting for 39287.54kB, 100% of 39287.54kB total
Showing top 10 nodes out of 46
      flat  flat%   sum%        cum   cum%
32352.25kB 82.35% 82.35% 32352.25kB 82.35%  github.com/jackc/pgx/v5/pgconn.(*PgConn).convertRowDescription
 1537.50kB  3.91% 86.26%  1537.50kB  3.91%  runtime.allocm
 1280.36kB  3.26% 89.52% 33632.60kB 85.61%  github.com/jackc/pgx/v5.(*Conn).Prepare
 1024.41kB  2.61% 92.13%  1024.41kB  2.61%  runtime.malg
  532.26kB  1.35% 93.48%   532.26kB  1.35%  github.com/jackc/pgx/v5/pgtype.(*Map).RegisterDefaultPgType
  512.62kB  1.30% 94.79%   512.62kB  1.30%  regexp/syntax.(*compiler).inst (inline)
  512.07kB  1.30% 96.09%  1024.70kB  2.61%  github.com/aws/aws-sdk-go/aws/endpoints.init
  512.05kB  1.30% 97.39%   512.05kB  1.30%  context.(*cancelCtx).Done
  512.02kB  1.30% 98.70%   512.02kB  1.30%  runtime.gcBgMarkWorker
  512.01kB  1.30%   100%   512.01kB  1.30%  github.com/jackc/pgx/v5/internal/stmtcache.NextStatementName

Here is the heap profile graph: [image: profile003]

Each iteration queries around 20-25 records, and each record has around 100 columns.

Here is pprof's top output, taken 30 minutes after the previous one:

Showing nodes accounting for 56.92MB, 98.27% of 57.92MB total
Showing top 10 nodes out of 54
      flat  flat%   sum%        cum   cum%
   50.65MB 87.44% 87.44%    50.65MB 87.44%  github.com/jackc/pgx/v5/pgconn.(*PgConn).convertRowDescription
    1.50MB  2.59% 90.03%     1.50MB  2.59%  runtime.allocm
       1MB  1.73% 91.76%        1MB  1.73%  runtime.malg
    0.75MB  1.30% 93.06%    51.40MB 88.74%  github.com/jackc/pgx/v5.(*Conn).Prepare
    0.52MB   0.9% 93.95%     0.52MB   0.9%  github.com/jackc/pgx/v5/pgtype.(*Map).RegisterDefaultPgType
    0.50MB  0.87% 94.82%     0.50MB  0.87%  database/sql.driverArgsConnLocked
    0.50MB  0.86% 95.68%     0.50MB  0.86%  regexp/syntax.(*compiler).inst (inline)
    0.50MB  0.86% 96.55%     0.50MB  0.86%  fmt.Sprintf
    0.50MB  0.86% 97.41%        1MB  1.73%  github.com/aws/aws-sdk-go/aws/endpoints.init
    0.50MB  0.86% 98.27%     0.50MB  0.86%  context.(*cancelCtx).Done

And the heap profile graph: [image: profile004]

@mohsalsaleem (Author) commented Dec 29, 2022

FYI, below is the output of pprof's top command when I use pgx v4:

Showing nodes accounting for 11647.39kB, 100% of 11647.39kB total
Showing top 10 nodes out of 54
      flat  flat%   sum%        cum   cum%
 3096.06kB 26.58% 26.58%  3096.06kB 26.58%  github.com/jackc/chunkreader/v2.(*ChunkReader).newBuf
 2562.50kB 22.00% 48.58%  2562.50kB 22.00%  runtime.allocm
 2366.89kB 20.32% 68.90%  2366.89kB 20.32%  github.com/denisenkom/go-mssqldb/internal/cp.init
 1028.76kB  8.83% 77.74%  1028.76kB  8.83%  github.com/jackc/pgconn.(*PgConn).Prepare
 1024.41kB  8.80% 86.53%  1024.41kB  8.80%  runtime.malg
  544.67kB  4.68% 91.21%   544.67kB  4.68%  github.com/denisenkom/go-mssqldb.newTdsBuffer (inline)
  512.06kB  4.40% 95.60%   512.06kB  4.40%  regexp/syntax.(*parser).maybeConcat
  512.05kB  4.40%   100%   512.05kB  4.40%  runtime.acquireSudog
         0     0%   100%   544.67kB  4.68%  database/sql.(*DB).PingContext
         0     0%   100%   544.67kB  4.68%  database/sql.(*DB).conn

And the heap profile graph: [image: profile007]

With pgx v4, the memory consumption seems almost constant.

@kamikazechaser

Could this (#845 (comment)) be somehow related?

@mohsalsaleem (Author)

Maybe, but this is v5, which has some major rewrites.

@rsaphala

Any updates on this?

@jackc (Owner) commented Jan 24, 2023

Not really enough to go on. Row descriptions are handled differently in v5: column names are converted to strings instead of potentially pinning the entire read buffer in memory. I could see that having different memory usage patterns, but nothing that should be leaking. For that matter, those pprof results indicate allocations, but don't necessarily indicate a leak. It might just be the difference from allocating 100 strings per query for the column names.
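
For context, a minimal sketch of that difference (illustrative only, not pgx's actual code): copying each column name out of the shared read buffer into its own string means the name no longer pins the large buffer, at the cost of one small allocation per column.

// Illustrative sketch, not pgx's implementation: string(...) copies the bytes,
// so the returned column name no longer references, and therefore no longer
// pins, the large shared read buffer.
func columnName(readBuf []byte, start, end int) string {
  return string(readBuf[start:end])
}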

@evanj (Contributor) commented May 16, 2023

I suspect I've run into a version of this bug: cached statements are never removed from the Conn.preparedStatements map. If your workload executes unique queries, memory use grows without bound.

I believe the problem is that the deallocateInvalidatedCachedStatements function needs to call delete(c.preparedStatements, sd.Name) but does not. I have a one-line fix for this, but I'd like to see if I can figure out a test to make sure it stays fixed. See: https://github.com/jackc/pgx/blob/master/conn.go#L1327-L1329
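
A hedged sketch of what that one-line fix looks like; the loop and variable names below only approximate the shape of deallocateInvalidatedCachedStatements and are not quoted verbatim from pgx:

// Sketch only: surrounding names approximate pgx's internals.
for _, sd := range invalidatedStatements {
  // ... existing code that issues the deallocate for sd.Name ...
  delete(c.preparedStatements, sd.Name) // the missing line: drop the cache entry too
}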

Example program that reproduces the bug by generating unique queries:

package main

import (
  "context"
  "fmt"
  "os"
  "runtime"
  "runtime/pprof"
  "strings"

  "github.com/evanj/hacks/postgrestest"
  "github.com/jackc/pgx/v5"
)

func main() {
  instance, err := postgrestest.NewInstance()
  if err != nil {
    panic(err)
  }
  defer instance.Close()

  ctx := context.Background()
  conn, err := pgx.Connect(ctx, instance.URL())
  if err != nil {
    panic(err)
  }
  defer conn.Close(ctx)

  const commentLength = 256 * 1024
  bigComment := "\n/*"
  for len(bigComment) < commentLength {
    bigComment += fmt.Sprintf("\nlen %d abcdefghijklmnopqrstuvwxyz", len(bigComment))
  }
  bigComment += "*/"

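  // Each iteration's query text is unique (the counter changes the SQL), so pgx
  // prepares and caches a new statement every time; the large trailing comment
  // makes each cached entry big enough that the growth is easy to see.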
  for i := int64(0); i < 100000; i++ {
    query := fmt.Sprintf("select 'unique str %d'", i) + bigComment

    var outputStr string
    err = conn.QueryRow(ctx, query).Scan(&outputStr)
    if err != nil {
      panic(err)
    }
    if !strings.HasPrefix(outputStr, "unique str ") {
      panic(outputStr)
    }

    if i%5000 == 0 {
      var stats runtime.MemStats
      runtime.ReadMemStats(&stats)
      fmt.Printf("iteration=%d num_gc=%d heap_alloc=%d heap_objects=%d heap_sys=%d\n",
        i, stats.NumGC, stats.HeapAlloc, stats.HeapObjects, stats.HeapSys)
    }
    if i == 5000 {
      f, err := os.Create("heap.pprof")
      if err != nil {
        panic(err)
      }
      err = pprof.WriteHeapProfile(f)
      if err != nil {
        panic(err)
      }
      err = f.Close()
      if err != nil {
        panic(err)
      }
      fmt.Println("   wrote heap profile")
    }
  }
}

jackc pushed a commit that referenced this issue May 20, 2023
Previously, items were never removed from the preparedStatements map.
This means workloads that send a large number of unique queries could
run out of memory. Delete items from the map when sending the
deallocate command to Postgres. Add a test to verify this works.

Fixes #1456