Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Memory Leak when setting options.timeout per subgraph in transportEntries #303

Closed
jaffemd opened this issue Dec 9, 2024 · 12 comments · Fixed by #598
Closed

Memory Leak when setting options.timeout per subgraph in transportEntries #303

jaffemd opened this issue Dec 9, 2024 · 12 comments · Fixed by #598
Assignees
Labels
bug Something isn't working performance

Comments

@jaffemd
Copy link

jaffemd commented Dec 9, 2024

Versions:

node: 22.10.0
"@graphql-hive/gateway": "1.6.6"
"@graphql-hive/gateway-runtime": "1.3.13"

Hi! We currently have a hive gateway setup to serve a supergraph.

We recently tried adding timeout settings per subgraph. Below is our config:

const allowedHeaders = [
  //...
];

const subgraph1Location = process.env.SUBGRAPH_1_LOCATION;
const subgraph1Timeout = process.env.SUBGRAPH_1_TIMEOUT;

const subgraph2Location = process.env.SUBGRAPH_2_LOCATION;
const subgraph2Timeout = process.env.SUBGRAPH_2_TIMEOUT;

export const gatewayConfig = defineConfig({
  supergraph: "supergraph.graphql",
  port: parseInt(process.env.PORT || "8000"),
  cors: {
    origin: "*",
    methods: ["GET", "POST", "PUT", "DELETE", "OPTIONS"],
    allowedHeaders,
  },
  landingPage: false,
  executionCancellation: true,
  upstreamCancellation: true,
  disableIntrospection: true,
  batching: false,
  pollingInterval: 31_556_952_000,
  maskedErrors: false,
  graphiql: false,

  transportEntries: {
    'subgraph1': {
      location: subgraph1Location,
      options: {
        timeout: subgraph1Timeout,
      },
    },
    'subgraph2': {
      location: subgraph2Location,
      options: {
        timeout: subgraph2Timeout,
      },
    },
  }
});

After a couple of days, we observed a memory leak and traced it to this code change on our side.

Screenshot 2024-12-09 at 11 04 50 AM

After trying several things, the single change that fixed our memory leak was removing the options.timeout setting in transportEntries. We moved the per subgraph timeout to a single setting on the newly added requestTimeout option, and the memory leak was immediately fixed.

  requestTimeout: process.env.REQUEST_TIMEOUT,
  
  transportEntries: {
    'subgraph1': {
      location: subgraph1Location,
    },
    'subgraph2': {
      location: subgraph2Location,
    },
@jaffemd jaffemd changed the title Memory Leak caused by setting options.timeout in transportEntries Memory Leak when setting options.timeout in transportEntries Dec 9, 2024
@jaffemd jaffemd changed the title Memory Leak when setting options.timeout in transportEntries Memory Leak when setting options.timeout per subgraph in transportEntries Dec 9, 2024
@jaffemd
Copy link
Author

jaffemd commented Jan 2, 2025

Hi! I saw the new upstreamTimeout feature you just implemented and it's exactly what we're looking for! Unfortunately, we're seeing the same memory leak issue with this as we were seeing with our previous implementation. Below is the our memory usage graph before and after turning on this feature.

Screenshot 2025-01-02 at 2 27 23 PM

Here's our implementation:

const UPSTREAM_TIMEOUTS: Record<string, number> = {
  //[subgraph1]: 2000
  //...
}

 upstreamTimeout: ({ subgraphName }: TimeoutFactoryPayload): number | undefined {
    if (!subgraphName) { return requestTimeout; }

    return UPSTREAM_TIMEOUTS[subgraphName] || 10000;
  },

@ardatan
Copy link
Member

ardatan commented Jan 3, 2025

Thanks for creating the issue and the feedback!
Could you test with the alphas here?
#407 (comment)

@jaffemd
Copy link
Author

jaffemd commented Jan 6, 2025

@ardatan Thank you for the proposed fix! However, we just tested the alpha release for a couple hours and noticed that it didn't help, and actually might have increased the rate of leaking memory.

Screenshot 2025-01-06 at 12 34 34 PM

@enisdenjo
Copy link
Member

Can you try the latest LTS Node (v22.12.0)? Just to make sure it's not a Node quirk (which we've had happen before 😅).

@jaffemd
Copy link
Author

jaffemd commented Jan 7, 2025

@enisdenjo Thanks for the suggestion, but we're already using node v22.12.0.

@enisdenjo
Copy link
Member

Ah ok cool, because I see v22.10.0 in the issue description.

@jaffemd
Copy link
Author

jaffemd commented Jan 7, 2025

Ah, we must've upgraded to 22.12.0 since I initially logged the issue. But I did verify that we're currently on 22.12.0

@dotansimha
Copy link
Member

Hi @jaffemd, is there any chance we can get your help with reproducing this issue somehow? We've tried to reproduce it, but it does not seem to leak.

@dotansimha
Copy link
Member

Also, @jaffemd , is it possible for you to share a memory snapshot?

@dotansimha
Copy link
Member

Hi @jaffemd , did you had the chance to review my last comments? A reproduction or additional details would really help us to provide support on this issue.

@jaffemd
Copy link
Author

jaffemd commented Jan 30, 2025

Hi @dotansimha, thank you for the responses! Unfortunately, it's not easy for us to get a memory snapshot or share any more reproduction details than the config we have. We do have several custom plugins that we've implemented, but we've tried turning those off and we still observed the memory leak as long as any of the various settings that have set an upstreamTimeout on subgraph requests. Here are our config files:

/// gateway.config.ts
// "@graphql-hive/gateway": "1.8.0"

import * as path from 'path';
import { defineConfig, GatewayContext } from "@graphql-hive/gateway";

export const allowedHeaders = [
  // ...
];

const methods = ["GET", "POST", "PUT", "DELETE", "OPTIONS"];
const oneYearInMs = 31_556_952_000;

export const gatewayConfig = defineConfig({
  supergraph: path.join("supergraph.graphql"),
  port: parseInt(process.env.PORT),
  cors: {
    origin: "*",
    methods,
    allowedHeaders,
  },
  plugins: () => [
    //...plugins
  ],
  landingPage: false,
  healthCheckEndpoint: "healthz",
  readinessCheckEndpoint: "healthz",

  executionCancellation: false,

  upstreamCancellation: false,

  disableIntrospection: process.env.DISABLE_INTROSPECTION,

  batching: false,

  pollingInterval: oneYearInMs,

  maskedErrors: false,

  graphiql: true,

  requestTimeout: process.env.REQUEST_TIMEOUT,

  upstreamTimeout: process.env.UPSTREAM_TIMEOUT,

  transportEntries: {
    'subgraph1': {
      location: process.env.SUBGRAPH_1_HOST,
    },
    'subgraph2': {
      location: process.env.SUBGRAPH_2_HOST,
    },
  }
});

/// mesh.config.ts
// "@graphql-mesh/compose-cli": "1.2.11"


import {
  loadGraphQLHTTPSubgraph,
  defineConfig,
} from "@graphql-mesh/compose-cli";

export const operationHeaders = {
  "Accept": "{context.headers['accept']}",
  "Accept-Language": "{context.headers['accept-language']}",
  "Accept-Version": "{context.headers['accept-version']}",
  "Authorization": "{context.headers['authorization']}",
  "X-Operation-Name": "{context.params.operationName}",
  // ...others
};

export const composeConfig = defineConfig({
  output: schemaRoot("supergraph.graphql"),
  subgraphs: [
    {
      sourceHandler: loadGraphQLHTTPSubgraph("subgraph1", {
        endpoint: SUBGRAPH_1_HOST,
        source: schemaRoot("schemas/subgraph1/schema.graphql"),
        operationHeaders,
      }),
    },
    {
      sourceHandler: loadGraphQLHTTPSubgraph("subgraph2", {
        endpoint: SUBGRAPH_2_HOST,
        source: schemaRoot("schemas/subgraph2/schema.graphql"),
        operationHeaders,
      }),
    },
  ],
});

function schemaRoot(schemaPath: string): string {
  return path.join(schemaPath);
}

@dotansimha dotansimha assigned enisdenjo and unassigned ardatan Feb 4, 2025
@dotansimha dotansimha added the bug Something isn't working label Feb 4, 2025 — with Linear
@enisdenjo
Copy link
Member

Potential fix in #593, could you be so kind @jaffemd and test the alphas from the PR to confirm the fix before merging? Thanks in advance!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment