Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Gateways stall after 5 minutes when using reframe/delegated routing #9445

Closed
Tracked by #9417
ajnavarro opened this issue Dec 2, 2022 · 3 comments · Fixed by #9447
Closed
Tracked by #9417

Gateways stall after 5 minutes when using reframe/delegated routing #9445

ajnavarro opened this issue Dec 2, 2022 · 3 comments · Fixed by #9447
Assignees
Labels
kind/bug A bug in existing code (including security flaws)

Comments

@ajnavarro
Copy link
Member

ajnavarro commented Dec 2, 2022

Version

v0.17.0

Config

"Routing": {
    "Methods": {
      "find-peers": {
        "RouterName": "DHTs"
      },
      "find-providers": {
        "RouterName": "ParallelHelper"
      },
      "get-ipns": {
        "RouterName": "DHTs"
      },
      "provide": {
        "RouterName": "DHTs"
      },
      "put-ipns": {
        "RouterName": "DHTs"
      }
    },
    "Routers": {
      "CidContact": {
        "Parameters": {
          "Endpoint": "https://cid.contact/reframe"
        },
        "Type": "reframe"
      },
      "ParallelHelper": {
        "Parameters": {
          "Routers": [
            {
              "IgnoreErrors": true,
              "RouterName": "CidContact",
              "Timeout": "3s"
            },
            {
              "IgnoreErrors": false,
              "RouterName": "DHTs",
              "Timeout": "5m"
            }
          ]
        },
        "Type": "parallel"
      },
      "DHTs": {
        "Parameters": {
          "Routers": [
            {
              "IgnoreErrors": false,
              "RouterName": "WanDHT",
              "Timeout": "5m"
            },
            {
              "IgnoreErrors": false,
              "RouterName": "LanDHT",
              "Timeout": "1m"
            }
          ]
        },
        "Type": "parallel"
      },
      "WanDHT": {
        "Parameters": {
          "AcceleratedDHTClient": true,
          "Mode": "auto",
          "PublicIPNetwork": true
        },
        "Type": "dht"
      },
      "LanDHT": {
        "Parameters": {
          "AcceleratedDHTClient": false,
          "Mode": "auto",
          "PublicIPNetwork": false
        },
        "Type": "dht"
      }
    },

    "Type": "custom"
  },

Description

After ~5 minutes of having the following configuration, delegated routing stops sending find providers requests.

Might be a deadlock problem. A possible solution here: libp2p/go-libp2p-routing-helpers#64

@ajnavarro ajnavarro added kind/bug A bug in existing code (including security flaws) need/triage Needs initial labeling and prioritization labels Dec 2, 2022
@ajnavarro ajnavarro self-assigned this Dec 2, 2022
@ajnavarro ajnavarro removed the need/triage Needs initial labeling and prioritization label Dec 2, 2022
@guseggert guseggert moved this to 🎉 Done in IPFS Shipyard Team Dec 2, 2022
@BigLep BigLep reopened this Dec 2, 2022
@BigLep BigLep closed this as completed Dec 2, 2022
@BigLep
Copy link
Contributor

BigLep commented Dec 2, 2022

@ajnavarro @guseggert : I'm reopening because:

  1. I don't see a regression test anywhere (looking at Return a closed channel instead of nil for getChannelOrErrorParallel libp2p/go-libp2p-routing-helpers#64 and chore: Update go-routing-helpers dependency #9447 ). How do we know this fixes the issue?
  2. Should we get chore: Add logs to composable parallel libp2p/go-libp2p-routing-helpers#63 merged as weel?

@BigLep BigLep reopened this Dec 2, 2022
@guseggert
Copy link
Contributor

I added a test here: libp2p/go-libp2p-routing-helpers#66

We should get the logging change merged too but I don't think it's necessary to unblock the gateway deployment

@BigLep BigLep mentioned this issue Dec 3, 2022
@guseggert
Copy link
Contributor

guseggert commented Dec 6, 2022

Fix has been merged & deployed, I verified that the deadlock is no longer occurring, and there's a regression test.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug A bug in existing code (including security flaws)
Projects
No open projects
Archived in project
Development

Successfully merging a pull request may close this issue.

3 participants