healthcheck: make sure chain backend has enough outbound peers #8576

Conversation

mohamedawnallah
Contributor

Change Description

Closes #8487

Steps to Test

Steps for reviewers to follow to test the change.

Pull Request Checklist

Testing

  • Your PR passes all CI checks.
  • Tests covering the positive and negative (error paths) are included.
  • Bug fixes contain tests triggering the bug to prevent regressions.

Code Style and Documentation

📝 Please see our Contribution Guidelines for further guidance.

Contributor

coderabbitai bot commented Mar 22, 2024

Walkthrough

The update introduces a mechanism that monitors the number of outbound peers of lnd's chain backend, to ensure it keeps a healthy connection to the network. A constant for the minimum number of outbound peers, DefaultMinOutboundPeers, is established, and a new function, checkOutboundPeers, logs a warning if the count falls below that minimum. The check applies to the bitcoind and btcd backends.

Changes

| File Path | Change Summary |
| --- | --- |
| chainreg/chainregistry.go | Added DefaultMinOutboundPeers; added checkOutboundPeers; modified NewPartialChainControl |
| docs/release-notes/release-notes-0.18.0.md | Documented the addition of checkOutboundPeers in chainHealthCheck |

Assessment against linked issues

| Objective | Addressed | Explanation |
| --- | --- | --- |
| Ensure bitcoind maintains a healthy connection to the network (#8487) | | |
| Log warnings if outbound peers are below 6 (#8487) | | |
| Consider stopping lnd if outbound peers are below 2 (#8487) | | The changes include logging warnings but do not explicitly mention stopping lnd. |

Possibly related issues

  • lightningnetwork/lnd#6397: While the changes focus on network health via outbound peers, they indirectly support overall network stability, which is crucial for watchtower functionality. However, the specific watchtower sequence number and removal issues are not directly addressed.

Recent Review Status

Configuration used: CodeRabbit UI

Commits: files that changed from the base of the PR and between 6377f98 and 130fdbd.
Files selected for processing (2)
  • chainreg/chainregistry.go (4 hunks)
  • docs/release-notes/release-notes-0.18.0.md (1 hunks)
Additional comments not posted (4)
docs/release-notes/release-notes-0.18.0.md (1)

268-271: Consider clarifying the action taken when the number of outbound peers is below the critical threshold.

The description of the checkOutboundPeers function mentions logging warnings if the number of outbound peers falls below 6. It would be beneficial to explicitly state the actions taken when the count drops below the critical threshold of 2, as mentioned in the PR objectives. This clarification will help users understand the potential impact on lnd operation under such conditions.

chainreg/chainregistry.go (3)

126-129: Introducing DefaultMinOutboundPeers to enforce a minimum number of outbound peers is a proactive approach to maintaining network health.


512-526: The health check logic for bitcoind includes a condition to skip the outbound peer check on local test networks (SimNet or RegTest). This is a thoughtful addition to avoid unnecessary warnings during development or testing.


635-649: Similarly, the health check logic for btcd also wisely skips the outbound peer check on local test networks. Consistency in handling test environments across different backends is crucial for a seamless developer experience.

@mohamedawnallah force-pushed the check-outbound-peers-chain-backend branch from 4c0600e to b3cbafd on March 22, 2024 18:43
Collaborator

@guggero left a comment

Thanks for the PR. I think this can be done quite similarly to how the PartialChainControl.HealthCheck function is implemented, requiring no changes to the healthcheck package.

healthcheck/go.mod (outdated, resolved)
server.go (outdated, resolved)
healthcheck/peers.go (outdated, resolved)
@mohamedawnallah force-pushed the check-outbound-peers-chain-backend branch 3 times, most recently from c13991f to e118b05 on March 31, 2024 06:07
@mohamedawnallah requested a review from guggero on March 31, 2024 11:47
chainreg/chainregistry.go (outdated, resolved)
@mohamedawnallah force-pushed the check-outbound-peers-chain-backend branch 2 times, most recently from 6d57a6a to 2a5f09e on April 2, 2024 13:40
@mohamedawnallah requested a review from guggero on April 2, 2024 16:29
Collaborator

@guggero left a comment

Looks very good now, thanks for the update!
Have a suggestion for the logged message, other than that looks good to me.

chainreg/chainregistry.go (outdated, resolved)
@mohamedawnallah force-pushed the check-outbound-peers-chain-backend branch from 2a5f09e to 2628df2 on April 2, 2024 17:08
@mohamedawnallah requested a review from guggero on April 2, 2024 20:55
Collaborator

@guggero left a comment

One last request from manual tests, other than that looks good, thanks.

chainreg/chainregistry.go (resolved)
Collaborator

@ellemouton left a comment

Looks good! Agreed with @guggero re reducing logs for local networks 👍

Left some minor nits too

Comment on lines 514 to 519
err = checkOutboundPeers(chainConn)
if err != nil {
	return err
}

return nil

Collaborator

can just return checkOutboundPeers(chainConn)
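
To illustrate why the two forms behave the same, here is a minimal sketch. It does not use lnd's code: `check`, `errRPC`, `verbose` and `concise` are hypothetical stand-ins, with `check` playing the role of checkOutboundPeers.

```go
package main

import (
	"errors"
	"fmt"
)

// errRPC is a hypothetical error used only for this illustration.
var errRPC = errors.New("rpc unavailable")

// check stands in for checkOutboundPeers; fail selects its outcome.
func check(fail bool) error {
	if fail {
		return errRPC
	}
	return nil
}

// verbose mirrors the shape of the code under review: it inspects the
// error only to hand it back unchanged.
func verbose(fail bool) error {
	err := check(fail)
	if err != nil {
		return err
	}

	return nil
}

// concise returns the callee's result directly, as suggested in the review.
func concise(fail bool) error {
	return check(fail)
}

func main() {
	// Compare both forms on the success path and on the failure path.
	for _, fail := range []bool{false, true} {
		fmt.Println(verbose(fail) == concise(fail))
	}
}
```

Both forms hand the caller exactly what the callee returned, so the shorter one loses nothing as long as the function's return type is the plain `error` interface, as it is here.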

Comment on lines 635 to 640
err = checkOutboundPeers(chainRPC.Client)
if err != nil {
	return err
}

return nil

Collaborator

return checkOutboundPeers(chainRPC.Client)

}
}

if outboundPeers < 6 {
Collaborator

perhaps make the magic number a constant?

@@ -263,6 +263,11 @@ bitcoin peers' feefilter values into account](https://github.com/lightningnetwor
types](https://github.com/lightningnetwork/lnd/pull/8554) defined in
`btcd/rpcclient`.

`[checkOutboundPeers](https://github.com/lightningnetwork/lnd/pull/8576) is
Collaborator

missing a *

@@ -263,6 +263,11 @@ bitcoin peers' feefilter values into account](https://github.com/lightningnetwor
types](https://github.com/lightningnetwork/lnd/pull/8554) defined in
`btcd/rpcclient`.

`[checkOutboundPeers](https://github.com/lightningnetwork/lnd/pull/8576) is
added to `chainHealthCheck` to make sure chain backend `bitcoind` and `btcd`
maintains a healthy connection to the network by checking the number of
Collaborator

s/maintains/maintain since you are talking about multiple backends

@mohamedawnallah force-pushed the check-outbound-peers-chain-backend branch 2 times, most recently from e9016b5 to 37fc790 on April 5, 2024 06:31
@guggero self-requested a review on April 5, 2024 07:00
Collaborator

@guggero left a comment

Nice work, LGTM 🎉

Comment on lines 868 to 873
// On local test networks we usually don't have multiple
// chain backend peers, so we can skip that test.
if cfg.Bitcoin.SimNet || cfg.Bitcoin.RegTest {
	return nil
}

Collaborator

one last nit:

I think we should put this check in the HealthCheck callback func instead of adding it here. That way, when I read the call to checkOutboundPeers, I can just assume that is what it will do and don't need to go into the method itself to see that it only actually runs if the network is not simnet or regtest.
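
A rough sketch of that arrangement, under stated assumptions: `netConfig`, `newHealthCheck`, `ping` and `checkPeers` are hypothetical names for illustration only. `ping` stands for whatever backend check the callback already performs, and `checkPeers` stands for checkOutboundPeers, which no longer needs to know about the network.

```go
package main

import "fmt"

// netConfig is a hypothetical stand-in for the relevant part of the
// Bitcoin chain configuration (the SimNet and RegTest flags).
type netConfig struct {
	SimNet  bool
	RegTest bool
}

// newHealthCheck builds the health check callback. The network test lives
// here, so the peer check itself stays unconditional.
func newHealthCheck(cfg netConfig, ping, checkPeers func() error) func() error {
	return func() error {
		if err := ping(); err != nil {
			return err
		}

		// On local test networks we usually don't have multiple
		// chain backend peers, so we can skip that test.
		if cfg.SimNet || cfg.RegTest {
			return nil
		}

		return checkPeers()
	}
}

func main() {
	calls := 0
	ok := func() error { return nil }
	countPeers := func() error { calls++; return nil }

	// The regtest callback skips the peer check; the default one runs it.
	_ = newHealthCheck(netConfig{RegTest: true}, ok, countPeers)()
	_ = newHealthCheck(netConfig{}, ok, countPeers)()

	fmt.Println("peer checks run:", calls)
}
```

With the condition at the call site, a reader of the callback sees at a glance when the peer check applies.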

Contributor Author

I thought the same when I put this check inside the checkOutboundPeers function. Thanks, good point!

In this commit we add `checkOutboundPeers` function
to the `cc.HealthCheck` function.
@mohamedawnallah force-pushed the check-outbound-peers-chain-backend branch from 37fc790 to 130fdbd on April 9, 2024 17:20
Contributor

@coderabbitai bot left a comment

Actionable comments posted: 1

Comment on lines +877 to +903
// checkOutboundPeers checks the number of outbound peers connected to the
// provided RPC client. If the number of outbound peers is below 6, a warning
// is logged. This function is intended to ensure that the chain backend
// maintains a healthy connection to the network.
func checkOutboundPeers(client *rpcclient.Client) error {
	peers, err := client.GetPeerInfo()
	if err != nil {
		return err
	}

	var outboundPeers int
	for _, peer := range peers {
		if !peer.Inbound {
			outboundPeers++
		}
	}

	if outboundPeers < DefaultMinOutboundPeers {
		log.Warnf("The chain backend has an insufficient number "+
			"of connected outbound peers (%d connected, expected "+
			"minimum is %d) which can be a security issue. "+
			"Connect to more trusted nodes manually if necessary.",
			outboundPeers, DefaultMinOutboundPeers)
	}

	return nil
}
Contributor

coderabbitai bot commented Apr 9, 2024

The checkOutboundPeers function effectively logs a warning if the number of outbound peers is below the defined threshold. This function is well-implemented, but consider enhancing it by returning a specific error when the number of outbound peers is critically low, allowing for more decisive action than just logging.

if outboundPeers < criticalThreshold {
-   log.Warnf("...")
+   return fmt.Errorf("critical low number of outbound peers: %d", outboundPeers)
}

Committable suggestion

Suggested change
// checkOutboundPeers checks the number of outbound peers connected to the
// provided RPC client. If the number of outbound peers is below 6, a warning
// is logged. This function is intended to ensure that the chain backend
// maintains a healthy connection to the network.
func checkOutboundPeers(client *rpcclient.Client) error {
	peers, err := client.GetPeerInfo()
	if err != nil {
		return err
	}
	var outboundPeers int
	for _, peer := range peers {
		if !peer.Inbound {
			outboundPeers++
		}
	}
	if outboundPeers < DefaultMinOutboundPeers {
		log.Warnf("The chain backend has an insufficient number "+
			"of connected outbound peers (%d connected, expected "+
			"minimum is %d) which can be a security issue. "+
			"Connect to more trusted nodes manually if necessary.",
			outboundPeers, DefaultMinOutboundPeers)
	}
	if outboundPeers < criticalThreshold {
		return fmt.Errorf("critical low number of outbound peers: %d", outboundPeers)
	}
	return nil
}

Contributor Author

> consider enhancing it by returning a specific error when the number of outbound peers is critically low, allowing for more decisive action than just logging.

I don't think this is what we want since bitcoind should recover from this automatically.

Contributor

@mohamedawnallah, thank you for your input. It's a valid point that bitcoind is designed to manage its peer connections autonomously and should be capable of recovering from a low number of outbound peers on its own. Given this, it makes sense to rely on bitcoind's built-in mechanisms for maintaining a healthy connection to the network without introducing additional complexity into lnd. Your decision to keep the current implementation as is, focusing on logging for awareness while trusting in bitcoind's recovery capabilities, is well-understood.
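
The warn-only behaviour settled on in this thread can be sketched in a self-contained form. This is an illustration under assumptions, not the merged code: `peerInfo`, `countOutbound`, `checkPeers`, `getPeers` and `warnf` are hypothetical stand-ins for the RPC client and logger, and `minOutboundPeers` mirrors DefaultMinOutboundPeers with the value 6 taken from the doc comment quoted above.

```go
package main

import "fmt"

// minOutboundPeers mirrors DefaultMinOutboundPeers; the value 6 comes from
// the doc comment quoted in this thread.
const minOutboundPeers = 6

// peerInfo is a hypothetical stand-in for the peer entries returned by the
// backend's GetPeerInfo call; only the Inbound flag matters here.
type peerInfo struct {
	Inbound bool
}

// countOutbound returns how many of the given peers are outbound.
func countOutbound(peers []peerInfo) int {
	var n int
	for _, p := range peers {
		if !p.Inbound {
			n++
		}
	}
	return n
}

// checkPeers warns when too few outbound peers are connected. It returns an
// error only if fetching the peers fails, so a low count alone never fails
// the health check.
func checkPeers(getPeers func() ([]peerInfo, error),
	warnf func(string, ...interface{})) error {

	peers, err := getPeers()
	if err != nil {
		return err
	}

	if n := countOutbound(peers); n < minOutboundPeers {
		warnf("insufficient outbound peers (%d connected, expected "+
			"minimum is %d)", n, minOutboundPeers)
	}

	return nil
}

func main() {
	// One inbound and one outbound peer: below the minimum of 6.
	getPeers := func() ([]peerInfo, error) {
		return []peerInfo{{Inbound: true}, {Inbound: false}}, nil
	}
	warnf := func(format string, args ...interface{}) {
		fmt.Println("WARN: " + fmt.Sprintf(format, args...))
	}

	fmt.Println("error:", checkPeers(getPeers, warnf))
}
```

Keeping the low-peer case as a warning means a temporary dip does not trip the health check, which matches the point that the backend is expected to recover on its own.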

Collaborator

@ellemouton left a comment

LGTM! 🚀

@mohamedawnallah
Contributor Author

@guggero Is it ready to be merged?

@guggero guggero merged commit 971281d into lightningnetwork:master Apr 10, 2024
26 of 27 checks passed