Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: replace DHT with private peer discovery #3041

Merged
merged 25 commits into from
Nov 6, 2024
Merged

Conversation

brewmaster012
Copy link
Collaborator

@brewmaster012 brewmaster012 commented Oct 25, 2024

New 8123 telemetry methods:

router.Handle("/connectedpeers", http.HandlerFunc(t.connectedPeersHandler)).Methods(http.MethodGet)
router.Handle("/pingrtt", http.HandlerFunc(t.pingRTTHandler)).Methods(http.MethodGet)

/connectedpeers returns a list of connected peers in p2p network (with connection gater, should really contain the tss signers, but we'll see.)
/pingrtt returns the most recent of map tss signers => ping data--RTT in nanoseconds using libp2p Ping protocol--if no response RTT=-1

This PR address the performance e2e test failure introduced by removing the DHT peer discovery service.

The actual change is in go-tss PR

Description

How Has This Been Tested?

  • Tested CCTX in localnet
  • Tested in development environment
  • Go unit tests
  • Go integration tests
  • Tested via GitHub Actions

Summary by CodeRabbit

  • New Features

    • Introduced whitelisted peers functionality for enhanced TSS server setup.
    • Added periodic logging of connected peers for improved observability.
  • Bug Fixes

    • Enhanced error logging for key signing failures, providing clearer insights into issues.
  • Chores

    • Updated and restructured dependencies to ensure the use of the latest library versions.

Copy link
Contributor

coderabbitai bot commented Oct 25, 2024

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

📝 Walkthrough
📝 Walkthrough
📝 Walkthrough
📝 Walkthrough
📝 Walkthrough
📝 Walkthrough
📝 Walkthrough
📝 Walkthrough
📝 Walkthrough
📝 Walkthrough
📝 Walkthrough
📝 Walkthrough
📝 Walkthrough
📝 Walkthrough
📝 Walkthrough
📝 Walkthrough
📝 Walkthrough
📝 Walkthrough
📝 Walkthrough
📝 Walkthrough
📝 Walkthrough
📝 Walkthrough
📝 Walkthrough
📝 Walkthrough
📝 Walkthrough
📝 Walkthrough
📝 Walkthrough
📝 Walkthrough

Walkthrough

The pull request introduces modifications primarily to the start.go, go.mod, and tss_signer.go files. Key changes include the addition of a whitelistedPeers parameter to the SetupTSSServer function, enhancing peer management during TSS server setup. The go.mod file reflects updates to dependencies, including the addition of the go-libp2p-core package and updates to existing ones. Additionally, error logging has been improved in the signing methods, providing better insights into failures. These changes collectively enhance functionality and maintainability within the TSS server setup process.

Changes

File Path Change Summary
cmd/zetaclientd/start.go Added imports for spew, peer, and conversion. Updated start function to handle whitelisted peers and log connected peers. Modified SetupTSSServer call to include whitelistedPeers.
go.mod Updated dependencies: added github.com/libp2p/go-libp2p-core, changed github.com/mr-tron/base58 to direct, and updated gitlab.com/thorchain/tss/go-tss version.
zetaclient/tss/tss_signer.go Updated SetupTSSServer function signature to include whitelistedPeers. Improved error logging in Sign and SignBatch methods for key signing failures.

Possibly related PRs

Suggested labels

no-changelog, SOLANA_TESTS

Suggested reviewers

  • fbac
  • kingpinXD
  • skosito
  • lumtis
  • swift1337
  • ws4charlie

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

codecov bot commented Oct 25, 2024

Codecov Report

Attention: Patch coverage is 0% with 47 lines in your changes missing coverage. Please review.

Project coverage is 63.30%. Comparing base (9cf3635) to head (e5b77df).
Report is 1 commits behind head on develop.

Files with missing lines Patch % Lines
zetaclient/metrics/telemetry.go 0.00% 47 Missing ⚠️
Additional details and impacted files

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #3041      +/-   ##
===========================================
- Coverage    63.38%   63.30%   -0.09%     
===========================================
  Files          422      422              
  Lines        29915    29954      +39     
===========================================
  Hits         18963    18963              
- Misses       10111    10150      +39     
  Partials       841      841              
Files with missing lines Coverage Δ
zetaclient/metrics/metrics.go 75.00% <ø> (ø)
zetaclient/tss/tss_signer.go 21.42% <ø> (+0.04%) ⬆️
zetaclient/metrics/telemetry.go 0.00% <0.00%> (ø)

@brewmaster012 brewmaster012 marked this pull request as ready for review October 25, 2024 02:51
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🧹 Outside diff range and nitpick comments (3)
zetaclient/tss/tss_signer.go (3)

Line range hint 246-267: Enhance error handling in Sign and SignBatch methods.

Both methods have similar error handling patterns that could be improved:

  1. Error messages contain typos ("signuature")
  2. Some error conditions lack specific error types

Apply these improvements:

-return [65]byte{}, fmt.Errorf("signuature verification fail")
+return [65]byte{}, fmt.Errorf("signature verification failed: %w", err)

Consider creating custom error types for better error handling:

var (
    ErrSignatureVerification = errors.New("signature verification failed")
    ErrSignatureDecoding    = errors.New("signature decoding failed")
)

Also applies to: 349-370


Line range hint 258-267: Refactor signature verification logic.

The signature verification code is duplicated between Sign and SignBatch methods.

Extract the common verification logic into a helper function:

func verifyAndDecodeSig(signature keysign.Signature, digest []byte, pubkey []byte) ([65]byte, error) {
    var sigBytes [65]byte
    if err := decodeSignatureParts(&sigBytes, signature); err != nil {
        return [65]byte{}, fmt.Errorf("failed to decode signature: %w", err)
    }
    if err := verifySignatureBytes(sigBytes, digest, pubkey); err != nil {
        return [65]byte{}, fmt.Errorf("failed to verify signature: %w", err)
    }
    return sigBytes, nil
}

Also applies to: 361-370


Line range hint 233-242: Consider implementing retry mechanism for failed signatures.

Both Sign and SignBatch methods post blame data but don't attempt to retry the signing operation.

Consider implementing a retry mechanism with exponential backoff for transient failures:

  1. Add retry configuration parameters
  2. Implement exponential backoff
  3. Only retry for specific error conditions
  4. Add metrics for retry attempts

Would you like me to provide a detailed implementation of this retry mechanism?

Also applies to: 336-345

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between b334267 and 84a4b29.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (3)
  • cmd/zetaclientd/start.go (3 hunks)
  • go.mod (3 hunks)
  • zetaclient/tss/tss_signer.go (2 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
cmd/zetaclientd/start.go (1)

Pattern **/*.go: Review the Go code, point out issues relative to principles of clean code, expressiveness, and performance.

zetaclient/tss/tss_signer.go (1)

Pattern **/*.go: Review the Go code, point out issues relative to principles of clean code, expressiveness, and performance.

🪛 GitHub Check: codecov/patch
zetaclient/tss/tss_signer.go

[warning] 189-189: zetaclient/tss/tss_signer.go#L189
Added line #L189 was not covered by tests

🔇 Additional comments (4)
cmd/zetaclientd/start.go (1)

220-220: LGTM: TSS server setup with whitelisted peers.

The implementation correctly integrates the whitelisted peers with the TSS server setup, aligning with the PR objective to replace DHT with private peer discovery.

go.mod (3)

244-244: LGTM: Direct dependency on base58 is appropriate.

The change from indirect to direct dependency on github.com/mr-tron/base58 is correct as it's now explicitly used in the codebase.


371-372: LGTM: Updated go-tss dependency.

The replacement of gitlab.com/thorchain/tss/go-tss with a specific commit from the ZetaChain fork aligns with the PR objective of implementing private peer discovery.


337-337: Verify compatibility with go-libp2p.

The addition of github.com/libp2p/go-libp2p-core v0.20.1 might need verification as this version could be outdated relative to the replaced go-libp2p dependency.

Consider evaluating if this dependency can be replaced with the newer github.com/libp2p/go-libp2p interfaces, as go-libp2p-core might be deprecated in future releases.

✅ Verification successful

Let me gather more information about the libp2p dependencies to verify compatibility.


Let me check the actual usage of these dependencies in the code to make a more informed decision.


Dependency version is compatible with current implementation

The codebase shows no direct usage of go-libp2p-core imports. Instead, it correctly uses the newer github.com/libp2p/go-libp2p/core/* path structure, as evidenced by imports like github.com/libp2p/go-libp2p/core/peer and github.com/libp2p/go-libp2p/core/crypto. The go-libp2p-core dependency is likely a transitive dependency required by other modules and its version (v0.20.1) is compatible with the custom fork of go-libp2p being used (github.com/zeta-chain/go-libp2p).

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check if the libp2p-core version is compatible with the replaced go-libp2p version

# Test: Search for version constraints in go-libp2p
rg -A 5 "go-libp2p-core" "go.mod"

Length of output: 157


Script:

#!/bin/bash
# Check for go-libp2p version and any replace directives
rg "github.com/libp2p/go-libp2p " go.mod

# Check for any replace directives related to libp2p
rg "^replace.*libp2p" go.mod

# Search for actual usage of libp2p-core in the codebase
rg -l "github.com/libp2p/go-libp2p-core"

# Check if there are any go.sum entries for these dependencies
rg "github.com/libp2p/go-libp2p(-core)? v" go.sum

Length of output: 501


Script:

#!/bin/bash
# Search for actual imports and usage of go-libp2p-core
rg -A 3 "github.com/libp2p/go-libp2p-core" --type go

# Check for any imports of the newer go-libp2p interfaces that might overlap
rg -A 3 "github.com/libp2p/go-libp2p" --type go | grep -v "go-libp2p-core"

# Look for any migration guides or deprecation notices in the repository
rg -i "migrate|deprecat.*libp2p" -g "*.md"

Length of output: 3325

cmd/zetaclientd/start.go Outdated Show resolved Hide resolved
cmd/zetaclientd/start.go Outdated Show resolved Hide resolved
cmd/zetaclientd/start.go Outdated Show resolved Hide resolved
zetaclient/tss/tss_signer.go Outdated Show resolved Hide resolved
zetaclient/tss/tss_signer.go Outdated Show resolved Hide resolved
@brewmaster012 brewmaster012 marked this pull request as draft October 25, 2024 04:04
zetaclient/metrics/telemetry.go Outdated Show resolved Hide resolved
zetaclient/metrics/telemetry.go Show resolved Hide resolved
Copy link
Contributor

@fbac fbac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, left a question to clarify.
Also, ideally we would fix the couple points mentioned by @lumtis to have a cohesive code base.

cmd/zetaclientd/start.go Outdated Show resolved Hide resolved
cmd/zetaclientd/start.go Outdated Show resolved Hide resolved
@brewmaster012 brewmaster012 added the PERFORMANCE_TESTS Run make start-e2e-performance-test label Oct 25, 2024
cmd/zetaclientd/start.go Show resolved Hide resolved
zetaclient/tss/tss_signer.go Outdated Show resolved Hide resolved
Copy link
Member

@gartnera gartnera left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please update/delete cmd/zetaclient/p2p_diagnostics.go and remove github.com/libp2p/go-libp2p-kad-dht imports.

Copy link
Member

@gartnera gartnera left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly LGTM

cmd/zetaclientd/start.go Show resolved Hide resolved
cmd/zetaclientd/start.go Show resolved Hide resolved
@brewmaster012
Copy link
Collaborator Author

@gartnera : do you have any idea why the tss migration e2e test failed in this PR?
Is it due to changes in this PR or the test itself is flaky?

Copy link

github-actions bot commented Nov 2, 2024

!!!WARNING!!!
nosec detected in the following files: cmd/zetaclientd/start.go

Be very careful about using #nosec in code. It can be a quick way to suppress security warnings and move forward with development, it should be employed with caution. Suppressing warnings with #nosec can hide potentially serious vulnerabilities. Only use #nosec when you're absolutely certain that the security issue is either a false positive or has been mitigated in another way.

Only suppress a single rule (or a specific set of rules) within a section of code, while continuing to scan for other problems. To do this, you can list the rule(s) to be suppressed within the #nosec annotation, e.g: /* #nosec G401 */ or //#nosec G201 G202 G203
Broad #nosec annotations should be avoided, as they can hide other vulnerabilities. The CI will block you from merging this PR until you remove #nosec annotations that do not target specific rules.

Pay extra attention to the way #nosec is being used in the files listed above.

@github-actions github-actions bot added the nosec label Nov 2, 2024
@gartnera
Copy link
Member

gartnera commented Nov 4, 2024

@gartnera : do you have any idea why the tss migration e2e test failed in this PR? Is it due to changes in this PR or the test itself is flaky?

Looks like it's been failing in the nightlies for a few days

@gartnera gartnera removed the TSS_MIGRATION_TESTS Run TSS migration tests label Nov 4, 2024
@brewmaster012
Copy link
Collaborator Author

@gartnera : do you have any idea why the tss migration e2e test failed in this PR? Is it due to changes in this PR or the test itself is flaky?

Looks like it's been failing in the nightlies for a few days

Don't think it affects migration of TSS as keygen should be handled correctly.
Will ignore this test in this PR for now.

@brewmaster012 brewmaster012 added this pull request to the merge queue Nov 6, 2024
Merged via the queue into develop with commit a1fe36d Nov 6, 2024
38 of 39 checks passed
@brewmaster012 brewmaster012 deleted the dht-removal branch November 6, 2024 17:14
gartnera pushed a commit that referenced this pull request Nov 13, 2024
* import go-tss lib that removes DHT

* replace DHT with authenticated discovery

* use JSON serialization; add metric

* add new telemetry: 8123/connectedpeers; fix deadlock

in a few other 8123 handlers

* use squashed go-tss commit

* clean up interface

* address review comments

* remove whiteliste peers

* fmt

* remove rendezvous

* use merged go-tss connection gater

* use latest go-tss from PR#34

* use merged #34 in go-tss lib

* add ping RTT to telemetry

* changelog

* make linter happy

* pingrtt

* finer resolution on pingrtt time (milliseconds => nanoseconds)

* removed comments

* bump go-tss to the merged commit in master branch

* revert back to the go-tss commit

until the PR commit is merged.

---------

Co-authored-by: pwu <armiuswu@gmail.com>
gartnera added a commit that referenced this pull request Nov 14, 2024
* feat: whitelist connection gater (#3028)

* bump go tss to remove dht

* add whitelist fields

* disable whitelist for localnet

* bump go-tss

* resolve whitelisted peers wip

* dont disable whitelist in e2e tests

* cleanup whitelist fields from config and fix e2e tests

* bump go-tss

* cleanup

* bump go tss

* use node accounts to get whitelisted peers

* bump go tss

* changelog

* fix unit test

* bump go tss

* remove usage of pointers in node accounts

* fix unit test

* revert back to using keygen for whitelist peers

---------

Co-authored-by: Alex Gartner <alexg@zetachain.com>

* fix: replace DHT with private peer discovery (#3041)

* import go-tss lib that removes DHT

* replace DHT with authenticated discovery

* use JSON serialization; add metric

* add new telemetry: 8123/connectedpeers; fix deadlock

in a few other 8123 handlers

* use squashed go-tss commit

* clean up interface

* address review comments

* remove whiteliste peers

* fmt

* remove rendezvous

* use merged go-tss connection gater

* use latest go-tss from PR#34

* use merged #34 in go-tss lib

* add ping RTT to telemetry

* changelog

* make linter happy

* pingrtt

* finer resolution on pingrtt time (milliseconds => nanoseconds)

* removed comments

* bump go-tss to the merged commit in master branch

* revert back to the go-tss commit

until the PR commit is merged.

---------

Co-authored-by: pwu <armiuswu@gmail.com>

* skip depositor fee calculation on irrelevant txs

* update changelog

---------

Co-authored-by: skosito <skostic9242@gmail.com>
Co-authored-by: brewmaster012 <88689859+brewmaster012@users.noreply.github.com>
Co-authored-by: pwu <armiuswu@gmail.com>
Co-authored-by: Charlie Chen <charliec@zetachain.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking:cli nosec PERFORMANCE_TESTS Run make start-e2e-performance-test UPGRADE_LIGHT_TESTS Run make start-upgrade-test-light
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants