-
Notifications
You must be signed in to change notification settings - Fork 45
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: reduce calls to MinNamespace & MaxNamespace #285
Conversation
WalkthroughThe pull request introduces significant modifications to the namespace merkle tree (NMT) implementation, focusing on improving validation methods in the Changes
Sequence DiagramsequenceDiagram
participant Hasher
participant Node
participant NSRange
Hasher->>Node: tryFetchNodeNSRange
Node-->>NSRange: Extract namespace ID range
Hasher->>Node: tryFetchLeftAndRightNSRanges
Node-->>NSRange: Validate namespace ranges
Hasher->>Hasher: Compute hash
Possibly related PRs
Suggested labels
Poem
Finishing Touches
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (3)
hasher.go (3)
215-217
: Add documentation for the newnsIDRange
struct.Introducing a comment explaining the purpose and usage of the
nsIDRange
struct will enhance code readability and maintainability.Apply this diff to add documentation:
+ // nsIDRange represents the range of namespace IDs with minimum and maximum values. type nsIDRange struct { Min, Max namespace.ID }
263-265
: Clarify the error message in namespace ordering validation.The condition checks if
rNsRange.Min.Less(lNsRange.Max)
, implying that the right child's minimum namespace ID is less than the left child's maximum namespace ID. However, the error message states that the maximum namespace of the left child is greater than the minimum namespace of the right child, which might be confusing. Rephrasing the error message can improve clarity.Apply this diff to rephrase the error message:
- err = fmt.Errorf("%w: the maximum namespace of the left child %x is greater than the min namespace of the right child %x", ErrUnorderedSiblings, lNsRange.Max, rNsRange.Min) + err = fmt.Errorf("%w: the min namespace ID of the right child %x is less than the max namespace ID of the left child %x", ErrUnorderedSiblings, rNsRange.Min, lNsRange.Max)This adjustment aligns the error message with the condition and enhances understanding.
294-295
: Handle potential nil error inHashNode
.After calling
n.tryFetchLeftAndRightNSRanges(left, right)
, it's good practice to check for a non-nil error before using the returned values to prevent unexpected behavior.Ensure that the error handling is complete, although in this case, the error is already being checked. No action required if the current implementation is satisfactory.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
hasher.go
(3 hunks)hasher_test.go
(0 hunks)nmt.go
(1 hunks)nmt_test.go
(1 hunks)
💤 Files with no reviewable changes (1)
- hasher_test.go
✅ Files skipped from review due to trivial changes (2)
- nmt_test.go
- nmt.go
🔇 Additional comments (2)
hasher.go (2)
157-159
: Verify the change inEmptyRoot
method regarding namespace length multiplication.The
emptyNs
variable now repeats zero bytes twice the namespace length (int(n.NamespaceLen)*2
), which differs from the previous implementation. Ensure that this change is intentional and that it correctly reflects the expected format for an empty root in the NMT. This modification may affect any functionality relying on the size ofemptyNs
.
232-232
:⚠️ Potential issueCorrect formatting verbs in error message.
The
fmt.Errorf
call uses%d
formaxNID
andminNID
, which are of typenamespace.ID
(likely a byte slice). Use%x
to properly format byte slices in hexadecimal representation.Apply this diff to fix the formatting verbs:
- return nsIDRange{}, fmt.Errorf("%w: max namespace ID %d is less than min namespace ID %d ", ErrInvalidNodeNamespaceOrder, maxNID, minNID) + return nsIDRange{}, fmt.Errorf("%w: max namespace ID %x is less than min namespace ID %x", ErrInvalidNodeNamespaceOrder, maxNID, minNID)Likely invalid or redundant comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR, this was exceptional work! I really appreciate the call out on removing the test and the detailed PR description.
hasher.go
Outdated
} | ||
|
||
// ValidateNodeFormat checks whether the supplied node conforms to the | ||
// namespaced hash format and returns ErrInvalidNodeLen if not. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: this function could return ErrInvalidNodeNamespaceOrder too
// namespaced hash format and returns ErrInvalidNodeLen if not. | |
// namespaced hash format and returns an error if not. |
hasher.go
Outdated
) { | ||
lNsRange, err = n.tryFetchNodeNSRange(left) | ||
if err != nil { | ||
return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
uber nit: we should have a linter for this but do you mind avoiding "naked" returns? IMO they're slightly less readable (especially in long function bodies) than explicit returns.
return | |
return nsIDRange{}, nsIDRange{}, err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FLUP: #286
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's another instance in this function body that could also use the explicit return
nmt_test.go
Outdated
@@ -704,6 +704,7 @@ func BenchmarkComputeRoot(b *testing.B) { | |||
{"64-leaves", 64, 8, 256}, | |||
{"128-leaves", 128, 8, 256}, | |||
{"256-leaves", 256, 8, 256}, | |||
//{"20k-leaves", 20000, 8, 512}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: it only takes 6 seconds to run this benchmark with 20k leaves enabled so perhaps we uncomment this and include it
//{"20k-leaves", 20000, 8, 512}, | |
{"20k-leaves", 20000, 8, 512}, |
@rootulp @cristaloleg Thanks for the review 🙌 I've updated this PR |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
hasher.go (2)
219-234
: Enhance error message with hex formatting.Consider using hex formatting for namespace IDs in the error message to maintain consistency with other error messages in the codebase.
- return nsIDRange{}, fmt.Errorf("%w: max namespace ID %d is less than min namespace ID %d ", ErrInvalidNodeNamespaceOrder, maxNID, minNID) + return nsIDRange{}, fmt.Errorf("%w: max namespace ID %x is less than min namespace ID %x", ErrInvalidNodeNamespaceOrder, maxNID, minNID)
244-271
: LGTM! Well-structured validation with clear error messages.The method properly validates namespace ranges and ordering of sibling nodes. Consider adding a comment explaining why we check
rNsRange.Min.Less(lNsRange.Max)
specifically.Add a comment above line 267:
// Ensure there's no overlap between left and right namespace ranges // by checking if right's minimum is less than left's maximum
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
hasher.go
(3 hunks)nmt_test.go
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- nmt_test.go
🔇 Additional comments (3)
hasher.go (3)
158-158
: LGTM! Efficient memory pre-allocation.The change to use
make
with a pre-calculated size improves memory efficiency by avoiding potential slice growth and reallocation.
214-217
: LGTM! Clean struct definition.The
nsIDRange
struct provides a clear and type-safe way to handle namespace ID ranges.
237-242
: LGTM! Excellent validation refactoring.The validation logic has been effectively centralized and simplified:
- Reduces code duplication
- Improves error handling with structured types
- Aligns with PR objectives by reducing function calls
Also applies to: 279-280, 297-307
17fcd37
to
6cb4212
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
hasher.go (2)
219-222
: Improve error documentation.The comment should explicitly mention all possible error types that can be returned.
-// It will return an ErrInvalidNodeLen | ErrInvalidNodeNamespaceOrder -// if the supplied node does not conform to the namespaced hash format. +// Returns: +// - ErrInvalidNodeLen if the node length is incorrect +// - ErrInvalidNodeNamespaceOrder if the namespace IDs are not properly ordered +// - nil if the node conforms to the namespaced hash format
257-263
: Consider early return pattern for better readability.The error handling could be simplified using early returns.
- lNsRange, err = n.tryFetchNodeNSRange(left) - if err != nil { - return lNsRange, rNsRange, err - } - rNsRange, err = n.tryFetchNodeNSRange(right) - if err != nil { - return lNsRange, rNsRange, err - } + if lNsRange, err = n.tryFetchNodeNSRange(left); err != nil { + return nsIDRange{}, nsIDRange{}, err + } + if rNsRange, err = n.tryFetchNodeNSRange(right); err != nil { + return nsIDRange{}, nsIDRange{}, err + }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
hasher.go
(3 hunks)nmt_test.go
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- nmt_test.go
🔇 Additional comments (3)
hasher.go (3)
158-158
: LGTM! Efficient memory allocation.The pre-allocation of the slice with the exact required size is a good optimization.
214-217
: LGTM! Good encapsulation of namespace range.The
nsIDRange
struct provides a clean abstraction for handling namespace ID ranges.
297-307
: LGTM! Improved validation and namespace range computation.The refactoring successfully reduces calls to MinNamespace/MaxNamespace while maintaining clear validation logic. The use of the new
nsIDRange
struct and helper methods improves code organization and readability.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, great work!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SGTM, thanks!
In this issue #216, it was raised that
More time was spent allocating than hashing in the HashLeaf function
The benchmark test described in the issue listed above no longer exists, and when I run the
BenchmarkComputeRoot
in the nmt package HERE I can't replicate memory allocation taking longer than the hashing, although this may be related to my test parameters.Below are FlameGraphs of CPU/Mem usage (I've updated the Benchmark to have 20k leaf nodes with 512 bytes transaction size)
Base Mem Flamegraph
Base CPU Flamegraph
I've instead focused on the second issue
The min and max namespace functions were taking up a large amount of RAM
Solution 1: Reduce the number of calls to MinNamespace & MaxNamespace
The HashNode function made 4 duplicate function calls to MaxNamespace/MinNamespace for the same node.
After reducing to a single call, here is the new benchmark output (running
go test -run="none" -bench=. -benchtime=100x -benchmem
)Solution 2: Update the MinNamespace/MaxNamespace to return a slice of the existing array rather than creating a new underlying array
Copying the slice stops the possibility array capacity memory leaks if for some reason the namespace ID variable's lifetime is longer than the lifetime of the original slice, but I don't think that is the case. Slicing with a max capacity also stops the potential for appends to the namespace ID slice editing the underlying
ns|ns|hash
array.Benchmark output after this change
New Mem Flamegraph
New CPU Flamegraph
The improvements made by the first solution are made redundant by the second, but I was initially reluctant to replace the slice copy in case it would lead to a memory leak. As I got more familiar with the package, I became more confident that a slicing the array may be more appropriate.
Note
Push
method, adding anUnsafeHashNode
that skips validation was also a method I considered, but I felt that this could lead to more complications & would not make any notable differences compared to either of the other solutions.ValidateSiblings
function, I also removed the associated tests, but it looks like the same test cases also exist inTestValidateNodes
HERESummary by CodeRabbit
New Features
nsIDRange
struct for better namespace ID range management.Refactor
MinNamespace
andMaxNamespace
.Tests
ComputeRoot
with 20,000 leaves.