Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
fix: reduce calls to MinNamespace & MaxNamespace (#285)
In this issue #216, it was raised that 1. More time was spent allocating than hashing in the HashLeaf function 2. The min and max namespace functions were taking up a large amount of RAM --- ## More time was spent allocating than hashing in the HashLeaf function The benchmark test described in the issue listed above no longer exists, and when I run the `BenchmarkComputeRoot` in the nmt package [HERE](https://github.com/bruscar010/nmt/blob/main/nmt_test.go#L696) I can't replicate memory allocation taking longer than the hashing, although this may be related to my test parameters. Below are FlameGraphs of CPU/Mem usage (I've updated the Benchmark to have 20k leaf nodes with 512 bytes transaction size) [Base Mem Flamegraph](https://flamegraph.com/share/6a04ffab-c866-11ef-9832-26c3e5347170) [Base CPU Flamegraph](https://flamegraph.com/share/efce1d16-c868-11ef-b027-3a58f36d25ce) ``` BenchmarkComputeRoot/64-leaves-12 100 141785 ns/op 65166 B/op 1535 allocs/op BenchmarkComputeRoot/128-leaves-12 100 280652 ns/op 123894 B/op 3074 allocs/op BenchmarkComputeRoot/256-leaves-12 100 501392 ns/op 254611 B/op 6154 allocs/op ``` I've instead focused on the second issue --- ## The min and max namespace functions were taking up a large amount of RAM ### Solution 1: Reduce the number of calls to MinNamespace & MaxNamespace The HashNode function made 4 duplicate function calls to MaxNamespace/MinNamespace for the same node. ``` 1. HashNode -> ValidateNodes -> ValidateNodeFormat.Min/MaxNamespace 2. HashNode -> ValidateNodes -> validateSiblingsNamespaceOrder -> ValidateNodeFormat(left|right).Min/MaxNamespace 3. HashNode -> ValidateNodes -> validateSiblingsNamespaceOrder.Min/MaxNamespace 4. HashNode.Min/MaxNamespace ``` After reducing to a single call, here is the new benchmark output (running `go test -run="none" -bench=. -benchtime=100x -benchmem`) ``` BenchmarkComputeRoot/64-leaves-12 100 137292 ns/op 60100 B/op 905 allocs/op BenchmarkComputeRoot/128-leaves-12 100 237664 ns/op 113758 B/op 1804 allocs/op BenchmarkComputeRoot/256-leaves-12 100 462802 ns/op 234241 B/op 3604 allocs/op ``` ### Solution 2: Update the MinNamespace/MaxNamespace to return a slice of the existing array rather than creating a new underlying array Copying the slice stops the possibility [array capacity memory leaks](https://itnext.io/mastering-memory-management-in-go-avoiding-slice-related-leaks-7d5e97ee48d4) if for some reason the namespace ID variable's lifetime is longer than the lifetime of the original slice, but I don't think that is the case. [Slicing with a max capacity also stops the potential for appends to the namespace ID slice editing the underlying](https://build-your-own.org/blog/20230316_go_full_slice/) `ns|ns|hash` array. Benchmark output after this change ``` BenchmarkComputeRoot/64-leaves-12 100 136436 ns/op 58180 B/op 653 allocs/op BenchmarkComputeRoot/128-leaves-12 100 241836 ns/op 109747 B/op 1297 allocs/op BenchmarkComputeRoot/256-leaves-12 100 470231 ns/op 225924 B/op 2583 allocs/op ``` [New Mem Flamegraph](https://flamegraph.com/share/71b8ef18-c86a-11ef-b027-3a58f36d25ce) [New CPU Flamegraph](https://flamegraph.com/share/7f46d1b4-c86a-11ef-9832-26c3e5347170) --- The improvements made by the first solution are made redundant by the second, but I was initially reluctant to replace the slice copy in case it would lead to a memory leak. As I got more familiar with the package, I became more confident that a slicing the array may be more appropriate. **Note** - Given that validation is done in the `Push` method, adding an `UnsafeHashNode` that skips validation was also a method I considered, but I felt that this could lead to more complications & would not make any notable differences compared to either of the other solutions. - After reworking the `ValidateSiblings` function, I also removed the associated tests, but it looks like the same test cases also exist in `TestValidateNodes` [HERE](https://github.com/bruscar010/nmt/blob/main/hasher_test.go#L684) <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **New Features** - Enhanced namespace validation methods with improved error reporting and context. - Introduced `nsIDRange` struct for better namespace ID range management. - **Refactor** - Streamlined namespace-related functions in `MinNamespace` and `MaxNamespace`. - Improved code readability and maintainability in namespace merkle tree hasher. - **Tests** - Added a benchmark for `ComputeRoot` with 20,000 leaves. - Removed existing sibling node validation test. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Co-authored-by: francis <28549164+FrancisLennon17@users.noreply.github.com>
- Loading branch information