Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize ReadCacheLookup, 6.7 speed-up #517

Merged
merged 4 commits into from
Jan 4, 2025
Merged

Optimize ReadCacheLookup, 6.7 speed-up #517

merged 4 commits into from
Jan 4, 2025

Conversation

arvidn
Copy link
Contributor

@arvidn arvidn commented Dec 20, 2024

This is a major hot-spot when serializing with back references. This patch speeds it up by:

  • pre-allocating hash maps and hash sets
  • using IdentityHash, to avoid hashing the tree-hashes again
  • exiting early if the structure we're trying to deduplicate is too small (i.e. trade-off compression in favor of speed)
  • use BitVec instead of Vec<u8> to store paths (which only contain 0 and 1 anyway).

Compared to main, the relevant benchmarks indicate a time reduction of between 80-90%, of serializing clvm with back references.

$ cargo bench node_to_bytes_backrefs -- --baseline main
     Running benches/serialize.rs (target/release/deps/serialize-21b0d813d80dbf8a)
Benchmarking serialize/node_to_bytes_backrefs 0: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 41.6s, or reduce sample count to 10.
serialize/node_to_bytes_backrefs 0
                        time:   [417.23 ms 420.23 ms 423.98 ms]
                        change: [-85.682% -85.544% -85.390%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 11 outliers among 100 measurements (11.00%)
  4 (4.00%) high mild
  7 (7.00%) high severe
Benchmarking serialize/node_to_bytes_backrefs 1: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 48.2s, or reduce sample count to 10.
serialize/node_to_bytes_backrefs 1
                        time:   [478.85 ms 481.87 ms 485.32 ms]
                        change: [-91.144% -91.077% -91.007%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 19 outliers among 100 measurements (19.00%)
  8 (8.00%) high mild
  11 (11.00%) high severe

One concern is that, not mixing in any random state in the hasher, opens up for hash-collision attacks. I'm not sure if it's worth addressing or what the best way would be. Maybe add a random value to the IdentityHash object, and add it to the hash value.

@arvidn arvidn force-pushed the optimize-compression branch from a464e4d to 8671a04 Compare December 20, 2024 10:36
@arvidn arvidn requested a review from richardkiss December 20, 2024 10:39
Copy link

coveralls-official bot commented Dec 20, 2024

Pull Request Test Coverage Report for Build 12546252319

Details

  • 64 of 66 (96.97%) changed or added relevant lines in 2 files are covered.
  • 1 unchanged line in 1 file lost coverage.
  • Overall coverage decreased (-0.01%) to 93.599%

Changes Missing Coverage Covered Lines Changed/Added Lines %
src/serde/identity_hash.rs 17 19 89.47%
Files with Coverage Reduction New Missed Lines %
src/serde/read_cache_lookup.rs 1 98.51%
Totals Coverage Status
Change from base Build 12517795391: -0.01%
Covered Lines: 5878
Relevant Lines: 6280

💛 - Coveralls

richardkiss
richardkiss previously approved these changes Dec 20, 2024
Copy link
Contributor

@richardkiss richardkiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. I'd be curious to know if the u8 -> bool change does anything. But it seems at worst harmless.

@arvidn
Copy link
Contributor Author

arvidn commented Dec 28, 2024

There are still some worst-case scenarios where this isn't fast enough to serialize a full block. I'm working on a more ambitions PR to speed this up more.

@arvidn
Copy link
Contributor Author

arvidn commented Dec 29, 2024

a material portion of the speed-up in this patch comes from not using a salt in the hasher. Which is a bit surprising, but I suppose we initialize new HashSet and HashMaps quite often. adding back the salt makes the benchmarks take 3x the time I wasn't doing it right. fixed now. I've added two new commits

@arvidn arvidn force-pushed the optimize-compression branch from fbb6236 to 11e100e Compare December 30, 2024 11:56
@arvidn arvidn force-pushed the optimize-compression branch from 11e100e to 32bed63 Compare December 30, 2024 12:03
@arvidn arvidn requested a review from richardkiss December 30, 2024 13:26
Copy link
Contributor

@richardkiss richardkiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good

@arvidn arvidn merged commit 70d7623 into main Jan 4, 2025
29 checks passed
@arvidn arvidn deleted the optimize-compression branch January 4, 2025 00:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants