Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace hacky version of calculate_max_depth with simpler one. #2320

Merged
merged 4 commits into from
Mar 18, 2024
Merged
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
56 changes: 6 additions & 50 deletions libraries/chain/include/eosio/chain/incremental_merkle.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -3,65 +3,21 @@
#include <eosio/chain/merkle.hpp>
#include <fc/io/raw.hpp>

namespace eosio { namespace chain {
namespace eosio::chain {

namespace detail {

/**
* given an unsigned integral number return the smallest
* power-of-2 which is greater than or equal to the given number
*
* @param value - an unsigned integral
* @return - the minimum power-of-2 which is >= value
*/
constexpr uint64_t next_power_of_2(uint64_t value) {
value -= 1;
value |= value >> 1;
value |= value >> 2;
value |= value >> 4;
value |= value >> 8;
value |= value >> 16;
value |= value >> 32;
value += 1;
return value;
}

/**
* Given a power-of-2 (assumed correct) return the number of leading zeros
*
* This is a classic count-leading-zeros in parallel without the necessary
* math to make it safe for anything that is not already a power-of-2
*
* @param value - and integral power-of-2
* @return the number of leading zeros
*/
constexpr int clz_power_2(uint64_t value) {
int lz = 64;

if (value) lz--;
if (value & 0x00000000FFFFFFFFULL) lz -= 32;
if (value & 0x0000FFFF0000FFFFULL) lz -= 16;
if (value & 0x00FF00FF00FF00FFULL) lz -= 8;
if (value & 0x0F0F0F0F0F0F0F0FULL) lz -= 4;
if (value & 0x3333333333333333ULL) lz -= 2;
if (value & 0x5555555555555555ULL) lz -= 1;

return lz;
}

/**
* Given a number of nodes return the depth required to store them
* in a fully balanced binary tree.
*
* @param node_count - the number of nodes in the implied tree
* @return the max depth of the minimal tree that stores them
*/
constexpr int calcluate_max_depth(uint64_t node_count) {
if (node_count == 0) {
constexpr uint64_t calculate_max_depth(uint64_t node_count) {
if (node_count == 0)
return 0;
}
auto implied_count = next_power_of_2(node_count);
return clz_power_2(implied_count) + 1;
return std::llround(std::ceil(std::log2(node_count))) + 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not so sure about using floating point math for this. How about,

Suggested change
return std::llround(std::ceil(std::log2(node_count))) + 1;
return 8*sizeof(node_count) - std::countl_zero(std::bit_ceil(node_count));

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great! Even better!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually even better is return std::bit_width(std::bit_ceil(node_count))

Copy link
Contributor Author

@greg7mdp greg7mdp Mar 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interestingly this return std::countr_zero(std::bit_ceil(node_count)) + 1; is slower than the countl version.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm surprised there'd be much of a difference if you've got access to lzcnt (either through -mlzcnt or something like -march=skylake) since they pretty much compile down to the same handful of instructions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, no difference.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually even better is return std::bit_width(std::bit_ceil(node_count))

using this!

}

template<typename ContainerA, typename ContainerB>
Expand Down Expand Up @@ -167,7 +123,7 @@ class incremental_merkle_impl {
*/
const DigestType& append(const DigestType& digest) {
bool partial = false;
auto max_depth = detail::calcluate_max_depth(_node_count + 1);
auto max_depth = detail::calculate_max_depth(_node_count + 1);
auto current_depth = max_depth - 1;
auto index = _node_count;
auto top = digest;
Expand Down Expand Up @@ -246,6 +202,6 @@ class incremental_merkle_impl {
typedef incremental_merkle_impl<digest_type> incremental_merkle;
typedef incremental_merkle_impl<digest_type,shared_vector> shared_incremental_merkle;

} } /// eosio::chain
} /// eosio::chain

FC_REFLECT( eosio::chain::incremental_merkle, (_active_nodes)(_node_count) );
Loading