Replace hacky version of calculate_max_depth with simpler one. #2320

Merged
merged 4 commits into from
Mar 18, 2024

greg7mdp
Contributor

@greg7mdp greg7mdp commented Mar 18, 2024

In PR #2314, I asked why we were using overly complex code to do what was basically a log base 2.
Gnome mentioned that the original code was in incremental_merkle_tree.hpp.

So I propose this simplification. The replacement produces the same result as the previous implementation for every node_count from 0 up to, but not including, 100,000,000, as the following test program demonstrates:

#include <cinttypes>
#include <cmath>
#include <cstdint>
#include <cstdio>
#include <cstdlib>

// Previous implementation, step 1: round up to the next power of 2 by
// copying the highest set bit into every lower position, then adding 1.
constexpr uint64_t next_power_of_2(uint64_t value) {
   value -= 1;
   value |= value >> 1;
   value |= value >> 2;
   value |= value >> 4;
   value |= value >> 8;
   value |= value >> 16;
   value |= value >> 32;
   value += 1;
   return value;
}

// Previous implementation, step 2. Despite the name, for a power-of-2 input
// this returns the index of the single set bit (1 -> 0, 2 -> 1, 4 -> 2, ...).
constexpr uint64_t clz_power_2(uint64_t value) {
   int lz = 64;

   if (value) lz--;
   if (value & 0x00000000FFFFFFFFULL) lz -= 32;
   if (value & 0x0000FFFF0000FFFFULL) lz -= 16;
   if (value & 0x00FF00FF00FF00FFULL) lz -= 8;
   if (value & 0x0F0F0F0F0F0F0F0FULL) lz -= 4;
   if (value & 0x3333333333333333ULL) lz -= 2;
   if (value & 0x5555555555555555ULL) lz -= 1;

   return lz;
}

// Previous implementation.
constexpr uint64_t calculate_max_depth(uint64_t node_count) {
   if (node_count == 0) {
      return 0;
   }
   auto implied_count = next_power_of_2(node_count);
   return clz_power_2(implied_count) + 1;
}

// Proposed replacement.
static uint64_t calculate_max_depth2(uint64_t node_count) {
   if (node_count == 0)
      return 0;
   return std::llround(std::ceil(std::log2(node_count))) + 1;
}

int main()
{
   // Compare both implementations for every node_count below 100,000,000.
   for (uint64_t i = 0; i < 100'000'000; ++i)
      if (auto a = calculate_max_depth(i), b = calculate_max_depth2(i); a != b) {
         printf("fails for i=%" PRIu64 " (%" PRIu64 " != %" PRIu64 ")\n", i, a, b);
         return EXIT_FAILURE;   // report the mismatch through the exit status
      }
   printf("success\n");
   return 0;
}

   }
-  auto implied_count = next_power_of_2(node_count);
-  return clz_power_2(implied_count) + 1;
+  return std::llround(std::ceil(std::log2(node_count))) + 1;
Member

I am not so sure about using floating point math for this. How about,

Suggested change
-  return std::llround(std::ceil(std::log2(node_count))) + 1;
+  return 8*sizeof(node_count) - std::countl_zero(std::bit_ceil(node_count));

Contributor Author

Great! Even better!

Member

actually even better is `return std::bit_width(std::bit_ceil(node_count))`

Contributor Author

@greg7mdp greg7mdp Mar 18, 2024

Interestingly, this `return std::countr_zero(std::bit_ceil(node_count)) + 1;` is slower than the countl version.

Member

I'm surprised there'd be much of a difference if you've got access to lzcnt (either through -mlzcnt or something like -march=skylake) since they pretty much compile down to the same handful of instructions.

Contributor Author

You're right, no difference.

Contributor Author

> actually even better is `return std::bit_width(std::bit_ceil(node_count))`

Using this!

@ericpassmore
Contributor

Note:start
group: STABILITY
category: INTERNALS
summary: Simplification of merkle tree max depth calculation.
Note:end

@greg7mdp greg7mdp merged commit 3868b80 into main Mar 18, 2024
34 checks passed
@greg7mdp greg7mdp deleted the incr_merkle branch March 18, 2024 21:54
4 participants