
C++: Use little-endian load for std::hash #561

Merged: 2 commits, Feb 16, 2021

Conversation

@chfast (Member) commented Nov 2, 2020

This replaces the big-endian loads with little-endian loads in the hash functions for evmc::address and evmc::bytes32.
The performance improvements are significant:

hash_<evmc::bytes32, hash<evmc::bytes32>>_mean                           -0.2973         -0.2973          2335          1641          2335          1641
hash_<evmc::bytes32, noinline_hash<evmc::bytes32>>_mean                  -0.1559         -0.1559          3045          2571          3045          2571
hash_<evmc::address, hash<evmc::address>>_mean                           -0.4009         -0.4009          1323           793          1323           793
hash_<evmc::address, noinline_hash<evmc::address>>_mean                  -0.2762         -0.2762          1955          1415          1955          1415

Originally, I also tried a much simpler word folding fold(a, b): 3*a + b. These hashes do not look very random any more, and the hash of zero is zero. Furthermore, it only improves performance (over the little-endian version) for hash functions inlined in a loop, which is probably not the case for hash maps.

hash_<evmc::bytes32, hash<evmc::bytes32>>_mean                           -0.1432         -0.1432          1641          1406          1641          1406
hash_<evmc::bytes32, noinline_hash<evmc::bytes32>>_mean                  -0.0006         -0.0006          2571          2569          2571          2569
hash_<evmc::address, hash<evmc::address>>_mean                           -0.1896         -0.1896           793           643           793           643
hash_<evmc::address, noinline_hash<evmc::address>>_mean                  -0.0087         -0.0087          1415          1403          1415          1403

We can revisit more optimizations here, but we should build some hashmap performance testing up front (e.g. see https://stackoverflow.com/a/62345875/725174).
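For context, the approach can be sketched roughly as follows (this is an illustrative sketch, not the actual evmc code; `load_le64`, `fnv1a_step` and `hash_bytes32` are made-up names): the 32-byte value is read as four little-endian 64-bit words, which on little-endian hardware is a plain load with no byte swap, and the words are folded with FNV1a.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Load 8 bytes as a little-endian 64-bit word. On a little-endian host
// this compiles down to a single plain load; the previous big-endian
// variant required an additional byte swap.
inline uint64_t load_le64(const uint8_t* bytes) noexcept
{
    uint64_t word;
    std::memcpy(&word, bytes, sizeof(word));  // assumes a little-endian host
    return word;
}

// One FNV1a step over a 64-bit word: xor, then multiply by the FNV prime.
inline uint64_t fnv1a_step(uint64_t h, uint64_t x) noexcept
{
    constexpr uint64_t fnv_prime = 0x100000001b3ULL;
    return (h ^ x) * fnv_prime;
}

// Hash a 32-byte value word by word, starting from the FNV offset basis.
inline uint64_t hash_bytes32(const uint8_t bytes[32]) noexcept
{
    constexpr uint64_t fnv_offset_basis = 0xcbf29ce484222325ULL;
    uint64_t h = fnv_offset_basis;
    for (size_t i = 0; i < 32; i += 8)
        h = fnv1a_step(h, load_le64(bytes + i));
    return h;
}
```

Note that because the fold starts from a nonzero offset basis, the hash of the all-zero value is nonzero, unlike the 3*a + b variant discussed below.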

@chfast (Member, Author) commented Nov 2, 2020

TODO: the std::hash unit tests are pretty bad: changing the BE load to an LE load produces the same values for the given test cases.
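To illustrate why such test cases cannot catch the change: if the bytes within each 8-byte word form a palindrome (e.g. all bytes equal), a big-endian load and a little-endian load return the same value, so a distinguishing test vector must be byte-asymmetric within a word. A hypothetical sketch (`load_le64`/`load_be64` are illustrative names, assuming a little-endian host):

```cpp
#include <cstdint>
#include <cstring>

// Little-endian load: on a little-endian host, a plain memcpy load.
inline uint64_t load_le64(const uint8_t* p) noexcept
{
    uint64_t w;
    std::memcpy(&w, p, sizeof(w));  // assumes a little-endian host
    return w;
}

// Portable byte swap of a 64-bit word.
inline uint64_t byteswap64(uint64_t x) noexcept
{
    uint64_t r = 0;
    for (int i = 0; i < 8; ++i)
        r = (r << 8) | ((x >> (8 * i)) & 0xff);
    return r;
}

// Big-endian load: little-endian load followed by a byte swap.
inline uint64_t load_be64(const uint8_t* p) noexcept
{
    return byteswap64(load_le64(p));
}
```

A test vector of identical bytes passes under both loads, while one with distinct bytes per word tells them apart.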

@yperbasis (Member) left a comment:

FNV calls take a negligible fraction of Silkworm execution, so this change probably won't make a difference to the total block execution time.

@@ -827,31 +815,25 @@ namespace std
template <>
struct hash<evmc::address>
{
/// Hash operator using FNV1a-based folding.
/// Hash operator using (3a + b) folding of the address "words".
Member:

What is 3a + b? Some homebrew hashing?

@chfast (Member, Author):

Kind of. We have this progression of options:

  1. Fold all words with XOR.
  2. Fold all words with ADD. A bit better than XOR because it discards less information, but still symmetric.
  3. "Classic" multiply by a prime/odd number and add: fold(a, b) { return 3*a + b; }.

The 3 is used because it has the same performance as 1 and 2: the multiply is done by a lea instruction, and the throughput is the same because multiple instructions execute at the same time, i.e. the latency of the "multiply" is hidden.
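The three options above can be sketched as (illustrative, not the evmc source):

```cpp
#include <cstdint>

// 1. XOR fold: symmetric, so permuting the words does not change the result.
inline uint64_t fold_xor(uint64_t a, uint64_t b) noexcept { return a ^ b; }

// 2. ADD fold: slightly better than XOR (carries mix bits across positions),
//    but still symmetric in its arguments.
inline uint64_t fold_add(uint64_t a, uint64_t b) noexcept { return a + b; }

// 3. Multiply-by-odd-and-add: order-sensitive, fold(a, b) != fold(b, a)
//    in general. On x86 the *3 is typically emitted as a single lea.
inline uint64_t fold_3ab(uint64_t a, uint64_t b) noexcept { return 3 * a + b; }
```

The asymmetry of option 3 is what distinguishes it: swapping the operands generally changes the result, unlike XOR or ADD.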

Member:

How would 1) and 2) work? The same bytes in a different order would result in the same hash. Or do you mean not only xor/add, but also some shifting etc.?

@chfast (Member, Author):

Just word0 ^ word1 ^ word2 ^ word3. Similarly for add.
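A quick demonstration of the symmetry issue raised above (illustrative sketch): the plain XOR fold collides on any permutation of the four words, so two values that differ only in word order hash identically.

```cpp
#include <cstdint>

// Plain XOR fold of four 64-bit words. XOR is commutative and
// associative, so any reordering of the inputs yields the same hash.
inline uint64_t xor_fold4(uint64_t w0, uint64_t w1,
                          uint64_t w2, uint64_t w3) noexcept
{
    return w0 ^ w1 ^ w2 ^ w3;
}
```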

Member:

If I remember correctly, we discussed that this is only used as a quick lookup, but the actual data is then compared at a match, so clashes do not matter.

Base automatically changed from optimize_cpp_compare to master November 2, 2020 19:45
@codecov-io commented:

Codecov Report

Merging #561 into master will decrease coverage by 0.00%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master     #561      +/-   ##
==========================================
- Coverage   91.31%   91.30%   -0.01%     
==========================================
  Files          22       22              
  Lines        3119     3118       -1     
==========================================
- Hits         2848     2847       -1     
  Misses        271      271              

@chfast (Member, Author) commented Nov 2, 2020

> FNV calls take a negligible fraction of Silkworm execution, so this change probably won't make a difference to the total block execution time.

Using FNV is pretty solid. It would be nice to confirm whether your hashmap uses std::hash and to benchmark this change with silkworm.

@yperbasis (Member) commented:

> FNV calls take a negligible fraction of Silkworm execution, so this change probably won't make a difference to the total block execution time.
>
> Using FNV is pretty solid. Would be nice to confirm if your hashmap is using std::hash and benchmark this change with silkworm.

I've checked and the hashmap does use std::hash. There's a tiny performance gain: 0.1 h win out of 16.5 h of executing the first 11M blocks.

@axic (Member) left a comment:

I'm indifferent on this. At the least, the first commit adding more tests should be merged.

@chfast chfast force-pushed the optimize_cpp_hash branch 2 times, most recently from a0fdd25 to a138c51 Compare February 16, 2021 10:10
@chfast (Member, Author) commented Feb 16, 2021

In the final version there is only the switch to little-endian loading. See the updated description.

@chfast chfast merged commit b606331 into master Feb 16, 2021
@chfast chfast deleted the optimize_cpp_hash branch February 16, 2021 10:35
@chfast chfast changed the title C++: Use simpler 3a + b folding in std::hash C++: Use little-endian load for std::hash Feb 16, 2021