Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use XXH32 instead of sha256 for const hashing #1393

Merged
merged 5 commits into from
Sep 19, 2022

Conversation

xermicus
Copy link
Contributor

@xermicus xermicus commented Sep 13, 2022

Would it be feasible to use FNV xxhash instead of sha256 for Key hashing, since we use only 32bits of the sha hash anyways? With FNV xxhash being a much simpler algorithm, it would allow us to use ink_metadata with stable rust without relying on RUSTC_BOOTSTRAP. Also what do you think @xgreenx ?

@xgreenx
Copy link
Collaborator

xgreenx commented Sep 13, 2022

I thought about this solution(in the first implementation I used a custom hash function). The problem is that the chance of getting collision is higher, and it will make the implementation of the same storage key calculation for other languages harder because they need find implementation of this not popular function=) The same for documentation, we need to describe the usage of an unusual hashing algorithm.

But if other members of the ink! team like this idea, of course, go with it=)

@xermicus
Copy link
Contributor Author

xermicus commented Sep 14, 2022

Thanks for your feedback! At first I shared your concerns about collisions, but as I thought a bit about it I came to a different conclusion. My guess was that reducing sha2 to 32 bits only has significant impact on its collision rate as well. And FNV1A is actually not that bad: According to this stack overflow answer, there are only ~4 collisions per ~200K hashes of UUIDS.

This left me wondering, how bad is FNV1A compared to 32bit SHA256? I quickly hacked together my own very small and totally unscientific experiment. If you run this multiple times, you'll observe that the SHA2 option does only slightly better than FNV1a. In fact, sometimes there are even more collisions for SHA2:

Testing for 200000 UUIDs with 32 runs...
FNV1A collisions: 137 (avg. 4.28125 per 200000 UUIDs)
SHA2_32 collisions: 155 (avg. 4.84375 per 200000 UUIDs)

Based on that it seems like we don't gain much in terms of collision resistance by using 32bit SHA2 instead of FNV1A. This only applies to hashing UUIDs though (behavior in our use case might be different!).

the usage of an unusual hashing algorithm

I don't think FNV is "unusual". To my knowledge it is a solid (in terms of speed vs collision resistance) and often used non cryptographic hash.

But if other members of the ink! team like this idea, of course, go with it=)

I'd be very happy to hear some more opinions on this one as well 😊

@xermicus xermicus added A-ink_storage [ink_storage] Work Item C-discussion An issue for discussion for a given topic. E-in-progress A task that is already being worked on. labels Sep 16, 2022
@xermicus xermicus added dependencies Pull requests that update a dependency file and removed E-in-progress A task that is already being worked on. labels Sep 16, 2022
@xermicus xermicus marked this pull request as ready for review September 16, 2022 08:59
Copy link
Collaborator

@ascjones ascjones left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am in favour of this because it is a huge headache to require people who depend on ink std libs to run nightly or RUSTC_BOOTSTRAP=1.

The numbers and posts above seem to be encouraging. However I am not an expert on the suitability or not of hashing algorithms for unique key generation, so I would like somebody else here with more knowledge to approve this.

crates/primitives/Cargo.toml Outdated Show resolved Hide resolved
@xermicus xermicus changed the title Use FNV instead of sha256 for const hashing Use XXH32 instead of sha256 for const hashing Sep 16, 2022
@xermicus
Copy link
Contributor Author

As discussed internall, switched to xxhash32, since it is used already in substrate for non cryptographic hashes.

I used 0 as seed value, it is a valid seed and seems to be the common choice (xxh32sum utility and LZ4 compression use 0 as seed for example).

@xermicus xermicus merged commit 52c0aa0 into master Sep 19, 2022
@xermicus xermicus deleted the cl/const-storage-key-with-stable-rust branch September 19, 2022 10:35
@@ -69,8 +65,7 @@ impl KeyComposer {
/// 1. If `variant_name` is not empty then computes the ASCII byte representation and call it `V`.
/// 1. Compute the ASCII byte representation of `field_name` and call it `F`.
/// 1. Concatenate (`S` and `F`) or (`S`, `V` and `F`) using `::` as separator and call it `C`.
/// 1. Apply the `SHA2` 256-bit hash `H` of `C`.
/// 1. The first 4 bytes of `H` make up the storage key.
/// 1. The `FNV1A` 32-bit hash of `C` is the storage key.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@xermicus this should be XXHASH instead, eh?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks! 🙈

This was referenced Sep 20, 2022
@ascjones ascjones mentioned this pull request Feb 15, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-ink_storage [ink_storage] Work Item C-discussion An issue for discussion for a given topic. dependencies Pull requests that update a dependency file
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants