hash_size doesn't match hash length when using DoubleGradient #46

Open
hugopeixoto opened this issue Jul 6, 2021 · 1 comment

Comments

@hugopeixoto

Hi there,

I am trying out the DoubleGradient hash algorithm. I expected the hash_size() passed to HasherConfig to be respected (assuming width and height are multiples of 2), but the resulting hashes have fewer bits than width × height. Here's a snippet of code and the resulting output:

use img_hash::{HashAlg, HasherConfig};

fn main() {
    let image = image::open("grayscale.png").unwrap();

    // Hash the same image with both algorithms at each requested size
    // and report the hash length in bits (bytes * 8).
    for (w, h) in [(8, 8), (16, 16), (8, 16), (16, 8)] {
        let hasher = HasherConfig::new().hash_size(w, h).hash_alg(HashAlg::Gradient).to_hasher();
        println!("Gradient({}, {}): {} bits", w, h, 8 * hasher.hash_image(&image).as_bytes().len());

        let hasher = HasherConfig::new().hash_size(w, h).hash_alg(HashAlg::DoubleGradient).to_hasher();
        println!("DoubleGradient({}, {}): {} bits", w, h, 8 * hasher.hash_image(&image).as_bytes().len());
    }
}

I also added a println inside hash_image to print bytes.len(), resize_width, and resize_height.

HashVals: 72 (9x8?)
Gradient(8, 8): 64 bits
HashVals: 25 (5x5?)
DoubleGradient(8, 8): 40 bits
HashVals: 272 (17x16?)
Gradient(16, 16): 256 bits
HashVals: 81 (9x9?)
DoubleGradient(16, 16): 144 bits
HashVals: 144 (9x16?)
Gradient(8, 16): 128 bits
HashVals: 45 (5x9?)
DoubleGradient(8, 16): 80 bits
HashVals: 136 (17x8?)
Gradient(16, 8): 128 bits
HashVals: 45 (9x5?)
DoubleGradient(16, 8): 80 bits

Both 8 and 16 are multiples of two, so I expected DoubleGradient to produce hashes of the requested size, just like Gradient does. I think this is a bug, but I haven't been able to pinpoint the problem yet.

I tried both with img_hash 3.2.0 and with the latest commit on the main branch, and both seem to behave the same.

@hugopeixoto

Initially I didn't understand how DoubleGradient works. Now I see that it is a concatenation of the Horizontal Gradient and the Vertical Gradient hashes, with both calculated at a smaller size so that the two of them together don't exceed the requested dimensions.

When given a hash dimension of 8x8 (for example), img_hash resizes the image to 5x5 (8/2+1, 8/2+1). It then applies both gradients to the same resized image, each producing a 5x4 hash (20 bits). These are then concatenated and a 40-bit hash is returned.
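The arithmetic above can be checked against the output in the first comment with a small sketch. It does not call img_hash. The function names are hypothetical, and it assumes that each gradient compares adjacent pixels of the resized image and that the hash is rounded up to whole bytes, which is an inference from the observed numbers rather than something confirmed in the library source.

```rust
/// Resize dimensions inferred from the debug output in this issue:
/// (w/2 + 1, h/2 + 1). Hypothetical helper, not an img_hash API.
fn double_gradient_resize(width: u32, height: u32) -> (u32, u32) {
    (width / 2 + 1, height / 2 + 1)
}

/// Comparison bits, assuming adjacent-pixel comparisons:
/// horizontal gradient gives (rw - 1) * rh bits,
/// vertical gradient gives rw * (rh - 1) bits.
fn double_gradient_bits(width: u32, height: u32) -> u32 {
    let (rw, rh) = double_gradient_resize(width, height);
    (rw - 1) * rh + rw * (rh - 1)
}

/// Bit count as reported by `8 * as_bytes().len()`,
/// assuming the hash is rounded up to whole bytes.
fn reported_bits(bits: u32) -> u32 {
    (bits + 7) / 8 * 8
}
```

Under these assumptions, `reported_bits(double_gradient_bits(w, h))` gives 40, 144, 80, and 80 for (8, 8), (16, 16), (8, 16), and (16, 8), matching the DoubleGradient lines in the output above. The two non-square cases have 76 comparison bits each, which would explain why they show up as 80.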

With this way of constructing a double gradient, it's probably impossible to match every requested dimension exactly. Maybe the documentation could be updated to say that, for DoubleGradient, the specified dimensions are an upper bound? Or the hash could be zero-padded at the end?
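To make the zero-padding idea concrete, here is a minimal sketch of what it could look like on the caller's side. `pad_to_requested` is a hypothetical helper, not part of img_hash, and it assumes the requested size means width × height bits rounded up to whole bytes.

```rust
/// Hypothetical helper (not an img_hash API): zero-pad hash bytes at the end
/// so the result is as long as a width x height bit hash would be.
fn pad_to_requested(mut bytes: Vec<u8>, width: u32, height: u32) -> Vec<u8> {
    // Requested size in bytes, rounding up to a whole byte.
    let wanted = ((width * height + 7) / 8) as usize;
    if bytes.len() < wanted {
        // Append zero bytes; existing hash bits are left untouched.
        bytes.resize(wanted, 0);
    }
    bytes
}
```

For the 8x8 case, the 5 bytes produced by DoubleGradient would become 8 bytes. Since the padding is identical for every hash of the same configuration, it would not change the Hamming distance between two such hashes.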
