Improvements in hash_functions.cuh #10081

Closed
bdice opened this issue Jan 19, 2022 · 8 comments
Labels: 0 - Backlog (in queue waiting for assignment), libcudf (affects libcudf C++/CUDA code)

Comments

@bdice
Contributor

bdice commented Jan 19, 2022

The file hash_functions.cuh has quite a bit of room for cleanup and improvement.

  • Use std::byte more broadly.
  • Reduce the use of magic values in hash functions.
  • Separate translation units if possible -- to what extent can we avoid having a single large header?
  • Make error messages more consistent.
  • Use compute_bytes approach instead of re-implementing hash function in string_view template instantiation. (Make sure [un]aligned reads are handled correctly.)
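
To illustrate the magic-values item, here is a minimal sketch of the 32-bit MurmurHash3 finalizer with its constants named. The numeric values are those of the public reference MurmurHash3 finalizer; the constant names and the exact layout are assumptions for illustration, not the code in hash_functions.cuh.

```cpp
#include <cstdint>

// Named constants for the 32-bit MurmurHash3 finalizer.
// The names are hypothetical; the values come from the reference algorithm.
constexpr std::uint32_t fmix_multiplier_1 = 0x85ebca6bu;
constexpr std::uint32_t fmix_multiplier_2 = 0xc2b2ae35u;
constexpr std::uint32_t fmix_shift_1      = 16;
constexpr std::uint32_t fmix_shift_2      = 13;
constexpr std::uint32_t fmix_shift_3      = 16;

// Finalization mix: alternates xor-shifts and multiplications so that
// every input bit influences every output bit.
constexpr std::uint32_t fmix32(std::uint32_t h)
{
  h ^= h >> fmix_shift_1;
  h *= fmix_multiplier_1;
  h ^= h >> fmix_shift_2;
  h *= fmix_multiplier_2;
  h ^= h >> fmix_shift_3;
  return h;
}
```

Naming the constants once also makes it easier to share them between hash variants that use the same finalizer.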

See PRs for reference: comments in #9919, ongoing SHA work in #9215, refactors in #10379.

Additional context

What about std::byte, using std::to_integer to convert to the required integer type wherever we need computation beyond the operators that std::byte supports (it only supports bitwise operations)?

Originally posted by @harrism in #9919 (comment)
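
A small sketch of the distinction raised in that comment: bitwise operators work on std::byte directly, while arithmetic needs an explicit std::to_integer conversion. The helper names `high_nibble` and `weighted` are hypothetical and exist only for this example.

```cpp
#include <cstddef>
#include <cstdint>

// Bitwise operators (shifts, &, |, ^, ~) are defined directly on std::byte.
constexpr std::byte high_nibble(std::byte b) { return b >> 4; }

// Arithmetic is not defined on std::byte, so convert to an integer type first.
// `weighted` is a hypothetical helper, not a libcudf function.
constexpr std::uint32_t weighted(std::byte b, std::uint32_t multiplier)
{
  return std::to_integer<std::uint32_t>(b) * multiplier;
}
```

Writing `b * multiplier` without the conversion would not compile, which is the type-safety benefit being discussed.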

@bdice bdice self-assigned this Jan 19, 2022
@bdice bdice added the libcudf label Jan 19, 2022
rapids-bot bot pushed a commit that referenced this issue Feb 7, 2022
Followup to #9919 -- kernel merging and code cleanup for Murmur3 hash.

Partial fix for #10081.

Benchmarked the `compute_bytes` kernel with aligned reads vs. unaligned reads and saw no difference. Looking into it further to confirm that the `uint32_t` construction was doing the same thing implicitly.

Because string data can start at an arbitrary byte offset, hashing strings will require the `getblock32` function regardless. Even so, benchmarks run with 100-, 103-, and 104-byte strings showed negligible performance differences, which indicates that forced misalignment does not hurt hash speed.
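
For readers unfamiliar with the alignment concern, the following is a rough host-side sketch of an alignment-agnostic block read. The function `getblock32_sketch` is a hypothetical stand-in: it assumes little-endian assembly and reads one byte at a time, and the real device-side `getblock32` may use a different strategy.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a `getblock32`-style helper: assemble a
// little-endian 32-bit block one byte at a time, so the result does not
// depend on whether `data + offset` is 4-byte aligned.
constexpr std::uint32_t getblock32_sketch(std::uint8_t const* data, std::size_t offset)
{
  std::uint32_t block = 0;
  for (std::size_t i = 0; i < 4; ++i) {
    block |= static_cast<std::uint32_t>(data[offset + i]) << (8 * i);
  }
  return block;
}
```

Reading at offset 1 of a buffer is as well defined here as reading at offset 0, whereas casting the same address to `uint32_t const*` and dereferencing it would be a misaligned access.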

Authors:
  - Ryan Lee (https://github.com/rwlee)
  - Bradley Dice (https://github.com/bdice)

Approvers:
  - Bradley Dice (https://github.com/bdice)
  - Christopher Harris (https://github.com/cwharris)

URL: #10143
@github-actions

This issue has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled inactive-90d if there is no activity in the next 60 days.

@bdice bdice removed the inactive-30d label Mar 2, 2022
rapids-bot bot pushed a commit that referenced this issue Mar 8, 2022
This PR refactors a few pieces of libcudf's hash functions:
- Define the utility function `hash_combine` only once (with 32/64 bit overloads), rather than several times in the codebase
- ~Remove class template parameter from `MurmurHash3_32` and related classes. This template parameter was redundant. We already use a template for the argument of the `compute` method, which is called by `operator()`, so I put the template parameter on `operator()` instead of the whole class. I think this removal of the template parameter could be considered API-breaking so I added the `breaking` label.~ I retracted this change after conversation with @jrhemstad. I'll look into a different way to do this soon, using a dispatch-to-invoke approach as in #8217.
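
A minimal sketch of what a single `hash_combine` definition with 32-bit and 64-bit overloads could look like. It assumes the widely used Boost-style combining formula and golden-ratio constants; whether libcudf uses exactly this formula is an assumption here, so check the PR for the actual code.

```cpp
#include <cstdint>

// 32-bit overload: Boost-style combine (assumed formula, see lead-in).
constexpr std::uint32_t hash_combine(std::uint32_t lhs, std::uint32_t rhs)
{
  return lhs ^ (rhs + 0x9e3779b9u + (lhs << 6) + (lhs >> 2));
}

// 64-bit overload: same shape with a 64-bit golden-ratio constant.
constexpr std::uint64_t hash_combine(std::uint64_t lhs, std::uint64_t rhs)
{
  return lhs ^ (rhs + 0x9e3779b97f4a7c15ull + (lhs << 6) + (lhs >> 2));
}
```

Overload resolution picks the variant from the argument types, so callers combining 32-bit row hashes and callers combining 64-bit hashes can share one name.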

This addresses part of issue #10081. I have a few more things I'd like to try, but this felt like a nicely-scoped PR so I stopped here for the moment.

I benchmarked the code before and after making these changes and saw a small but consistent decrease in runtime.

The benchmarks in `HashBenchmark/{HASH_MURMUR3,HASH_SERIAL_MURMUR3,HASH_SPARK_MURMUR3}_{nulls,no_nulls}/*` all decreased or saw no change in runtime, with a geometric mean of 2.87% less time.

The benchmarks in `Hashing/hash_partition/*` all decreased or saw no change in runtime, with a geometric mean of 2.37% less time.

For both sets of benchmarks, the largest data sizes saw more significant decreases in runtime, with a best-improvement of 7.38% less time in `HashBenchmark/HASH_MURMUR3_nulls/16777216` (similar for other large data sizes) and a best-improvement of 10.54% less time in `Hashing/hash_partition/1048576/256/64` (similar for other large data sizes).

Authors:
  - Bradley Dice (https://github.com/bdice)

Approvers:
  - Vukasin Milovanovic (https://github.com/vuule)
  - Conor Hoekstra (https://github.com/codereport)

URL: #10379
@github-actions

github-actions bot commented Apr 1, 2022

This issue has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled inactive-90d if there is no activity in the next 60 days.

rapids-bot bot pushed a commit that referenced this issue Apr 20, 2022
Additional work related to #10081.

This is breaking because it reorganizes several public names/namespaces.

Summary of changes in this PR:
- The `cudf` namespace now wraps the contents of `hash_functions.cuh`, and some public names are now classified as `detail` APIs.
- `SparkMurmurHash3_32` has been updated to align with the design and naming conventions of `MurmurHash3_32`

Authors:
  - Bradley Dice (https://github.com/bdice)

Approvers:
  - Jake Hemstad (https://github.com/jrhemstad)
  - Vyas Ramasubramani (https://github.com/vyasr)

URL: #10462
@github-actions

This issue has been labeled inactive-90d due to no recent activity in the past 90 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed.

@bdice
Contributor Author

bdice commented Jul 21, 2022

This is still relevant. I aim to work on a few more of these ideas after #11296, when I can also move SparkMurmurHash3_32 out of hash_functions.cuh. See also: #11292 (comment)

edit: that move is done in #11489.

@bdice
Contributor Author

bdice commented Jul 27, 2022

We should also add documentation to detail methods in hashing.hpp.

rapids-bot bot pushed a commit that referenced this issue Aug 9, 2022
This PR moves the `SparkMurmurHash3_32` functor from `hash_functions.cuh` to `spark_murmur_hash.cu`, the only place where it is used. **This is a pure move**, with one small exception to avoid compiler warnings about unused members of the hash functor template instantiations for nested types. I refactored the class template to disallow nested types for the hash functor and removed those specializations using `CUDF_UNREACHABLE`, rather than allowing type dispatching to create template instantiations that have no defined use. (Nested types are being handled by the custom device row hasher in `spark_murmur_hash.cu`, and require some state information that cannot be easily carried in the functor itself.) I am planning to do further refactoring later, but wanted to separate this "pure move" as much as possible.
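
One way to keep a hash functor from being instantiated for nested types is a compile-time constraint. The sketch below is a simplified illustration rather than the approach in the PR: it constrains the call operator instead of the class template, and `list_tag`, `struct_tag`, `is_nested_v`, and `hash_functor_sketch` are all hypothetical names.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical tag types standing in for nested column types.
struct list_tag {};
struct struct_tag {};

// True for the (hypothetical) nested types above.
template <typename T>
inline constexpr bool is_nested_v =
  std::is_same_v<T, list_tag> || std::is_same_v<T, struct_tag>;

// The call operator only participates in overload resolution for
// non-nested keys, so no instantiation exists for nested types.
struct hash_functor_sketch {
  template <typename Key, std::enable_if_t<!is_nested_v<Key>, int> = 0>
  constexpr std::uint32_t operator()(Key const& key) const
  {
    // Placeholder "hash": a real functor would mix the key's bytes.
    return static_cast<std::uint32_t>(key);
  }
};
```

With this constraint, an attempt to hash a nested type fails at compile time rather than producing an unused instantiation that triggers warnings.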

Part of #10081.

Authors:
  - Bradley Dice (https://github.com/bdice)

Approvers:
  - Nghia Truong (https://github.com/ttnghia)
  - Ryan Lee (https://github.com/rwlee)

URL: #11489
@GregoryKimball
Contributor

@bdice We've made some improvements to hash_functions.cuh. Should we close this?

@GregoryKimball GregoryKimball added the 0 - Backlog and tech debt labels Sep 27, 2023
@vyasr vyasr removed the tech debt label Feb 23, 2024
@vyasr
Contributor

vyasr commented May 13, 2024

@bdice do you still want to do anything here?

@bdice
Contributor Author

bdice commented May 13, 2024

We haven't adopted std::byte in some places where it might be relevant, but I also don't think this is important or urgent work. I'll close this.

@bdice bdice closed this as completed May 13, 2024
3 participants