Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor the cuda_memcpy functions to make them more usable #16945

Open
wants to merge 7 commits into
base: branch-24.12
Choose a base branch
from

Conversation

vuule
Copy link
Contributor

@vuule vuule commented Sep 27, 2024

Description

As we expanded the use of the cuda_memcpy functions, we realized that they are not very ergonomic, as they require caller to query is_device_accessible and pass the correct PAGEABLE/PINNED enum based on this.

This PR aims to make the cuda_memcpy functions easier to use, and the call site changes hopefully showcase this. The new implementation takes spans as parameters and relies on the host_span::is_device_accessible to enable copy strategies for pinned memory. Host spans set this flag during construction; creating a host span from a cudf::detail::host_vector will correctly propagate is_device_accessible. Thus, call can simply* call the cuda_memcpy functions with their containers as parameters and rely on implicit conversion to host_span/device_span. Bonus - there's no way to mix up host and device memory pointers 👍

Sharp edges:

  • Conversion prevents template deduction, so calls that pass containers as parameters need to specify the template parameter (see changes in this PR).
  • The API copies the min(input.size(), output.size()) bytes, as this is what we can do safely. This might cause surprises to users if they unintentionally pass spans of different sizes. We could instead throw in this case.

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@vuule vuule added improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels Sep 27, 2024
@vuule vuule self-assigned this Sep 27, 2024
@github-actions github-actions bot added the libcudf Affects libcudf (C++/CUDA) code. label Sep 27, 2024
impl::cuda_memcpy_async(
dst.data(),
src.data(),
std::min(dst.size_bytes(), src.size_bytes()),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might cause surprises to users if they unintentionally pass spans of different sizes. We could instead throw in this case.

Yes, this should be a runtime check and it should throw. If the caller wants to copy subspans, the caller can create subspans. Spans, as view types, are meant to make this easy.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure.
Just one thing to keep in mind for this use case:
host_span{my_hv.data(), subsize} is not the same as host_span{my_hv}.subspan(0, subsize) because the first one will not know if it's pointing to pinned memory.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done!

@vuule vuule marked this pull request as ready for review September 30, 2024 16:55
@vuule vuule requested a review from a team as a code owner September 30, 2024 16:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
improvement Improvement / enhancement to an existing function libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants