This repository has been archived by the owner on Mar 21, 2024. It is now read-only.

Merge pull request #201 from gonzalobg/docs/aligned_size_t
Clarify documentation of aligned_size_t
wmaxey authored Mar 16, 2022
2 parents 0771c26 + 11cc45c commit ec353ac
Showing 1 changed file with 7 additions and 2 deletions.
9 changes: 7 additions & 2 deletions docs/extended_api/shapes/aligned_size_t.md
@@ -20,6 +20,11 @@ struct cuda::aligned_size_t {
The class template `cuda::aligned_size_t` is a _shape_ representing an extent
of bytes with a statically defined (address and size) alignment.

*Preconditions*:

- The _address_ of the extent of bytes must be aligned to an `Alignment`-byte boundary.
- The _size_ of the extent of bytes must be a multiple of `Alignment`.
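
The two preconditions can be checked on the host before constructing the shape. The following is a minimal sketch, not part of libcu++: the helper name `satisfies_aligned_extent` is hypothetical, and the power-of-two restriction on `Alignment` is an assumption made here for illustration.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not a libcu++ API): returns true when the byte extent
// starting at ptr with the given size meets both preconditions listed above.
template <std::size_t Alignment>
bool satisfies_aligned_extent(const void* ptr, std::size_t size) {
    // Assumption for this sketch: Alignment is a non-zero power of two.
    static_assert(Alignment != 0 && (Alignment & (Alignment - 1)) == 0,
                  "Alignment is assumed to be a power of two");
    const auto address = reinterpret_cast<std::uintptr_t>(ptr);
    // Address precondition, then size precondition.
    return address % Alignment == 0 && size % Alignment == 0;
}
```

A caller might assert `satisfies_aligned_extent<16>(ptr, size)` in debug builds before passing `cuda::aligned_size_t<16>(size)` to an API, since violating either precondition is not something the shape itself can detect from the size alone.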

## Template Parameters

| `Alignment` | The address and size alignment of the byte extent. |
@@ -52,8 +57,8 @@ __global__ void example_kernel(void* dst, void* src, size_t size) {
// Implementation cannot make assumptions about alignment.
cuda::memcpy_async(dst, src, size, bar);
-// Implementation can assume that dst, src and size are 16-bytes aligned and
-// may optimize accordingly.
+// Implementation can assume that dst and src are 16-bytes aligned,
+// and that size is a multiple of 16, and may optimize accordingly.
cuda::memcpy_async(dst, src, cuda::aligned_size_t<16>(size), bar);
bar.arrive_and_wait();
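
When `size` is not known to be a multiple of 16, one possible approach is to split the copy into an aligned prefix and an unaligned tail. The sketch below shows only the size arithmetic; `aligned_prefix` is a hypothetical helper, not part of libcu++.

```cpp
#include <cstddef>

// Hypothetical helper: the largest multiple of Alignment that does not
// exceed size. Only this many bytes may be described by
// cuda::aligned_size_t<Alignment>; the remainder needs a plain size_t copy.
template <std::size_t Alignment>
constexpr std::size_t aligned_prefix(std::size_t size) {
    static_assert(Alignment != 0, "Alignment must be non-zero");
    return size - size % Alignment;
}
```

Under this scheme, the first `aligned_prefix<16>(size)` bytes could be copied with `cuda::aligned_size_t<16>` and the remaining `size - aligned_prefix<16>(size)` bytes with an ordinary `size_t`, provided `dst` and `src` themselves are 16-byte aligned.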