Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Correct the simd_masked_{load,store} intrinsic docs #119203

Merged
merged 1 commit into from
Feb 21, 2024
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
10 changes: 6 additions & 4 deletions library/core/src/intrinsics/simd.rs
Original file line number Diff line number Diff line change
Expand Up @@ -243,12 +243,13 @@ extern "platform-intrinsic" {
///
/// `T` must be a vector.
///
/// `U` must be a vector of pointers to the element type of `T`, with the same length as `T`.
/// `U` must be a pointer to the element type of `T`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not make the argument of type *const U then? It's not necessary to make it so generic.

Same for the store operation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a good idea. I'll explore this

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this is left to a future PR?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think we'll need to change how we validate these monomorphizations and there will be an opportunity when we move these checks out of the codegen crate.

///
/// `V` must be a vector of integers with the same length as `T` (but any element size).
///
/// For each element, if the corresponding value in `mask` is `!0`, read the corresponding
/// pointer from `ptr`.
/// pointer offset from `ptr`.
/// The first element is loaded from `ptr`, the second from `ptr.wrapping_offset(1)` and so on.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'd expect it to use offset, not wrapping_offset

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The entire point of this is to be able to mask off loads that would be out-of-bounds. If it used offset, that would defeat the purpose.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The entire point of this is to be able to mask off loads that would be out-of-bounds. If it used offset, that would defeat the purpose.

no, because offset is only called for those indexes that are not masked off -- sort-of. LLVM IR semantics are quite lacking here:

The ‘llvm.masked.load’ intrinsic is designed for conditional reading of selected vector elements in a single IR operation. It is useful for targets that support vector masked loads and allows vectorizing predicated basic blocks on these targets. Other targets may support this intrinsic differently, for example by lowering it into a sequence of branches that guard scalar load operations. The result of this operation is equivalent to a regular vector load instruction followed by a ‘select’ between the loaded and the passthru values, predicated on the same mask. However, using this intrinsic prevents exceptions on memory access to masked-off lanes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, because offset is only called for those indexes that are not masked off

Even then I couldn't use this to do loads where the first elements are out of bounds.

The LLVM docs are unfortunately unclear on whether the pointer arithmetic has "inbounds" semantics or not, but the note about suppressing exceptions seems to indicate that out-of-bounds should be allowed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm proposing to clarify the LangRef: llvm/llvm-project#82469.

/// Otherwise if the corresponding value in `mask` is `0`, return the corresponding value from
/// `val`.
///
Expand All @@ -264,12 +265,13 @@ extern "platform-intrinsic" {
///
/// `T` must be a vector.
///
/// `U` must be a vector of pointers to the element type of `T`, with the same length as `T`.
/// `U` must be a pointer to the element type of `T`
///
/// `V` must be a vector of integers with the same length as `T` (but any element size).
///
/// For each element, if the corresponding value in `mask` is `!0`, write the corresponding
/// value in `val` to the pointer.
/// value in `val` to the pointer offset from `ptr`.
/// The first element is written to `ptr`, the second to `ptr.wrapping_offset(1)` and so on.
/// Otherwise if the corresponding value in `mask` is `0`, do nothing.
///
/// # Safety
Expand Down
Loading