lambda opened this issue Oct 11, 2018 · 0 comments
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.
I-slow Issue: Problems and improvements with respect to performance of generated code.
An example on Compiler Explorer uses nightly-only, unsafe intrinsics to assert the alignment of some arrays:
#![feature(core_intrinsics)]
// Requires the use of the nightly rust
// Compile with -O
pub fn max_array(x: &mut [f64; 65536], y: &[f64; 65536]) {
    unsafe {
        std::intrinsics::assume(x.as_ptr() as usize % 64 == 0);
        std::intrinsics::assume(y.as_ptr() as usize % 64 == 0);
    }
    for i in 0..65536 {
        x[i] = if y[i] > x[i] { y[i] } else { x[i] };
    }
}
The code optimizes to some more efficient vector operations if you can assume that the input arrays have the given alignment.
Without std::intrinsics::assume, it uses instructions that don't assume alignment and are presumably slower, as well as requiring more loads to registers:
Since there is support for #[repr(align(64))] now, I thought that it would be possible to do this in a type-safe way and without using any feature flags by adding that attribute to a wrapper struct, and sure enough, we can get the same optimization with the following (result not shown since it's identical):
#[repr(align(64))]
pub struct AlignedArray([f64; 65536]);

pub fn max_array(x: &mut AlignedArray, y: &AlignedArray) {
    for i in 0..65536 {
        x.0[i] = if y.0[i] > x.0[i] { y.0[i] } else { x.0[i] };
    }
}
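The wrapper can carry the alignment because `#[repr(align(64))]` raises the alignment of the type itself, so every value of that type sits at a 64-byte-aligned address. A minimal sketch that checks this on stable Rust follows. The shorter length `N = 1024` and the helper function names are assumptions made for illustration, not part of the original example:

```rust
use std::mem::{align_of, size_of};

// Assumption: a shorter array than the 65536 elements used above,
// to keep the demonstration value small on the stack.
const N: usize = 1024;

#[repr(align(64))]
pub struct AlignedArray(pub [f64; N]);

/// Alignment the compiler assigns to the wrapper type, in bytes.
pub fn type_alignment() -> usize {
    align_of::<AlignedArray>()
}

/// Size of the wrapper type in bytes (always a multiple of its alignment).
pub fn type_size() -> usize {
    size_of::<AlignedArray>()
}

/// Checks at runtime that a concrete value starts on a 64-byte boundary.
pub fn value_is_aligned() -> bool {
    let a = AlignedArray([0.0; N]);
    (&a as *const AlignedArray as usize) % 64 == 0
}
```

This only confirms the layout guarantee; whether the optimizer actually exploits it still has to be checked in the generated assembly.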
However, that has the slight downside of needing to use .0 instead of just indexing on these wrapper structs. We can fix that by implementing Index and IndexMut, which I would expect to be a zero-cost abstraction. However, even though the code is still inlined and vectorized with this approach, it lost the benefit of knowing the alignment and compiled as the version with no alignment information (shown earlier):
use std::ops::{Index, IndexMut};

#[repr(align(64))]
pub struct AlignedArray([f64; 65536]);

impl Index<usize> for AlignedArray {
    type Output = f64;
    #[inline]
    fn index(&self, i: usize) -> &f64 {
        &self.0[i]
    }
}

impl IndexMut<usize> for AlignedArray {
    #[inline]
    fn index_mut(&mut self, i: usize) -> &mut f64 {
        &mut self.0[i]
    }
}

pub fn max_array(x: &mut AlignedArray, y: &AlignedArray) {
    for i in 0..65536 {
        x[i] = if y[i] > x[i] { y[i] } else { x[i] };
    }
}
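To rule out a behavioral difference between the two variants, here is a small sketch that runs both the `.0` version and the `Index`/`IndexMut` version on the same input and compares the results. The length `N = 1024`, the function names `max_array_direct` and `max_array_indexed`, and the helper `ramp` are assumptions made for illustration. The sketch says nothing about the generated code, which is the actual subject of this issue:

```rust
use std::ops::{Index, IndexMut};

// Assumption: a shorter array than the 65536 elements used above.
const N: usize = 1024;

#[repr(align(64))]
pub struct AlignedArray(pub [f64; N]);

impl Index<usize> for AlignedArray {
    type Output = f64;
    #[inline]
    fn index(&self, i: usize) -> &f64 {
        &self.0[i]
    }
}

impl IndexMut<usize> for AlignedArray {
    #[inline]
    fn index_mut(&mut self, i: usize) -> &mut f64 {
        &mut self.0[i]
    }
}

/// Element-wise maximum, indexing the inner array through `.0`.
pub fn max_array_direct(x: &mut AlignedArray, y: &AlignedArray) {
    for i in 0..N {
        x.0[i] = if y.0[i] > x.0[i] { y.0[i] } else { x.0[i] };
    }
}

/// Element-wise maximum, indexing through `Index`/`IndexMut`.
pub fn max_array_indexed(x: &mut AlignedArray, y: &AlignedArray) {
    for i in 0..N {
        x[i] = if y[i] > x[i] { y[i] } else { x[i] };
    }
}

/// Hypothetical helper: builds an array whose element `i` is `f(i)`.
pub fn ramp(f: impl Fn(usize) -> f64) -> AlignedArray {
    let mut a = AlignedArray([0.0; N]);
    for i in 0..N {
        a.0[i] = f(i);
    }
    a
}
```

Both functions should produce identical arrays for any input, so a difference in the emitted instructions comes from what alignment information reaches LLVM, not from the source-level semantics.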
According to Compiler Explorer, this is all using rustc 1.31.0-nightly (96cafc5 2018-10-09). Also worth noting: the alignment-optimized version using the wrapper struct with #[repr(align(64))] is only produced by the nightly compiler; beta and released compilers give the unaligned version even without the use of Index/IndexMut.
Havvy added the I-slow (Issue: Problems and improvements with respect to performance of generated code.) label Oct 11, 2018
nikic added the A-LLVM (Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.) label Dec 20, 2018
Enable emission of alignment attrs for pointer params
Instead disable creation of assumptions during inlining using an LLVM opt flag. For non-inlined functions, this gives us alignment information, while not inserting any assumes that kill other optimizations.
The `-Z arg-align-attributes` option which previously controlled this behavior is removed.
Fixes rust-lang#54982.
r? @nagisa
cc @eddyb who added the current behavior, and @scottmcm, who added the `-Z arg-align-attributes` flag.
Centril added a commit to Centril/rust that referenced this issue Dec 24, 2018