Deprecate MXCSR-related intrinsics #817

Closed
wants to merge 4 commits into from
232 changes: 49 additions & 183 deletions crates/core_arch/src/x86/sse.rs
@@ -1375,6 +1375,10 @@ pub unsafe fn _mm_sfence() {
#[target_feature(enable = "sse")]
#[cfg_attr(test, assert_instr(stmxcsr))]
#[stable(feature = "simd_x86", since = "1.27.0")]
#[rustc_deprecated(
since = "1.59.0",
reason = "accessing floating point exception state causes undefined behavior in LLVM, see https://github.com/rust-lang/stdarch/issues/781"
)]
pub unsafe fn _mm_getcsr() -> u32 {
let mut result = 0_i32;
stmxcsr((&mut result) as *mut _ as *mut i8);
@@ -1383,135 +1387,17 @@ pub unsafe fn _mm_getcsr() -> u32 {

/// Sets the MXCSR register with the 32-bit unsigned integer value.
///
/// This register controls how SIMD instructions handle floating point
/// operations. Modifying this register only affects the current thread.
///
/// It contains several groups of flags:
///
/// * *Exception flags* report which exceptions occurred since last they were
/// reset.
///
/// * *Masking flags* can be used to mask (ignore) certain exceptions. By
/// default these flags are all set to 1, so all exceptions are masked. When
/// an exception is masked, the processor simply sets the exception flag and
/// continues the operation. If the exception is unmasked, the flag is also set
/// but additionally an exception handler is invoked.
///
/// * *Rounding mode flags* control the rounding mode of floating point
/// instructions.
///
/// * The *denormals-are-zero mode flag* turns all numbers which would be
/// denormalized (exponent bits are all zeros) into zeros.
///
/// ## Exception Flags
///
/// * `_MM_EXCEPT_INVALID`: An invalid operation was performed (e.g., dividing
/// Infinity by Infinity).
///
/// * `_MM_EXCEPT_DENORM`: An operation attempted to operate on a denormalized
/// number. Mainly this can cause loss of precision.
///
/// * `_MM_EXCEPT_DIV_ZERO`: Division by zero occurred.
///
/// * `_MM_EXCEPT_OVERFLOW`: A numeric overflow exception occurred, i.e., a
/// result was too large to be represented (e.g., an `f32` with absolute
/// value greater than `2^128`).
///
/// * `_MM_EXCEPT_UNDERFLOW`: A numeric underflow exception occurred, i.e., a
/// result was too small to be represented in a normalized way (e.g., an
/// `f32` with absolute value smaller than `2^-126`).
///
/// * `_MM_EXCEPT_INEXACT`: An inexact-result exception occurred (a.k.a.
/// precision exception). This means some precision was lost due to rounding.
/// For example, the fraction `1/3` cannot be represented accurately in a
/// 32 or 64 bit float and computing it would cause this exception to be
/// raised. Precision exceptions are very common, so they are usually masked.
///
/// Exception flags can be read and set using the convenience functions
/// `_MM_GET_EXCEPTION_STATE` and `_MM_SET_EXCEPTION_STATE`. For example, to
/// check if an operation caused some overflow:
///
/// ```rust,ignore
/// _MM_SET_EXCEPTION_STATE(0); // clear all exception flags
/// // perform calculations
/// if _MM_GET_EXCEPTION_STATE() & _MM_EXCEPT_OVERFLOW != 0 {
/// // handle overflow
/// }
/// ```
///
/// ## Masking Flags
///
/// There is one masking flag for each exception flag: `_MM_MASK_INVALID`,
/// `_MM_MASK_DENORM`, `_MM_MASK_DIV_ZERO`, `_MM_MASK_OVERFLOW`,
/// `_MM_MASK_UNDERFLOW`, `_MM_MASK_INEXACT`.
///
/// A single masking bit can be set via
///
/// ```rust,ignore
/// _MM_SET_EXCEPTION_MASK(_MM_MASK_UNDERFLOW);
/// ```
///
/// However, since mask bits are by default all set to 1, it is more common to
/// want to *disable* certain bits. For example, to unmask the underflow
/// exception, use:
///
/// ```rust,ignore
/// // unmask the underflow exception
/// _mm_setcsr(_mm_getcsr() & !_MM_MASK_UNDERFLOW);
/// ```
///
/// Warning: an unmasked exception will cause an exception handler to be
/// called. The standard handler will simply terminate the process. So, in this
/// case any underflow exception would terminate the current process with
/// something like `signal: 8, SIGFPE: erroneous arithmetic operation`.
///
/// ## Rounding Mode
///
/// The rounding mode is described using two bits. It can be read and set using
/// the convenience wrappers `_MM_GET_ROUNDING_MODE()` and
/// `_MM_SET_ROUNDING_MODE(mode)`.
///
/// The rounding modes are:
///
/// * `_MM_ROUND_NEAREST`: (default) Round to closest to the infinite precision
/// value. If two values are equally close, round to even (i.e., least
/// significant bit will be zero).
///
/// * `_MM_ROUND_DOWN`: Round toward negative Infinity.
///
/// * `_MM_ROUND_UP`: Round toward positive Infinity.
///
/// * `_MM_ROUND_TOWARD_ZERO`: Round towards zero (truncate).
///
/// Example:
///
/// ```rust,ignore
/// _MM_SET_ROUNDING_MODE(_MM_ROUND_DOWN)
/// ```
///
/// ## Denormals-are-zero/Flush-to-zero Mode
///
/// If this bit is set, values that would be denormalized will be set to zero
/// instead. This is turned off by default.
///
/// You can read and enable/disable this mode via the helper functions
/// `_MM_GET_FLUSH_ZERO_MODE()` and `_MM_SET_FLUSH_ZERO_MODE()`:
///
/// ```rust,ignore
/// _MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_OFF); // turn off (default)
/// _MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON); // turn on
/// ```
///
/// This function cannot be used safely as it causes undefined behavior in LLVM.
///
/// [Intel's documentation](https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=_mm_setcsr)
#[inline]
#[target_feature(enable = "sse")]
#[cfg_attr(test, assert_instr(ldmxcsr))]
#[stable(feature = "simd_x86", since = "1.27.0")]
#[rustc_deprecated(
since = "1.59.0",
reason = "accessing floating point exception state causes undefined behavior in LLVM, see https://github.com/rust-lang/stdarch/issues/781"
)]
pub unsafe fn _mm_setcsr(val: u32) {
ldmxcsr(&val as *const _ as *const i8);
}
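
For readers who no longer have the removed documentation at hand, the field groups it described can be illustrated without touching the register at all. The following is a minimal sketch, assuming the usual MXCSR bit layout behind the `_MM_*_MASK` constants in `sse.rs`; the constant names, `MxcsrFields`, and `decode_mxcsr` are hypothetical and exist only for this illustration.

```rust
// Bit layout of MXCSR. These local constants are assumed to mirror the
// values of `_MM_EXCEPT_MASK`, `_MM_MASK_MASK`, `_MM_ROUND_MASK` and
// `_MM_FLUSH_ZERO_MASK`; they are not the stdarch items themselves.
const EXCEPT_MASK: u32 = 0x003f; // sticky exception flags, bits 0..=5
const MASK_MASK: u32 = 0x1f80; // exception mask bits, bits 7..=12
const ROUND_MASK: u32 = 0x6000; // rounding mode, bits 13..=14
const FLUSH_ZERO_MASK: u32 = 0x8000; // flush-to-zero, bit 15

/// Hypothetical helper type: the four field groups of an MXCSR value.
#[derive(Debug, PartialEq, Eq)]
pub struct MxcsrFields {
    pub exception_state: u32,
    pub exception_mask: u32,
    pub rounding_mode: u32,
    pub flush_to_zero: bool,
}

/// Splits a raw MXCSR value into its field groups.
/// This is pure bit arithmetic; it never reads or writes the register.
pub fn decode_mxcsr(csr: u32) -> MxcsrFields {
    MxcsrFields {
        exception_state: csr & EXCEPT_MASK,
        exception_mask: csr & MASK_MASK,
        rounding_mode: csr & ROUND_MASK,
        flush_to_zero: csr & FLUSH_ZERO_MASK != 0,
    }
}
```

Decoding `0x1f80`, the value with every exception masked, gives an empty exception state, round-to-nearest, and flush-to-zero off, which matches the defaults the removed text described.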
@@ -1591,9 +1477,13 @@ pub const _MM_FLUSH_ZERO_OFF: u32 = 0x0000;
///
/// [Intel's documentation](https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=_MM_GET_EXCEPTION_MASK)
#[inline]
#[allow(non_snake_case)]
#[allow(deprecated, non_snake_case)]
#[target_feature(enable = "sse")]
#[stable(feature = "simd_x86", since = "1.27.0")]
#[rustc_deprecated(
since = "1.59.0",
reason = "accessing floating point exception state causes undefined behavior in LLVM, see https://github.com/rust-lang/stdarch/issues/781"
)]
pub unsafe fn _MM_GET_EXCEPTION_MASK() -> u32 {
_mm_getcsr() & _MM_MASK_MASK
}
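
The switch from `#[allow(non_snake_case)]` to `#[allow(deprecated, non_snake_case)]` is presumably there because each wrapper calls the now-deprecated `_mm_getcsr` or `_mm_setcsr`, which would otherwise trip the `deprecated` lint inside the crate. The same pattern can be sketched in an ordinary crate with the stable `#[deprecated]` attribute (`rustc_deprecated` is internal to the standard library). The function names, the version string, and the notes below are hypothetical stand-ins, and `read_csr` returns a fixed value rather than reading any register.

```rust
/// Stand-in for `_mm_getcsr`: returns a fixed value instead of reading MXCSR.
#[deprecated(since = "0.2.0", note = "hypothetical: reading MXCSR is unsupported")]
pub fn read_csr() -> u32 {
    0x1f80
}

// Assumed to match the value of `_MM_MASK_MASK`.
const MASK_MASK: u32 = 0x1f80;

/// Stand-in for `_MM_GET_EXCEPTION_MASK`.
/// `allow(deprecated)` silences the lint for the call to `read_csr` in the
/// body; the wrapper itself is still marked deprecated for its own callers.
#[allow(deprecated)]
#[deprecated(since = "0.2.0", note = "hypothetical: reading MXCSR is unsupported")]
pub fn exception_mask() -> u32 {
    read_csr() & MASK_MASK
}
```

Callers outside the crate still see the deprecation warning on `exception_mask`; only the internal call is exempted.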
@@ -1602,9 +1492,13 @@ pub unsafe fn _MM_GET_EXCEPTION_MASK() -> u32 {
///
/// [Intel's documentation](https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=_MM_GET_EXCEPTION_STATE)
#[inline]
#[allow(non_snake_case)]
#[allow(deprecated, non_snake_case)]
#[target_feature(enable = "sse")]
#[stable(feature = "simd_x86", since = "1.27.0")]
#[rustc_deprecated(
since = "1.59.0",
reason = "accessing floating point exception state causes undefined behavior in LLVM, see https://github.com/rust-lang/stdarch/issues/781"
)]
pub unsafe fn _MM_GET_EXCEPTION_STATE() -> u32 {
_mm_getcsr() & _MM_EXCEPT_MASK
}
@@ -1613,9 +1507,13 @@ pub unsafe fn _MM_GET_EXCEPTION_STATE() -> u32 {
///
/// [Intel's documentation](https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=_MM_GET_FLUSH_ZERO_MODE)
#[inline]
#[allow(non_snake_case)]
#[allow(deprecated, non_snake_case)]
#[target_feature(enable = "sse")]
#[stable(feature = "simd_x86", since = "1.27.0")]
#[rustc_deprecated(
since = "1.59.0",
reason = "accessing floating point exception state causes undefined behavior in LLVM, see https://github.com/rust-lang/stdarch/issues/781"
)]
pub unsafe fn _MM_GET_FLUSH_ZERO_MODE() -> u32 {
_mm_getcsr() & _MM_FLUSH_ZERO_MASK
}
@@ -1624,9 +1522,13 @@ pub unsafe fn _MM_GET_FLUSH_ZERO_MODE() -> u32 {
///
/// [Intel's documentation](https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=_MM_GET_ROUNDING_MODE)
#[inline]
#[allow(non_snake_case)]
#[allow(deprecated, non_snake_case)]
#[target_feature(enable = "sse")]
#[stable(feature = "simd_x86", since = "1.27.0")]
#[rustc_deprecated(
since = "1.59.0",
reason = "accessing floating point exception state causes undefined behavior in LLVM, see https://github.com/rust-lang/stdarch/issues/781"
)]
pub unsafe fn _MM_GET_ROUNDING_MODE() -> u32 {
_mm_getcsr() & _MM_ROUND_MASK
}
@@ -1635,9 +1537,13 @@ pub unsafe fn _MM_GET_ROUNDING_MODE() -> u32 {
///
/// [Intel's documentation](https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=_MM_SET_EXCEPTION_MASK)
#[inline]
#[allow(non_snake_case)]
#[allow(deprecated, non_snake_case)]
#[target_feature(enable = "sse")]
#[stable(feature = "simd_x86", since = "1.27.0")]
#[rustc_deprecated(
since = "1.59.0",
reason = "accessing floating point exception state causes undefined behavior in LLVM, see https://github.com/rust-lang/stdarch/issues/781"
)]
pub unsafe fn _MM_SET_EXCEPTION_MASK(x: u32) {
_mm_setcsr((_mm_getcsr() & !_MM_MASK_MASK) | x)
}
@@ -1646,9 +1552,13 @@ pub unsafe fn _MM_SET_EXCEPTION_MASK(x: u32) {
///
/// [Intel's documentation](https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=_MM_SET_EXCEPTION_STATE)
#[inline]
#[allow(non_snake_case)]
#[allow(deprecated, non_snake_case)]
#[target_feature(enable = "sse")]
#[stable(feature = "simd_x86", since = "1.27.0")]
#[rustc_deprecated(
since = "1.59.0",
reason = "accessing floating point exception state causes undefined behavior in LLVM, see https://github.com/rust-lang/stdarch/issues/781"
)]
pub unsafe fn _MM_SET_EXCEPTION_STATE(x: u32) {
_mm_setcsr((_mm_getcsr() & !_MM_EXCEPT_MASK) | x)
}
@@ -1657,9 +1567,13 @@ pub unsafe fn _MM_SET_EXCEPTION_STATE(x: u32) {
///
/// [Intel's documentation](https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=_MM_SET_FLUSH_ZERO_MODE)
#[inline]
#[allow(non_snake_case)]
#[allow(deprecated, non_snake_case)]
#[target_feature(enable = "sse")]
#[stable(feature = "simd_x86", since = "1.27.0")]
#[rustc_deprecated(
since = "1.59.0",
reason = "accessing floating point exception state causes undefined behavior in LLVM, see https://github.com/rust-lang/stdarch/issues/781"
)]
pub unsafe fn _MM_SET_FLUSH_ZERO_MODE(x: u32) {
let val = (_mm_getcsr() & !_MM_FLUSH_ZERO_MASK) | x;
// println!("setting csr={:x}", val);
@@ -1670,9 +1584,13 @@ pub unsafe fn _MM_SET_FLUSH_ZERO_MODE(x: u32) {
///
/// [Intel's documentation](https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=_MM_SET_ROUNDING_MODE)
#[inline]
#[allow(non_snake_case)]
#[allow(deprecated, non_snake_case)]
#[target_feature(enable = "sse")]
#[stable(feature = "simd_x86", since = "1.27.0")]
#[rustc_deprecated(
since = "1.59.0",
reason = "accessing floating point exception state causes undefined behavior in LLVM, see https://github.com/rust-lang/stdarch/issues/781"
)]
pub unsafe fn _MM_SET_ROUNDING_MODE(x: u32) {
_mm_setcsr((_mm_getcsr() & !_MM_ROUND_MASK) | x)
}
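
All four `_MM_SET_*` wrappers in this file share one read-modify-write shape: clear the field with `& !MASK`, then OR in the new value. A minimal sketch of that arithmetic as a pure function, with the register read replaced by a plain `u32` argument, might look like this. `with_field` and the local constants are hypothetical names, and the constant values are assumed to match `_MM_ROUND_MASK` and `_MM_ROUND_DOWN`.

```rust
// Assumed to match the values of `_MM_ROUND_MASK` and `_MM_ROUND_DOWN`.
pub const ROUND_MASK: u32 = 0x6000;
pub const ROUND_DOWN: u32 = 0x2000;

/// Returns `csr` with the bits selected by `field_mask` cleared and `value`
/// OR-ed in, i.e. the expression `(csr & !MASK) | x` used by the setters.
/// Like the originals, this does not check that `value` fits inside
/// `field_mask`; stray bits in `value` pass straight through.
pub fn with_field(csr: u32, field_mask: u32, value: u32) -> u32 {
    (csr & !field_mask) | value
}
```

For example, `with_field(0x1f80, ROUND_MASK, ROUND_DOWN)` yields `0x3f80`: the mask bits are left alone and only the rounding field changes.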
@@ -3191,58 +3109,6 @@ mod tests {
_mm_sfence();
}

#[simd_test(enable = "sse")]
unsafe fn test_mm_getcsr_setcsr_1() {
let saved_csr = _mm_getcsr();

let a = _mm_setr_ps(1.1e-36, 0.0, 0.0, 1.0);
let b = _mm_setr_ps(0.001, 0.0, 0.0, 1.0);

_MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON);
let r = _mm_mul_ps(*black_box(&a), *black_box(&b));

_mm_setcsr(saved_csr);

let exp = _mm_setr_ps(0.0, 0.0, 0.0, 1.0);
assert_eq_m128(r, exp); // first component is a denormalized f32
}

#[simd_test(enable = "sse")]
unsafe fn test_mm_getcsr_setcsr_2() {
// Same as _mm_setcsr_1 test, but with opposite flag value.

let saved_csr = _mm_getcsr();

let a = _mm_setr_ps(1.1e-36, 0.0, 0.0, 1.0);
let b = _mm_setr_ps(0.001, 0.0, 0.0, 1.0);

_MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_OFF);
let r = _mm_mul_ps(*black_box(&a), *black_box(&b));

_mm_setcsr(saved_csr);

let exp = _mm_setr_ps(1.1e-39, 0.0, 0.0, 1.0);
assert_eq_m128(r, exp); // first component is a denormalized f32
}

#[simd_test(enable = "sse")]
unsafe fn test_mm_getcsr_setcsr_underflow() {
_MM_SET_EXCEPTION_STATE(0);

let a = _mm_setr_ps(1.1e-36, 0.0, 0.0, 1.0);
let b = _mm_setr_ps(1e-5, 0.0, 0.0, 1.0);

assert_eq!(_MM_GET_EXCEPTION_STATE(), 0); // just to be sure

let r = _mm_mul_ps(*black_box(&a), *black_box(&b));

let exp = _mm_setr_ps(1.1e-41, 0.0, 0.0, 1.0);
assert_eq_m128(r, exp);

let underflow = _MM_GET_EXCEPTION_STATE() & _MM_EXCEPT_UNDERFLOW != 0;
assert_eq!(underflow, true);
}

#[simd_test(enable = "sse")]
unsafe fn test_MM_TRANSPOSE4_PS() {
let mut a = _mm_setr_ps(1.0, 2.0, 3.0, 4.0);