ArgMinMax

Efficient argmin & argmax (in 1 function) with SIMD (SSE, AVX(2), AVX512, NEON) for f16, f32, f64, i8, i16, i32, i64, u8, u16, u32, u64.

🚀 The function is generic over the type of the array, so it can be used on &[T] or Vec<T> where T can be f16¹, f32, f64, i8, i16, i32, i64, u8, u16, u32, u64.

🤝 The trait is implemented for slice, Vec, 1D ndarray::ArrayBase², and Apache arrow::PrimitiveArray³.

Runtime CPU feature detection is used to select the most efficient implementation for the current CPU. This means that the same binary can be used on different CPUs without recompilation.
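To illustrate the idea of runtime dispatch, here is a minimal sketch using the standard library's feature-detection macros. The `Backend` enum and `select_backend` function are hypothetical names made up for this example; the crate's actual dispatch code may look different and also covers further tiers such as AVX512.

```rust
/// Hypothetical labels for the implementations a dispatcher could choose from.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Backend {
    Avx2,
    Sse41,
    Neon,
    Scalar,
}

/// Picks the widest instruction set the *running* CPU supports.
/// The checks happen at runtime, so one binary works across different CPUs.
fn select_backend() -> Backend {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if std::is_x86_feature_detected!("avx2") {
            return Backend::Avx2;
        }
        if std::is_x86_feature_detected!("sse4.1") {
            return Backend::Sse41;
        }
    }
    #[cfg(target_arch = "aarch64")]
    {
        if std::arch::is_aarch64_feature_detected!("neon") {
            return Backend::Neon;
        }
    }
    // Portable fallback when no SIMD extension is detected.
    Backend::Scalar
}
```

Calling `select_backend()` from your own `main` and printing the result with `{:?}` shows which tier would be chosen on the current machine.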

👀 The SIMD implementation contains no if checks, so the runtime of the function is independent of the input data and its order (best-case = worst-case = average-case).
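The branch-free idea can be shown in scalar form: turn each comparison into an all-ones or all-zeros mask and select with bitwise operations instead of an `if`. This is only a sketch for i32 under that assumption; `argminmax_branchless` is a hypothetical name, and the crate itself works on SIMD registers with vector compare and blend instructions rather than on single values.

```rust
/// Scalar illustration of mask-based selection (no `if` inside the loop body).
/// Returns None for an empty slice, otherwise (index of min, index of max).
fn argminmax_branchless(data: &[i32]) -> Option<(usize, usize)> {
    let first = *data.first()?;
    let (mut min_v, mut max_v) = (first, first);
    let (mut min_i, mut max_i) = (0usize, 0usize);
    for (i, &v) in data.iter().enumerate() {
        // usize::MAX (all ones) when the comparison holds, 0 otherwise.
        let lt = ((v < min_v) as usize).wrapping_neg();
        let gt = ((v > max_v) as usize).wrapping_neg();
        // select(mask, a, b) = (a & mask) | (b & !mask)
        min_i = (i & lt) | (min_i & !lt);
        max_i = (i & gt) | (max_i & !gt);
        // Truncating the mask to i32 keeps it all ones or all zeros.
        min_v = (v & lt as i32) | (min_v & !(lt as i32));
        max_v = (v & gt as i32) | (max_v & !(gt as i32));
    }
    Some((min_i, max_i))
}
```

Every iteration performs the same operations regardless of the values seen, which is why the running time does not depend on the data or its order.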

🪄 Efficient support for f16 and uints: through (bijective aka symmetric) bitwise operations, f16 (optional¹) and uints are converted to ordered integers, which allows integer SIMD instructions to be used.

¹ For f16, enable the "half" feature.
² For ndarray::ArrayBase, enable the "ndarray" feature.
³ For arrow::PrimitiveArray, enable the "arrow" feature.
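The conversion of uints and f16 to ordered integers mentioned above can be sketched with a couple of well-known bit tricks. The function names below are hypothetical and the f16 values are passed as raw IEEE 754 binary16 bit patterns (so the example does not need the half crate); whether the crate uses exactly these expressions is an assumption.

```rust
/// Maps u32 to i32 so that unsigned order equals signed order (flip the top bit).
fn u32_to_ordered_i32(x: u32) -> i32 {
    (x ^ 0x8000_0000) as i32
}

/// Inverse of `u32_to_ordered_i32`.
fn ordered_i32_to_u32(x: i32) -> u32 {
    (x as u32) ^ 0x8000_0000
}

/// Maps raw binary16 bits to an i16 whose signed order matches the float order
/// (NaNs excluded). For negative floats the 15 non-sign bits are flipped;
/// positive floats are left unchanged. Applying the function twice returns the
/// original bits, so it is its own inverse.
fn f16_bits_to_ordered_i16(bits: u16) -> i16 {
    let v = bits as i16;
    // `v >> 15` is an arithmetic shift: -1 for negative inputs, 0 otherwise.
    ((v >> 15) & 0x7FFF) ^ v
}
```

For example, the bit patterns 0xC000 (-2.0), 0xBC00 (-1.0), 0x0000 (0.0), 0x3C00 (1.0) and 0x4000 (2.0) map to strictly increasing i16 values, so an integer argmin/argmax on the transformed data gives the same indices as a float comparison would.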

Installing

Add the following to your Cargo.toml:

[dependencies]
argminmax = "0.4"

Example usage

use argminmax::ArgMinMax;  // import trait

let arr: Vec<i32> = (0..200_000).collect();  // create a vector

let (min, max) = arr.argminmax();  // apply extension

println!("min: {}, max: {}", min, max);
println!("arr[min]: {}, arr[max]: {}", arr[min], arr[max]);

Features

  • "half": support f16 argminmax (using the half crate).
  • "ndarray": add ArgMinMax trait to ndarray's Array1 & ArrayView1.
  • "arrow": add ArgMinMax trait to arrow's PrimitiveArray.

Benchmarks

Benchmarks on my laptop (AMD Ryzen 7 4800U, 1.8 GHz, 16GB RAM) using criterion show that the function is 3-20x faster than the scalar implementation (depending on the data type).

See /benches/results.
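For context, a scalar implementation in the sense used above is a plain loop with comparisons. The sketch below shows what such a baseline could look like; `argminmax_scalar` is a hypothetical name and the baseline actually used in the benchmarks may differ.

```rust
/// Naive scalar argmin/argmax: one pass, two comparisons per element.
/// Returns None for an empty slice, otherwise (index of min, index of max).
fn argminmax_scalar<T: PartialOrd + Copy>(data: &[T]) -> Option<(usize, usize)> {
    let first = *data.first()?;
    let (mut min_i, mut min_v) = (0usize, first);
    let (mut max_i, mut max_v) = (0usize, first);
    for (i, &v) in data.iter().enumerate().skip(1) {
        if v < min_v {
            min_i = i;
            min_v = v;
        }
        if v > max_v {
            max_i = i;
            max_v = v;
        }
    }
    Some((min_i, max_i))
}
```

Unlike the SIMD version, this loop handles one element at a time and its branches depend on the data.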

Run the benchmarks yourself with the following command:

cargo bench --quiet --message-format=short --features half | grep "time:"

Tests

To run the tests use the following command:

cargo test --message-format=short --all-features

Limitations

❗ Does not support NaNs.
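Since NaNs are not supported, float data has to be NaN-free before it is passed in. One possible guard is sketched below, assuming f32 data; `contains_nan` is a hypothetical helper and not part of the crate.

```rust
/// Returns true if any element of the slice is NaN.
fn contains_nan(data: &[f32]) -> bool {
    data.iter().any(|v| v.is_nan())
}

/// Every ordered comparison involving NaN evaluates to false, which is why
/// comparison-based min/max searches cannot place NaN consistently.
fn nan_is_unordered() -> bool {
    let nan = f32::NAN;
    !(nan < 1.0) && !(nan > 1.0) && !(nan == nan)
}
```

A caller could check `contains_nan(&data)` first and filter or reject the input if it returns true.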


Acknowledgements

Some parts of this library are inspired by the great work of minimalrust's argmm project.