avx: AVX1 support for matrix inverse #64
Open
cglm already has an AVX version of mat4_mul, but mat4_inv was missing one. This PR adds an AVX1 version of the matrix inverse.
After I upgrade my MacBook Pro I'll try to implement AVX2 + FMA versions too; my current CPU does not support them, so I can't do that for now.
I tested mat4_inv on an Ivy Bridge CPU and got performance similar to the SSE version (not better), but the result may differ on newer CPUs. I'll try to remove some shuffles later to improve performance.
New functions:
glm_mat4_scale_avx(mat4 m, float s)
glm_mat4_inv_avx(mat4 mat, mat4 dest)
These are selected automatically if -mavx is set.
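For readers unfamiliar with this kind of compile-time selection, here is a minimal sketch of the idea. It is not the code from this PR: `mat4_sketch` and `mat4_scale_sketch` are hypothetical names used only for illustration, and the sketch relies on the `__AVX__` macro that GCC and Clang define when `-mavx` is passed.

```c
#if defined(__AVX__)
#  include <immintrin.h>
#endif

/* Hypothetical stand-in for a 4x4 float matrix type (not cglm's mat4). */
typedef float mat4_sketch[4][4];

/* Multiply every element of m by s, in place.
 * The AVX path is compiled only when the compiler defines __AVX__
 * (for example with -mavx); otherwise the scalar loop is used. */
static void mat4_scale_sketch(mat4_sketch m, float s) {
#if defined(__AVX__)
  float *p  = &m[0][0];
  __m256 vs = _mm256_set1_ps(s);  /* broadcast s to all 8 lanes */

  /* 16 floats = two 256-bit registers; unaligned loads and stores
   * so the sketch makes no assumption about alignment. */
  _mm256_storeu_ps(p,     _mm256_mul_ps(_mm256_loadu_ps(p),     vs));
  _mm256_storeu_ps(p + 8, _mm256_mul_ps(_mm256_loadu_ps(p + 8), vs));
#else
  for (int i = 0; i < 4; i++)
    for (int j = 0; j < 4; j++)
      m[i][j] *= s;
#endif
}
```

The caller uses the same function name either way, so building with or without `-mavx` only changes which body gets compiled.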
I'll try to optimize the SIMD functions with SSE3 and SSE4 later.