separate SSE and SSE2 #412

Merged
merged 10 commits into from
Apr 1, 2024
@recp recp commented Mar 31, 2024

Separate the SSE and SSE2 code paths for targets where SSE2 may not be available, e.g. i686.

  • separate SSE and SSE2
  • make fast math work on both SSE and SSE2; make -0.0f work on SSE + fast math
  • make some failing tests pass when fast math is on
  • provide CGLM_FAST_MATH to identify the fast-math path
  • tests: don't validate NaN and inf on fast math

Most operations are SSE floating point, but some SSE2 intrinsics are used as well, for instance:

  • _mm_castsi128_ps
  • _mm_shuffle_epi32

I tried to find some of them and fall back to the SSE FP domain where possible.
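As an illustration of that kind of fallback, here is a minimal sketch of a lane broadcast that uses the SSE2 integer shuffle when `__SSE2__` is defined and stays in the SSE floating-point domain otherwise. The helper name `splat_x` is hypothetical and not taken from cglm; the scalar branch is only there so the sketch also builds on targets without SSE.

```c
#include <stddef.h>
#if defined(__SSE2__)
#  include <emmintrin.h>   /* SSE2: _mm_shuffle_epi32, _mm_castps_si128, _mm_castsi128_ps */
#elif defined(__SSE__)
#  include <xmmintrin.h>   /* SSE:  _mm_shuffle_ps */
#endif

/* Hypothetical helper (not part of cglm): copy lane 0 of `in` into all four lanes of `out`. */
static void splat_x(const float in[4], float out[4]) {
#if defined(__SSE2__)
  /* SSE2 path: shuffle in the integer domain, then cast back to float lanes. */
  __m128  v = _mm_loadu_ps(in);
  __m128i s = _mm_shuffle_epi32(_mm_castps_si128(v), _MM_SHUFFLE(0, 0, 0, 0));
  _mm_storeu_ps(out, _mm_castsi128_ps(s));
#elif defined(__SSE__)
  /* SSE-only path: the same lane selection, kept in the FP domain. */
  __m128 v = _mm_loadu_ps(in);
  _mm_storeu_ps(out, _mm_shuffle_ps(v, v, _MM_SHUFFLE(0, 0, 0, 0)));
#else
  /* Scalar fallback for targets without SSE. */
  for (size_t i = 0; i < 4; i++) out[i] = in[0];
#endif
}
```

All three branches are meant to produce the same result; only the instruction set they require differs.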

A GitHub Action that runs with SSE only would actually be awesome for identifying issues.

READY TO MERGE 🚀

Owner Author

recp commented Apr 1, 2024

Big problem here:

Setting -0.0f is a big issue without SSE2, e.g. with SSE + fast math, which will probably ignore the sign. Also, passing 0x80000000 directly to a float does not seem to work.
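One likely reason the direct assignment does not help: converting an integer to `float` converts its value, not its bit pattern. The sketch below contrasts the two, assuming `float` is a 32-bit IEEE 754 type; the function names are hypothetical and used only for illustration.

```c
#include <math.h>
#include <stdint.h>

/* Value conversion: the integer 2147483648 becomes the float 2147483648.0f. */
static float float_from_value(void) {
  return (float)UINT32_C(0x80000000);
}

/* Bit reinterpretation through a union: the bits 0x80000000 are read back as -0.0f
   (assumes a 32-bit IEEE 754 float). */
static float float_from_bits(void) {
  union { uint32_t i; float f; } u = { .i = UINT32_C(0x80000000) };
  return u.f;
}
```

Only the second function yields a zero with the sign bit set; the first yields a large positive number.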

Owner Author

recp commented Apr 1, 2024

This seems to work:

#if defined(__SSE2__)
#  define GLMM_NEGZEROf ((int)0x80000000) /*  0x80000000 ---> -0.0f  */
#  define GLMM_POSZEROf ((int)0x00000000) /*  0x00000000 ---> +0.0f  */
#else
#  ifdef CGLM_FAST_MATH
     union { int i; float f; } static GLMM_NEGZEROf_TU = { .i = (int)0x80000000 };
#    define GLMM_NEGZEROf GLMM_NEGZEROf_TU.f
#    define GLMM_POSZEROf 0.0f
#  else
#    define GLMM_NEGZEROf -0.0f
#    define GLMM_POSZEROf  0.0f
#  endif
#endif

It defines GLMM_NEGZEROf_TU once per translation unit, but better approaches can be considered if available.
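One possible alternative, sketched here purely as an assumption and not as what the PR does, is a `static inline` function that builds the value from its bit pattern with `memcpy`, so no named object is left in each translation unit. The name `negzerof_sketch` is hypothetical, and whether a given compiler keeps the sign bit under fast math would still need to be verified.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of cglm): returns -0.0f built from its bit pattern.
   Assumes float is a 32-bit IEEE 754 type. */
static inline float negzerof_sketch(void) {
  const uint32_t bits = UINT32_C(0x80000000);
  float f;
  memcpy(&f, &bits, sizeof f);  /* copy the bits; no value conversion takes place */
  return f;
}
```

The trade-off is that a function call is not a constant expression, so unlike a macro expanding to a literal it cannot be used in a static initializer.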

@recp recp merged commit 1796cc5 into master Apr 1, 2024
77 checks passed