Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature detection is not thread-safe #560

Open
Andreas-Krebbel opened this issue Jul 4, 2024 · 2 comments
Open

Feature detection is not thread-safe #560

Andreas-Krebbel opened this issue Jul 4, 2024 · 2 comments

Comments

@Andreas-Krebbel
Copy link
Contributor

On S/390 I ran into problem with the way the VXE2 hardware feature is currently being detected in multi-threaded code. It uses a SIGILL signal handler and sigjmp/longjmp in order to detect whether a certain instruction is available or not. Although I've looked into it for S/390 this should apply to other platforms using the same mechanism.

The problem is triggered by the way PyTorch is using Sleef:
pytorch/pytorch#128503

But it can also be reproduced with a small example like this:

#include <stdio.h>
#include "sleef.h"

#ifndef NUM_THREADS
#define NUM_THREADS 4
#endif

int main() {
  __vector float out[NUM_THREADS];

#pragma omp parallel for num_threads(NUM_THREADS)
  for (int i = 0; i < NUM_THREADS; i++)
    out[i] = Sleef_expf4_u10 ((__vector float){1.0f + i, 2.0f + i , 3.0f + i, 4.0f + i});

  for (int i = 0; i < NUM_THREADS;i++)
    {
      for (int j = 0; j < 4; j++)
	printf("%6.3f ", out[i][j]);
      printf("\n");
    }
  printf("\n");
}

Running the test like this:
gcc -DNUM_THREADS=4 t.c -O3 -mzvector -march=z15 -lsleef -fopenmp -lgomp -o t && ./t
results in either broken results or crashes

While the single threaded version works fine:
gcc -DNUM_THREADS=1 t.c -O3 -mzvector -march=z15 -lsleef -fopenmp -lgomp -o t && ./t

The cpuSupportsExt function uses the file scope variable sigjmp to store the execution status what makes this function thread-unsafe.

I will send a PR to check HWCAPs instead of using the signal handler. This fixes the problem for S/390. I think other platforms might need similar adjustments.

Andreas-Krebbel added a commit to Andreas-Krebbel/sleef that referenced this issue Jul 4, 2024
@blapie
Copy link
Collaborator

blapie commented Jul 9, 2024

Sorry for the late reply and thank you very much for the detailed explanation of the issue, providing a reproducer as well as suggesting a fix and PR. Will look into your PR shortly!

@blapie blapie closed this as completed in 58ca089 Jul 26, 2024
@blapie
Copy link
Collaborator

blapie commented Jul 26, 2024

Re-opening the issue, as #560 only fixed the issue on a single architecture (s/390) and a single OS (linux).
#563 attempts to fix other OS-es
Yet to implement a thread-safe detection for other architecture.

@blapie blapie reopened this Jul 26, 2024
blapie pushed a commit to blapie/sleef that referenced this issue Aug 2, 2024
blapie pushed a commit to blapie/sleef that referenced this issue Aug 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants