partr thread support #105
Hmm, AppVeyor is crashing on x64 with:
The AppVeyor tests are single threads ( I can only speculate that something went wrong in the x64 BinaryBuilder cross-compile with the latest build tools…
Is fixing #66 (thread-safe planner) relevant for this PR?
@ViralBShah, this PR fixes #66, which seems like a good idea for using FFTW in an environment with more pervasive threading but is not strictly required for partr support.
Switched to the new build from JuliaPackaging/Yggdrasil#53, which uses different compiler versions, thanks to @staticfloat. Let's see if that helps.
Sorry, I had a typo in the
…t__) and explicitly call set_num_threads(1) for no-threads check since multithreaded Julia now uses multiple FFTW threads by default
Convolutions in DSP currently rely on FFTW.jl, and a recent change in FFTW.jl (JuliaMath/FFTW.jl#105) has introduced a large performance regression in `conv` whenever Julia is started with more than one thread. Since v1, FFTW.jl uses multi-threaded FFTW transforms by default whenever Julia has more than one thread. This new default causes small FFT problems to run much more slowly and use much more memory. Because the overlap-save method of `conv` in DSP breaks a convolution into many small convolutions, and therefore performs a large number of small FFTW transforms, this change can make convolutions two orders of magnitude slower and use two orders of magnitude more memory.
While FFTW.jl does not provide an explicit way to set the number of threads used by an FFTW plan without changing a global variable, generating the plans with the planning flag set to `FFTW.PATIENT` (instead of the default `MEASURE`) allows the planner to consider changing the number of threads. Adding this flag to the plans generated by the overlap-save convolution method seems to rescue the performance regression on multi-threaded instances of Julia.
Fixes JuliaDSP#399
Also see JuliaMath/FFTW.jl#121
Support partr threads (JuliaLang/julia#31398) on the latest Julia master branch via FFTW/fftw3#175.
When using the partr backend, it sets the number of FFTW "threads" to 4*nthreads by default: we want to spawn more tasks than we have threads, to help with load balancing if other work is running.
It also enables the thread-safe FFTW planner (which puts a mutex lock around plan creation). In the longer run, it would be better to do the locking on the Julia side, since presumably using a Julia lock would allow other Julia tasks to wake up. Closes #66.
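The default thread count described above can be overridden. Here is a minimal sketch, assuming that `FFTW.set_num_threads` (the call mentioned in the commit note above) applies to plans created after it is called. The array size and the variable names are illustrative only, and the `4 * Threads.nthreads()` value is computed by hand here rather than queried from the library:

```julia
using FFTW

# The default described above for the partr backend: 4 * nthreads FFTW "threads".
# (Computed by hand for illustration; not read back from FFTW.)
default_fftw_threads = 4 * Threads.nthreads()

# For many small transforms the threading overhead can dominate,
# so force a single FFTW thread for plans created from here on.
FFTW.set_num_threads(1)
x = rand(ComplexF64, 64)
y_single = fft(x)

# Switch back to a multi-threaded setting for subsequent plans.
FFTW.set_num_threads(default_fftw_threads)
y_multi = fft(x)
```

Both calls compute the same transform; only the number of threads that FFTW may use differs.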
To get the full benefit of threading, you should precompute the FFT plan via `p = plan_fft(array)` [or `p = plan_fft(array, flags=FFTW.MEASURE)` or `p = plan_fft(array, flags=FFTW.PATIENT)` if you want it to self-optimize the plan], rather than calling `fft(array)` directly.

cc @vtjnash, who tested this at JuliaCon. I think I included all of the fixes we did on your machine.
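The plan-then-apply pattern recommended above might look like the following sketch. The buffer size and names are arbitrary. Planning before filling in the data is a precaution on my part, on the assumption that `MEASURE` or `PATIENT` planning may overwrite the array passed to the planner:

```julia
using FFTW

n = 1024                       # arbitrary illustrative size
buf = zeros(ComplexF64, n)     # scratch buffer handed to the planner

# Plan first: with MEASURE (or PATIENT) the planner benchmarks candidate
# algorithms and may overwrite the array it is given.
p = plan_fft(buf; flags=FFTW.MEASURE)

# Fill in the real data afterwards, then apply the precomputed plan.
buf .= rand(ComplexF64, n)
y = p * buf
```

The same plan `p` can be reused for any other array of the same size and element type, which is where precomputing pays off.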