partr thread support #105

stevengj · 2019-07-30T19:23:33Z

Support partr threads (JuliaLang/julia#31398) on the latest Julia master branch via FFTW/fftw3#175.

When using the partr backend, it sets the number of FFTW "threads" to 4*nthreads by default, because we want to spawn more tasks than we have threads to help with load balancing if other work is running.
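As a rough illustration of the oversubscription described above, the same setting can be applied by hand with `FFTW.set_num_threads`. This is only a sketch of the idea, not the PR's actual initialization code; the guard for single-threaded Julia and the array size are assumptions made for the example.

```julia
using FFTW

# Sketch only: mirror the "4x the Julia threads" default by hand.
# Assumption: with a single Julia thread we keep FFTW single-threaded.
nt = Threads.nthreads()
nfftw = nt > 1 ? 4 * nt : 1
FFTW.set_num_threads(nfftw)

# Transforms planned after this point may be split into up to `nfftw` tasks.
x = rand(ComplexF64, 512, 512)
y = fft(x)
```

Start Julia with, for example, `JULIA_NUM_THREADS=4` to see the multi-threaded path; the setting is global and affects plans created afterwards.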

It also enables the thread-safe FFTW planner (which puts a mutex lock around plan creation). In the longer run, it would be better to do the locking on the Julia side, since presumably using a Julia lock would allow other Julia tasks to wake up. Closes #66.
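A minimal sketch of what locking on the Julia side could look like is given here. `PLAN_LOCK` and `locked_plan_fft` are hypothetical names invented for this illustration; they are not part of FFTW.jl, and this is not how the PR implements thread safety.

```julia
using FFTW

# Hypothetical helper (not FFTW.jl API): serialize plan creation with a Julia lock,
# so that tasks waiting for the planner block in the Julia scheduler.
const PLAN_LOCK = ReentrantLock()

function locked_plan_fft(x; kwargs...)
    lock(PLAN_LOCK) do
        plan_fft(x; kwargs...)
    end
end

# Create several plans concurrently; each call takes the lock in turn.
plans = Vector{Any}(undef, 4)
Threads.@threads for i in 1:4
    plans[i] = locked_plan_fft(zeros(ComplexF64, 64 * i))
end
```

Only planning is serialized in this sketch; applying an existing plan to data does not take the lock.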

To get the full benefit of threading, you should precompute the FFT plan via `p = plan_fft(array)` [or `p = plan_fft(array, flags=FFTW.MEASURE)` or `p = plan_fft(array, flags=FFTW.PATIENT)` if you want it to self-optimize the plan], rather than calling `fft(array)` directly.
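For readers unfamiliar with plans, a short sketch of the precomputed-plan workflow follows. The array size and variable names are arbitrary choices for the example. Note that planning with `FFTW.MEASURE` or `FFTW.PATIENT` may overwrite the array passed to the planner, so the plan is created before the data is filled in.

```julia
using FFTW, LinearAlgebra

# Allocate the buffer first and plan on it; MEASURE may scribble over its contents.
x = zeros(ComplexF64, 4096)
p = plan_fft(x; flags=FFTW.MEASURE)

# Now fill in the real data and apply the plan.
x .= rand(ComplexF64, length(x))
y = p * x            # allocates the output

# Reuse a preallocated output buffer on repeated transforms.
y2 = similar(x)
mul!(y2, p, x)
```

The plan can be reused for any array with the same size, element type, and memory layout, which is where the planning cost pays off.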

cc @vtjnash, who tested this at JuliaCon. I think I included all of the fixes we did on your machine.

stevengj · 2019-07-30T19:44:18Z

Hmm, AppVeyor is crashing on x64 with:

Please submit a bug report with steps to reproduce this fault, and any error messages that follow (in their entirety). Thanks.
Exception: EXCEPTION_ACCESS_VIOLATION at 0x1ed37c3b -- VMOVAPD_LD at C:\projects\fftw-jl\deps\usr\bin\libfftw3-3.dll (unknown line)
in expression starting at C:\projects\fftw-jl\test\runtests.jl:10
VMOVAPD_LD at C:\projects\fftw-jl\deps\usr\bin\libfftw3-3.dll (unknown line)
hc2cfdftv_20 at C:\projects\fftw-jl\deps\usr\bin\libfftw3-3.dll (unknown line)
apply_extra_iter at C:\projects\fftw-jl\deps\usr\bin\libfftw3-3.dll (unknown line)
apply_dit_dft at C:\projects\fftw-jl\deps\usr\bin\libfftw3-3.dll (unknown line)
.text at C:\projects\fftw-jl\deps\usr\bin\libfftw3-3.dll (unknown line)
unsafe_execute! at C:\projects\fftw-jl\src\fft.jl:407 [inlined]
* at C:\projects\fftw-jl\src\fft.jl:729

The AppVeyor tests are single-threaded (JULIA_NUM_THREADS should be 1, the default), so they are not using any of the new partr code in this PR.

I can only speculate that something went wrong in the x64 BinaryBuilder cross-compile with the latest build tools…

ViralBShah · 2019-08-04T19:40:20Z

Is fixing #66 (thread-safe planner) relevant for this PR?

stevengj · 2019-08-17T23:42:30Z

@ViralBShah, this PR fixes #66, which seems like a good idea for using FFTW in an environment with more pervasive threading but is not strictly required for partr support.

stevengj · 2019-09-05T01:51:52Z

Switched to the new build from JuliaPackaging/Yggdrasil#53, which uses different compiler versions, thanks to @staticfloat. Let's see if that helps.

staticfloat · 2019-09-05T05:41:44Z

Sorry, I had a typo in the build.jl I gave you. Fixed now.


Convolutions in DSP currently rely on FFTW.jl, and a recent change in FFTW.jl (JuliaMath/FFTW.jl#105) has introduced a large performance regression in `conv` whenever Julia is started with more than one thread. Since v1, FFTW.jl uses multi-threaded FFTW transformations by default whenever Julia has more than one thread. This new default causes small FFT problems to run much more slowly and use much more memory. Since the overlap-save method of `conv` in DSP breaks a convolution into many small convolutions, and therefore performs a large number of small FFTW transformations, this change can make convolutions slower by two orders of magnitude and use two orders of magnitude more memory. While FFTW.jl does not provide an explicit way to set the number of threads used by an FFTW plan without changing a global variable, generating the plans with the planning flag set to `FFTW.PATIENT` (instead of the default `MEASURE`) allows the planner to consider changing the number of threads. Adding this flag to the plans generated by the overlap-save convolution method seems to rescue the performance regression on multi-threaded instances of Julia. Fixes JuliaDSP#399. Also see JuliaMath/FFTW.jl#121.
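The workaround described in that commit message can be sketched roughly as follows. The block length and variable names are assumptions for illustration; this is not the actual DSP.jl code.

```julia
using FFTW

# Sketch: plan a small real-input FFT with PATIENT so the planner can also
# weigh how many threads to use for this problem size.
block = zeros(Float64, 256)              # assumed overlap-save block length
p = plan_rfft(block; flags=FFTW.PATIENT)

# Fill the buffer after planning, since PATIENT planning may overwrite it.
block .= rand(length(block))
spectrum = p * block                     # reuse `p` for every block of the same size
```

Planning with `FFTW.PATIENT` takes longer than the cheaper planner flags, so it pays off mainly when the same plan is applied many times, as in overlap-save convolution.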

partr thread support

09ff744

stevengj mentioned this pull request Jul 30, 2019

build 3.3.9-alpha1 JuliaMath/FFTWBuilder#1

Merged

This was referenced Aug 4, 2019

partr thread support for openblas JuliaLang/julia#32786

Closed

Enable FFTW threading by default (to match up to performance of octave and others) JuliaLang/julia#17000

Closed

antoine-levitt mentioned this pull request Aug 5, 2019

Threading JuliaMolSim/DFTK.jl#15

Closed

stevengj mentioned this pull request Aug 31, 2019

Pkg.add("FFTW") fails in Windows Julia 1.0.3 #84

Closed

switch to Yggdrasil build

37cb9c6

stevengj referenced this pull request Sep 5, 2019

Update build_fftw.jl

a6e1d32

stevengj force-pushed the partr branch from fe8c19e to 37cb9c6 Compare September 5, 2019 01:57

Fix problem in build_fftw.jl

8ffd30a

move threads check to end (since threads are now initialized in __init__) and explicitly call set_num_threads(1) for no-threads check since multithreaded Julia now uses multiple FFTW threads by default

ef1019c

stevengj merged commit 527d076 into master Sep 5, 2019

stevengj deleted the partr branch September 5, 2019 13:09

stevengj mentioned this pull request Sep 11, 2019

Add FFTW builder JuliaPackaging/Yggdrasil#53

Merged

tknopp mentioned this pull request Nov 19, 2019

Multithreading JuliaMath/NFFT.jl#19

Closed

stevengj mentioned this pull request Jan 16, 2020

slowdown in threaded code from julia 1.2 to julia 1.4-DEV #121

Closed

antoine-levitt mentioned this pull request Jan 24, 2020

Thread safety #134

Open

chrisbrahms mentioned this pull request Apr 27, 2020

FFTW wisdom is not thread safe? LupoLab/Luna.jl#160

Closed

galenlynch mentioned this pull request May 24, 2020

Workaround for performance regression introduced by FFTW JuliaDSP/DSP.jl#362

Closed

glwagner mentioned this pull request Oct 29, 2020

When multithreading use 4 times more threads for FFTW CliMA/Oceananigans.jl#1113

Closed

navidcy mentioned this pull request Nov 16, 2020

FFTW can use 4x available threads FourierFlows/FourierFlows.jl#223

Closed

chriselrod mentioned this pull request Jan 23, 2022

Polyester makes FFTW slower JuliaSIMD/Polyester.jl#63

Open
