Using OpenACC #341
ImanHosseini
started this conversation in
Ideas
I have a suggestion: use OpenACC. It supports all three targets (SIMD, multicore, and GPU) and is very convenient. I see that there is already a CUDA implementation of the kernels; OpenACC interoperates with CUDA, so the code before and after those CUDA kernels can also be offloaded to the GPU with OpenACC.
eval_ker_expts_libin_simd64
As a demonstration, I looked at 'eval_ker_expts_libin_simd64': rewriting the OpenMP loop with OpenACC is straightforward (file: https://gist.github.com/ImanHosseini/c3ffc350c25d746cf8d524e6ede94573)
The nice thing about this is that the same code can be compiled for SIMD, multicore, or GPU, and all it took was a single "acc kernels" pragma.
On my machine (RTX 3060 Ti, 14-core i9-9940X):
Fortran
As a second example, I took a Fortran example, nufft1d_demo, and changed it to use Fortran "do concurrent" and OpenACC (for 'errcomp'), i.e.:
and:
and:
Compiled as:
Result: