cuSOLVER provides a routine (syevjBatched) to perform batched diagonalization of Hermitian matrices:
https://docs.nvidia.com/cuda/cusolver/index.html#cusolverdn-t-syevj
The performance benefit likely depends strongly on matrix size and batch count; see this benchmark thread: https://discourse.julialang.org/t/eigenvalues-for-lots-of-small-matrices-gpu-batched-vs-cpu-eigen/50792
We could consider using this for accelerating LSWT. Note that for many LSWT calculations, especially in dipole mode, the diagonalization subroutine itself may not be the dominant cost. To make this beneficial, we would probably need to move a lot of the calculation onto the GPU (e.g., the matrix-builds for each q).
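To make the pattern concrete, here is a minimal CPU-side sketch of the batched workflow described above: build the q-dependent Hermitian matrix for every q at once as a stacked array, then diagonalize the whole stack in a single call. The model Hamiltonian and function names are purely illustrative (not Sunny's API); NumPy's stacked `eigh` is used as a stand-in for cuSOLVER's syevjBatched.

```python
import numpy as np

def build_hamiltonians(qs, J=1.0):
    """Toy 2x2 Hermitian matrix for each q (hypothetical model, for
    illustration only). Builds the whole batch with vectorized ops,
    which is the part that would need to live on the GPU."""
    n = len(qs)
    H = np.zeros((n, 2, 2), dtype=complex)
    H[:, 0, 0] = 2 * J
    H[:, 1, 1] = 2 * J
    H[:, 0, 1] = J * np.exp(1j * qs)
    H[:, 1, 0] = np.conj(H[:, 0, 1])  # enforce Hermiticity
    return H

qs = np.linspace(0, np.pi, 128)
H = build_hamiltonians(qs)           # shape (128, 2, 2): one matrix per q
evals, evecs = np.linalg.eigh(H)     # one batched call over the whole stack
```

On a GPU, the same stacked-array shape works with e.g. `cupy.linalg.eigh`, which dispatches small-matrix batches to cuSOLVER's Jacobi-based batched solver; the key point either way is that both the matrix build and the diagonalization are expressed as one operation over the batch, avoiding a per-q round trip.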