# Parallelization
TMB uses the following BLAS kernels when calculating the function value and its derivatives:
| Function | Gradient |
|----------|----------|
| dgemm    | dgemm    |
| dsyrk    | dsymm    |
| dtrsm    | dtrsm    |
| dpotrf   | dpotri   |
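
Before switching libraries, it can help to check which BLAS/LAPACK your R session is actually linked against. A minimal check, assuming a recent R version (>= 3.4, where the library paths are reported):

```r
# Print session details; recent R versions list the BLAS and LAPACK
# shared libraries in the output.
sessionInfo()

# Path of the LAPACK library currently in use (R >= 3.4).
La_library()
```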
If your model spends a significant amount of time in these BLAS operations, you may benefit from an optimized BLAS library, e.g. MKL or OpenBLAS for the CPU, or nvblas for the GPU. For a good result it is critical that (a rough benchmark sketch follows the list):
- All required BLAS kernels are part of the library (currently not the case for nvblas?).
- The library should not add significant overhead for small matrices (OpenBLAS has had problems with this in the past; is that still the case?).
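
One rough way to test both points is to time a kernel such as dpotrf (reached via `chol()` in R) on both a large matrix and many small matrices, and compare the timings across BLAS libraries. A minimal sketch:

```r
# Rough benchmark of dpotrf (Cholesky) with the currently linked BLAS.
# Rerun the same script after switching BLAS libraries and compare timings.
set.seed(1)

# Large matrix: an optimized BLAS should be clearly faster here.
n <- 2000
A <- crossprod(matrix(rnorm(n * n), n))  # symmetric positive definite
print(system.time(chol(A)))

# Many small matrices: a good library should not add noticeable
# per-call overhead relative to the reference BLAS.
m <- 20
B <- crossprod(matrix(rnorm(m * m), m))
print(system.time(for (i in 1:10000) chol(B)))
```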