Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
ARIMA - Kalman loop rewrite: single megakernel instead of host loop (r…
…apidsai#4006) This PR brings **speedups** of the order of **10x** to seasonal ARIMA. It replaces a legacy host loop based on cuBLAS batched operations and RAPIDS prims, with a custom kernel, reducing launch overheads and unnecessary reads and writes in global memory. On top of that, it paves the way for support for missing observations. The PR introduces a set of prims in `linalg/block.cuh` to compute block-local linear algebra operations, and corresponding unit tests in `test/prims/linalg_block.cu`. Authors: - Louis Sugy (https://github.com/Nyrio) Approvers: - Tamas Bela Feher (https://github.com/tfeher) - Robert Maynard (https://github.com/robertmaynard) URL: rapidsai#4006
- Loading branch information