v5.4.3
CUDA v5.4.3
Merged pull requests:
- add cublasgetrsBatched (#2385) (@bjarthur)
- add two quirks for rationals (#2403) (@lanceXwq)
- Bump cuDNN (#2404) (@maleadt)
- Add convert method for ScaledPlan (#2409) (@david-macmahon)
- Conditionalize a quirk. (#2411) (@maleadt)
- Relax signature of generic matvecmul! (#2414) (@dkarrasch)
- Fix kron launch configuration. (#2418) (@maleadt)
- Run full GC when under very high memory pressure. (#2421) (@maleadt)
- Enzyme: Fix cuarray return type (#2425) (@wsmoses)
- CompatHelper: bump compat for LLVM to 8, (keep existing compat) (#2426) (@github-actions[bot])
- pre-allocated pivot and info buffers for getrf_batched (#2431) (@bjarthur)
- Profiler tweaks. (#2432) (@maleadt)
- Update the Julia wrappers for CUDA v12.5.1 (#2436) (@amontoison)
- Correct workspace handling (#2437) (@maleadt)
Closed issues:
- Legacy cuIpc* APIs incompatible with stream-ordered allocator (#1053)
- Broadcasted multiplication with a rational doesn't work (#1926)
- Incorrect grid size in kron (#2410)
- GEMM of non-contiguous inputs should dispatch to fallback implementation (#2412)
- Failure of Eigenvalue Decomposition for Large Matrices. (#2413)
- CUDA_Driver_jll's lazy artifacts cause a precompilation-time warning (#2415)
- Recurrence of integer overflow bug (#1880) for a large matrix (#2427)
- CUDA kernel crash very occasionally when MPI.jl is just loaded. (#2429)
- CUDA_Runtime_Discovery Did not find cupti on Arm system with nvhpc (#2433)
- CUDA.jl won't install/run on Jetson Orin NX (#2435)