Release v5.4.3 · JuliaGPU/CUDA.jl

CUDA v5.4.3

Merged pull requests:

add cublasgetrsBatched (#2385) (@bjarthur)
add two quirks for rationals (#2403) (@lanceXwq)
Bump cuDNN (#2404) (@maleadt)
Add convert method for ScaledPlan (#2409) (@david-macmahon)
Conditionalize a quirk. (#2411) (@maleadt)
Relax signature of generic matvecmul! (#2414) (@dkarrasch)
Fix kron launch configuration. (#2418) (@maleadt)
Run full GC when under very high memory pressure. (#2421) (@maleadt)
Enzyme: Fix cuarray return type (#2425) (@wsmoses)
CompatHelper: bump compat for LLVM to 8 (keep existing compat) (#2426) (@github-actions[bot])
pre-allocated pivot and info buffers for getrf_batched (#2431) (@bjarthur)
Profiler tweaks. (#2432) (@maleadt)
Update the Julia wrappers for CUDA v12.5.1 (#2436) (@amontoison)
Correct workspace handling (#2437) (@maleadt)

Closed issues:

Legacy cuIpc* APIs incompatible with stream-ordered allocator (#1053)
Broadcasted multiplication with a rational doesn't work (#1926)
Incorrect grid size in kron (#2410)
GEMM of non-contiguous inputs should dispatch to fallback implementation (#2412)
Failure of Eigenvalue Decomposition for Large Matrices. (#2413)
CUDA_Driver_jll's lazy artifacts cause a precompilation-time warning (#2415)
Recurrence of integer overflow bug (#1880) for a large matrix (#2427)
CUDA kernel crash very occasionally when MPI.jl is just loaded. (#2429)
CUDA_Runtime_Discovery Did not find cupti on Arm system with nvhpc (#2433)
CUDA.jl won't install/run on Jetson Orin NX (#2435)