v1.2.1

github-actions released this 31 Jul 08:08
· 2702 commits to master since this release
527d364

CUDA v1.2.1

Diff since v1.2.0

Closed issues:

  • CuArrays.zeros(T, 0) fails (#81)
  • CUDAnative.cos calls the base cos function in nested broadcast (#102)
  • CuSparseMatrixHYB * CuMatrix = nothing (#256)
  • Strange reordering of struct fields with dynamic parallelism (#263)
  • Performance: bias add (#298)
  • CUDA 11 libraries incorrectly looked up in artifact (#300)
  • CUTENSOR for windows (#301)
  • Performance: sum (#302)
  • Performance: getindex(a, i::Array{Int}) (#303)
  • Display for CuArray within Tuples does not respect :limit=>true (#305)
  • Performance: elementwise operations (#307)
  • Performance: perceptron (#312)
  • windows install error: isfile(__libcupti[]) (#324)
  • std with dims is not type stable (#336)

Merged pull requests:

  • Re-enable threading tests. (#25) (@maleadt)
  • Reorganize and simplify some includes (#296) (@maleadt)
  • Only run benchmarks on the master branch. (#297) (@maleadt)
  • Optimizations for broadcast (#299) (@maleadt)
  • Update manifest (#304) (@github-actions[bot])
  • Test runner improvements for multigpu mode (#309) (@maleadt)
  • Artifact improvements for CUDA 11 on Windows (#310) (@maleadt)
  • Optimize element-wise operations (#313) (@maleadt)
  • Check if reported GPU memory use is available. (#314) (@maleadt)
  • Update artifacts: include cusolverMg, and use Yggdrasil binaries. (#315) (@maleadt)
  • Specialization fixes for mapreducedim. (#316) (@maleadt)
  • Fix invalid conversion of pointer to signed integer. (#317) (@maleadt)
  • Work around (presumed) Windows driver bug in exception test. (#319) (@maleadt)
  • Update manifest (#323) (@github-actions[bot])
  • Bump CUDNN and CUTENSOR (#325) (@maleadt)
  • Simplify NVML discovery. (#326) (@maleadt)
  • Separate CURAND wrappers from Random impl. (#327) (@maleadt)
  • Simplify discovering binaries by using Sys.which. (#328) (@maleadt)
  • Add wrapper for NVML utilization rates. (#329) (@maleadt)
  • Attach CUSPARSE docstrings to bare methods, not empty functions. (#331) (@maleadt)
  • Eagerly reduce the amount of worker threads. (#332) (@maleadt)
  • Bump dependencies. (#333) (@maleadt)
  • Clean-up library wrappers [NFC] (#334) (@maleadt)
  • Fix CUDNN v8 discovery and loading on Windows (#335) (@maleadt)
  • Fix type stability of Statistics.var with dims. (#337) (@maleadt)
  • Fix parameter alignment for dynamic parallelism. (#338) (@maleadt)
  • Micro-optimize Base.fill. (#339) (@maleadt)