Releases: IntelPython/numba-dpex

0.23.0

28 May 17:26
46e90f6

Fixed

  • Array alignment problem for stack arrays allocated for kernel arguments. (#1357)
  • Issues #892 and #906, caused by incorrect code generation for indexing (#1377)
  • Generation of KernelHasReturnValueError error inside KernelDispatcher. (#1394)
  • Issue #1390: broken support for slicing into dpctl.tensor.usm_ndarray in kernels (#1425)
  • Support for Wheels package on Windows (#1430)
  • Incorrect mangled name for kernel function arguments (#1443)
  • Remove artifacts from conda/wheel packages residing in root level (#1450)
  • GDB tests to work properly on Intel Max GPU (#1451)
  • Improper wheels installation on unsupported platforms (#1452)
  • Ref-counting of Python object temporaries in unboxing code (#1454)
  • Segfault caused by using malloc to allocate NRT_MemInfo. Replaced with Numba's NRT alloc (#1458)
  • Incorrect package name in README.md (#1463)

Added

  • A new overloaded dimensions attribute for all index-space id classes (#1359)
  • Support for AtomicRef creation using multi-dimensional arrays (#1367)
  • Support for linearized indexing functions inside a JIT compiled kernel (#1368)
  • Improved documentation: overview (#1341), kernel programming guide (#1388), API docs (#1414), config options (#1415), comparison with SYCL API (#1417)
  • New PrivateArray class in kernel_api to replace dpex.private.array (#1370, #1377)
  • Support for libsyclinterface::DPCTLKernelArgType enum for specifying the type of kernel args instead of hard coding (#1382)
  • New indexing unit tests for kernel_api simulator and JIT compiled modes (#1378)
  • New unit tests to verify all kernel_api features usable inside device_func (#1391)
  • A sycl::local_accessor-like API (kernel_api.LocalAccessor) for numba-dpex kernel (#1331)
  • Specialization support for device_func decorator (#1398)
  • Support for all kernel_api functions inside the numba_dpex.kernel decorator. (#1400)
  • Support for dpnp 0.15 (#1434, #1464)
  • Improvements to pyproject.toml configs to build numba-dpex from source. (#1449)
  • Load the SPV_INTEL_variable_length_array SPIR-V extension to support arrays in private address-space on Intel Max GPU. (#1451)
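
The linearized indexing support listed above (#1368) concerns collapsing a multi-dimensional work-item index into a single integer. As an illustration only, the pure-Python sketch below shows the row-major linearization that SYCL uses for its linear ids, where the last dimension varies fastest. The `linearize` helper is hypothetical and is not part of the numba-dpex API.

```python
def linearize(index, extents):
    """Collapse a multi-dimensional index into a row-major linear id.

    Hypothetical helper for illustration; not a numba-dpex function.
    """
    if len(index) != len(extents):
        raise ValueError("index and extents must have the same rank")
    linear = 0
    for i, n in zip(index, extents):
        if not 0 <= i < n:
            raise IndexError(f"index {i} out of range for extent {n}")
        # Shift what has been accumulated by this dimension's extent,
        # then add the position within this dimension.
        linear = linear * n + i
    return linear


print(linearize((1, 2), (3, 4)))        # 1 * 4 + 2 = 6
print(linearize((1, 2, 3), (2, 3, 4)))  # (1 * 3 + 2) * 4 + 3 = 23
```

Inside a real kernel the equivalent value would come from the index-space id object rather than from a helper like this one.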

Changed

  • Default inline threshold value set to 2 from None. (#1385)
  • Port parfor kernel templates to kernel_api (#1416, #1424)
  • Use SPIRVKernelDispatcher for parfor kernel dispatch (#1435, #1448)
  • All examples use the latest dpctl API (#1431)
  • Minimum required dpctl version is now 0.16.1
  • Minimum required numba version is now 0.59.0 (#1462)
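
The new minimum versions can be verified before upgrading. The following is a minimal sketch, assuming plain dotted numeric version strings (pre-release suffixes such as `rc1` are ignored). The `MINIMUMS` table and both helper functions are hypothetical names introduced here, not part of numba-dpex.

```python
import re

# Minimum versions stated in the 0.23.0 release notes.
MINIMUMS = {"dpctl": "0.16.1", "numba": "0.59.0"}


def parse(version):
    """Return the leading numeric components of a dotted version string."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(package, installed):
    """Check an installed version string against the table above."""
    return parse(installed) >= parse(MINIMUMS[package])


print(meets_minimum("dpctl", "0.16.0"))  # False
print(meets_minimum("numba", "0.59.1"))  # True
```

In practice the installed version string could be obtained with `importlib.metadata.version("dpctl")` from the standard library.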

Removed

  • OpenCL-like kernel API functions (#1420)
  • func decorator (replaced by device_func) (#1400)
  • numba_dpex.experimental.kernel and numba_dpex.experimental.device_func (#1400)

0.22.1

20 Feb 16:44
0c3fb55

Fixed

  • Bug in boxing a DpnpNdArray from parent (#1155)
  • Strided layouts and F-contiguous layouts supported in experimental kernel (#1178)
  • Barrier call code-generation on OpenCL CPU devices (#1280, #1310)
  • Importing numba-dpex can break numba execution (#1267)
  • Overhead on launching numba_dpex.kernel functions (#1236)

Added

  • Support for dpctl.SyclEvent data type inside dpjit (#1134)
  • Support for kernel_api.Range and kernel_api.NdRange inside dpjit (#1148)
  • DPEX_OPT: a numba-dpex-specific optimization level config option (#1158)
  • Uploading wheels packages to anaconda (#1160)
  • flake8 eradicate linter option (#1177)
  • Support dpctl.SyclEvent.wait call inside dpjit (#1179)
  • Creation of sycl event and queue inside dpjit (#1193, #1190, #1218)
  • Experimental kernel dispatcher for kernel compilation (#1178, #1205)
  • Experimental target context for SPIRV codegen (#1213, #1225)
  • GDB test cases in public CI (#1209)
  • Async kernel submission option (#1219, #1249)
  • A new literal type to store IntEnum as Literal types (#1227)
  • SYCL-like memory enum classes to the experimental module (#1239)
  • call_kernel function to launch kernels (#1260)
  • Experimental overloads for an AtomicRef class and fetch_* methods (#1257, #1261)
  • New device-specific USMNdArrayModel for USMNdArray and DpnpNdArray types (#1293)
  • Experimental atomic load, store and exchange operations (#1297)
  • Kernel_api module to simulate kernel functions in pure Python (#1304, #1326)
  • Experimental implementation of group barrier operation (#1280)
  • Experimental atomic compare_exchange implementation (#1312)
  • Experimental group index class (#1310)
  • OpenSSF scorecard (#1320)
  • Experimental feature index overload methods (#1323)
  • Experimental feature group index overload methods (#1330)
  • API Documentation for kernel API (#1332)

Changed

  • Switch to dpc++ compiler for building numba-dpex (#1210)
  • Versioneer and pytest configs into pyproject.toml (#1212)
  • numba-dpex can be imported even if no SYCL device is detected by dpctl (#1272)

Removed

  • Kernel launch params as lists/tuples. Only Range/NdRange are supported (#1251)
  • DEFAULT_LOCAL_SIZE global constant (#1291)
  • Functions to invoke spirv-tools utilities from spirv_generator (#1292)
  • Incomplete vectorize decorator from numba-dpex (#1298)
  • Support for Numba 0.57 (#1307)

Deprecated

  • OpenCL-like kernel API functions in numba_dpex.ocldecl module

0.21.4

12 Oct 21:37
8e449c9

Fixed

  • Remove dead code to silence Coverity errors. (#1163)

0.21.3

28 Sep 19:19
5aaf492

Fixed

  • Pin CI conda channels (#1133)
  • Mangled kernel name generation (#1112)

Changed

  • Rename helper function to clearly indicate its usage (#1145)
  • The data model used by the DpnpNdArray type for kernel functions (#1118)

Removed

  • Support for Python 3.8 (#1113)

0.21.2

16 Aug 16:10
4d332e8

Fixed

  • Bugs (#1068, #774) in atomic addition caused by improper floating point atomic emulation. (#1103)

Changed

  • Updated documentation and user guides (#1097, #879)

Removed

  • Dependency on spirv-tools (#1103, #1108)
  • floating point atomic add emulation using atomic_ops.cl (#1103)
  • NUMBA_DPEX_ACTIVATE_ATOMICS_FP_NATIVE configuration option (#1103)

0.21.1

19 Jul 15:17
a421647

Changed

  • Improved support for queue keyword in dpnp array constructor overloads (#1083)
  • Improved reduction kernel example (#1089)

Fixed

  • Update Itanium CXX ABI Mangler reference (#1080)
  • Update sourceware references in docstrings (#1081)
  • Typo in error messages of kernel interface (#1082)

0.21.0 Clean House

17 Jun 07:28
c37e360

Added

  • Support addition and multiplication-based prange reduction loops (#999)
  • Proper boxing, unboxing of dpctl.SyclQueue objects inside dpjit decorated functions (#963, #1064)
  • Support for queue keyword arguments inside dpnp array constructors in dpjit (#1032)
  • Overloads for dpnp array constructors: dpnp.full (#991), dpnp.full_like (#997)
  • Support for complex64 and complex128 types as kernel arguments and in parfors (#1033, #1035)
  • New config to run the ConstantSizeStaticLocalMemoryPass optionally (#999)
  • Support for Numba 0.57 (#1030, #1003, #1002)
  • Support for Python 3.11 (#1054)
  • Support for SPIRV 1.4 (#1056, #1060)

Changed

  • Parfor lowering happens using the kernel pipeline (#996)
  • Minimum required Numba version is now 0.57 (#1030)
  • Numba monkey patches are moved to numba_dpex.numba_patches (#1030)
  • Redesigned unit test suite (#1018, #1017, #1015, #1036, #1037, #1072)

Fixed

  • Fix stride computation when unboxing a dpnp array (#1023)
  • Using cached queue instead of creating new one on type inference (#946)
  • Fixed bug in reduction mul operation for dpjit (#1048)
  • Offload of parfor nodes to OpenCL UHD GPU devices (#1074)

Removed

  • Support for offloading NumPy-based parfor nodes to SYCL devices (#1041)
  • Removed rename_numpy_functions_pass (#1041)
  • Dpnp overloads using stubs (#1041, #1025)
  • Support for like keyword argument in dpnp array constructor overloads (#1043)
  • Support for NumPy arrays as kernel arguments (#1049)
  • Kernel argument access specifiers (#1049)
  • Support for dpctl.device_context to launch kernels and njit offloading (#1041)

0.20.1

08 Apr 06:55
dc25504

Added

  • Replaced llvm_spirv from oneAPI path by dpcpp-llvm-spirv package. (#979)
  • Added Dockerfile and a manual workflow to publish pre-built packages to the repo. (#973)

Fixed

  • Fixed default dtype derivation when creating a dpnp.ndarray. (#993)
  • Adjusted test_windows step to work with intel-opencl-rt=2023.1.0. (#990)
  • Fixed layout in dpnp overload. (#987)
  • Handled the case when arraystruct->meminfo is null to close gh-965. (#972)

0.20.0 Phoenix

11 Mar 21:22

Added

  • New dpjit decorator supporting dpnp compilation (#887)
  • Boxing and unboxing functionality for dpnp.ndarray to numba_dpex (#902)
  • New DpexTarget and dispatcher for compiling dpnp using numba-dpex (#887)
  • Overload implementation for dpnp.empty (#902)
  • Overload implementation for dpnp.empty_like, dpnp.zeros_like and
    dpnp.ones_like inside dpjit (#928)
  • Overload implementation for dpnp.zeros and dpnp.ones inside dpjit (#923)
  • Compilation and offload support for dpnp vector style expressions using Numba
    parfors (#957)
  • Compilation of over 70 ufuncs for dpnp inside dpjit (#957)
  • Backported the split parfor pass from upstream Numba. (#949)
  • Numba type aliases to numba_dpex. (#851)
  • Numba prange alias inside numba_dpex. (#957)
  • New LRU cache for kernels (#804) and funcs (#877)
  • New Range and NdRange classes for kernel submission that follow sycl's range
    and ndrange syntax. (#888)
  • Monkey patches to Numba 0.56.4 to support dpnp ufuncs and allocating dpnp
    arrays (#954)
  • New config flag (NUMBA_DPEX_DUMP_KERNEL_LLVM) to dump a kernel's
    LLVM IR (#924)
  • A badge to our gitter chatroom (#919)
  • A small script to update copyright headers (#917)
  • A new dpexrt_python extension to support USM allocators for Numba
    NRT_MemInfo (#902)
  • Updated examples for kernel API demonstrating compute-follows-data programming
    model. (#826)

Changed

  • CLK_GLOBAL_MEM_FENCE and CLK_LOCAL_MEM_FENCE flags renamed to
    GLOBAL_MEM_FENCE and LOCAL_MEM_FENCE. (#844)
  • Switched from Ubuntu-latest to Ubuntu-20.04 for conda package build (#836)
  • Rename USMNdArrayType to USMNdArray (#851)
  • The Numba type representing dpnp ndarray types is renamed to
    DpnpNdarray (#880)
  • Improved exceptions and user errors (#804)
  • Updated internal API for kernel interface with improved support for
    __sycl_usm_array_interface__ protocol (#804)
  • Pin generated spirv version for kernels to 1.1 (#885)
  • Rename DpexContext and DpexTypingContext to DpexKernelTarget and
    DpexKernelTypingContext (#887)
  • Renamed existing dpnp overloads that used stubs to dpnp_stubs_impl.py (#953)
  • Dpctl version requirement mismatch is now a warning and not an
    ImportError (#925)
  • Update to versioneer 0.28 (#827)
  • Update to dpctl 0.14 (#858)
  • Update linters: black to 23.1.0, isort to 5.12.0 (#900)
  • License in setup.py to match actual project licensing (#904)

Fixed

  • Kernel specialization, compute follows data programming model for
    kernels (#804)
  • Dispatcher/caching rewrite to address performance regression (#912, #896)
  • Qualname ambiguity in the func decorator (#905)

Removed

  • The numpy_usm_shared module from numba_dpex. (#841)
  • Usage of llvmlite.llvmpy (#932)

Deprecated

  • Support for NumPy arrays as kernel arguments (#804)
  • Kernel argument access specifiers (#804)
  • Support for dpctl.device_context to launch kernels and njit offloading (#804)
  • Dpnp overloads using stubs. (#953)

0.19.0

21 Nov 23:25
53b3093
  • Updated toolchain with support for oneAPI 2023.0 and Numba 0.56.3

Added

  • Support for Numba 0.56. (#818)
  • Support for dpnp 0.11 and dpctl 0.14.
  • Customized exception classes. (#798)

Fixed

  • Fixed a crash when calling take() for input array with non-integer values. (#771)
  • Fixed pairwise_distance.py to run on machine with no FP64 support in HW. (#806)