Release v0.16.0 · IntelPython/dpnp

Summary

This release reaches an important milestone by making offloading fully asynchronous. Calls to dpnp submit tasks for execution to the DPC++ runtime and return without waiting for those tasks to finish. The sequential semantics that a user expects from executing a Python script are nevertheless preserved.
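As a rough illustration of what the preserved sequential semantics mean in practice, the sketch below chains a few elementwise calls and reads the result back only at the end. The alias `xp`, the NumPy fallback, and the helper `to_host` are assumptions made for this example so that it also runs on a machine without dpnp; they are not part of the release.

```python
try:
    import dpnp as xp  # calls submit work to the device and return
except ImportError:
    import numpy as xp  # assumption: fallback so the sketch runs without dpnp


def to_host(arr):
    """Hypothetical helper: copy a result to host memory as a NumPy array."""
    # dpnp.asnumpy copies data to the host, so the values must be ready first.
    return xp.asnumpy(arr) if hasattr(xp, "asnumpy") else arr


x = xp.arange(6, dtype=xp.float32)

# Each call can return before the underlying work has finished, yet the
# operations still observe each other's results in program order.
y = xp.sin(x) ** 2 + xp.cos(x) ** 2
total = xp.sum(y)

# Converting to a Python scalar or a host array needs the actual values.
result = float(total)
host_y = to_host(y)
```

The script is written exactly as it would be for a synchronous library; no explicit wait calls are needed in user code.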

In addition, this release completes the implementation of the dpnp.fft module and adds several new array manipulation, indexing, and elementwise routines. It also adds support for building dpnp for Nvidia GPUs.
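A minimal sketch of the real-input transforms from the dpnp.fft module follows: a round trip through rfft and irfft on a short test signal. The alias `xp`, the NumPy fallback, and the signal itself are assumptions chosen for illustration.

```python
try:
    import dpnp as xp
except ImportError:
    import numpy as xp  # assumption: fallback so the sketch runs without dpnp

# Eight samples of a sine wave with a period of four samples.
x = xp.asarray([0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0], dtype=xp.float32)

# Forward real-input FFT, then the inverse with the original length.
spectrum = xp.fft.rfft(x)
restored = xp.fft.irfft(spectrum, n=x.size)

# The round trip should reproduce the input up to rounding error.
roundtrip_ok = bool(xp.allclose(restored, x, atol=1e-5))

# The energy of this signal sits in frequency bin 2 (8 samples / period 4).
peak_bin = int(xp.argmax(xp.abs(spectrum)))
```

float32 input is used here so that the sketch does not rely on fp64 support on the device.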

DPNP is now compatible with NumPy 2.0.

Details

Added

Added implementation of dpnp.gradient function #1859
Added implementation of dpnp.sort_complex function #1864
Added implementation of dpnp.fft.fft and dpnp.fft.ifft functions #1879
Added implementation of dpnp.isneginf and dpnp.isposinf functions #1888
Added implementation of dpnp.fft.fftfreq and dpnp.fft.rfftfreq functions #1898
Added implementation of dpnp.fft.fftshift and dpnp.fft.ifftshift functions #1900
Added implementation of dpnp.isreal, dpnp.isrealobj, dpnp.iscomplex, and dpnp.iscomplexobj functions #1916
Added support to build dpnp for Nvidia GPU #1926
Added implementation of dpnp.fft.rfft and dpnp.fft.irfft functions #1928
Added implementation of dpnp.nextafter function #1938
Added implementation of dpnp.trim_zeros function #1941
Added implementation of dpnp.fft.hfft and dpnp.fft.ihfft functions #1954
Added implementation of dpnp.logaddexp2 function #1955
Added implementation of dpnp.flatnonzero function #1956
Added implementation of dpnp.float_power function #1957
Added implementation of dpnp.fft.fft2, dpnp.fft.ifft2, dpnp.fft.fftn, and dpnp.fft.ifftn functions #1961
Added implementation of dpnp.array_equal and dpnp.array_equiv functions #1965
Added implementation of dpnp.nan_to_num function #1966
Added implementation of dpnp.fix function #1971
Added implementation of dpnp.fft.rfft2, dpnp.fft.irfft2, dpnp.fft.rfftn, and dpnp.fft.irfftn functions #1982
Added implementation of dpnp.argwhere function #2000
Added implementation of dpnp.real_if_close function #2002
Added implementation of dpnp.ndim and dpnp.size functions #2014
Added implementation of dpnp.append and dpnp.asarray_chkfinite functions #2015
Added implementation of dpnp.array_split, dpnp.split, dpnp.hsplit, dpnp.vsplit, and dpnp.dsplit functions #2017
Added runtime dependency on intel-gpu-ocl-icd-system package #2023
Added implementation of dpnp.ravel_multi_index and dpnp.unravel_index functions #2022
Added implementation of dpnp.resize and dpnp.rot90 functions #2030
Added implementation of dpnp.require function #2036
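The sketch below exercises a few of the routines listed above (dpnp.flatnonzero, dpnp.array_split, dpnp.rot90, and dpnp.nan_to_num), assuming they follow the NumPy calling conventions for these simple cases. The alias `xp`, the NumPy fallback, and the helper `to_list` are assumptions made for the example.

```python
try:
    import dpnp as xp
except ImportError:
    import numpy as xp  # assumption: fallback so the sketch runs without dpnp


def to_list(arr):
    """Hypothetical helper: bring an array to the host as a Python list."""
    host = xp.asnumpy(arr) if hasattr(xp, "asnumpy") else arr
    return host.tolist()


a = xp.asarray([0, 3, 0, 5, 7, 0], dtype=xp.int32)

# Indices of the non-zero entries of the flattened input.
nonzero_idx = xp.flatnonzero(a)

# Split into three chunks along the first axis.
parts = xp.array_split(a, 3)

# Rotate a 2x2 matrix by 90 degrees counterclockwise.
m = xp.asarray([[1, 2], [3, 4]], dtype=xp.int32)
rotated = xp.rot90(m)

# Replace NaN with a finite value.
noisy = xp.asarray([1.0, float("nan"), 3.0], dtype=xp.float32)
cleaned = xp.nan_to_num(noisy, nan=0.0)
```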

Changed

Extended pre-commit pylint check to dpnp.fft module #1860
Reworked vm vector math backend to reuse dpctl.tensor functions around unary and binary functions #1868
Extended dpnp.ndarray.astype method to support device keyword argument #1870
Improved performance of dpnp.linalg.solve by implementing a dedicated kernel for its batch implementation #1877
Extended dpnp.fabs to support order and out keyword arguments by writing a dedicated kernel for it #1878
Extended dpnp.linalg module to support usm_ndarray as input #1880
Reworked dpnp.mod implementation to be an alias for dpnp.remainder #1882
Removed the legacy implementation of linear algebra functions from the backend #1887
Removed the legacy implementation of elementwise functions from the backend #1890
Extended dpnp.all and dpnp.any to support out keyword argument #1893
Reworked dpnp.repeat to add an explicit type check of input array #1894
Improved performance of different functions by adopting asynchronous implementation of dpctl #1897
Extended dpnp.fmax and dpnp.fmin to support order and out keyword arguments by writing dedicated kernels for them #1905
Removed the legacy implementation of array creation and manipulation functions from the backend #1903
Extended dpnp.extract implementation to align with NumPy #1906
Reworked backend implementation to align with non-backward compatible changes in DPC++ 2025.0 #1907
Removed the legacy implementation of indexing functions from the backend #1908
Extended dpnp.take implementation to align with NumPy #1909
Extended dpnp.place implementation to align with NumPy #1912
Reworked the implementation of indexing functions to avoid unnecessary casting to dpnp_array when input is usm_ndarray #1913
Reduced code duplication in the implementation of sorting functions #1914
Removed the obsolete dparray interface #1915
Improved performance of dpnp.linalg module for BLAS routines by adopting asynchronous implementation of dpctl #1919
Relocated dpnp.einsum utility functions to a separate file #1920
Improved performance of dpnp.linalg module for LAPACK routines by adopting asynchronous implementation of dpctl #1922
Reworked dpnp.matmul to allow larger batch size to be used #1927
Removed data synchronization where it is not needed #1930
Leveraged dpctl.tensor implementation for dpnp.where to support scalar as input #1932
Improved performance of dpnp.linalg.eigh by implementing a dedicated kernel for its batch implementation #1936
Reworked dpnp.isclose and dpnp.allclose to comply with compute follows data approach #1937
Extended dpnp.deg2rad and dpnp.radians to support order and out keyword arguments by writing dedicated kernels for them #1943
dpnp uses pybind11 2.13.1 #1944
Extended dpnp.degrees and dpnp.rad2deg to support order and out keyword arguments by writing dedicated kernels for them #1949
Extended dpnp.unwrap to support all keyword arguments provided by NumPy #1950
Leveraged dpctl.tensor implementation for dpnp.count_nonzero function #1962
Leveraged dpctl.tensor implementation for dpnp.diff function #1963
Leveraged dpctl.tensor implementation for dpnp.take_along_axis function #1969
Reworked dpnp.ediff1d implementation through existing functions instead of a separate kernel #1970
Reworked dpnp.unique implementation to use existing functions when axis is given and to leverage the dpctl.tensor implementation otherwise #1972
Improved performance of dpnp.linalg.svd by implementing a dedicated kernel for its batch implementation #1936
Leveraged dpctl.tensor implementation for shape.setter method #1975
Extended dpnp.ndarray.copy to support compute follows data keyword arguments #1976
Reworked dpnp.select implementation through existing functions instead of a separate kernel #1977
Leveraged dpctl.tensor implementation for dpnp.from_dlpack and dpnp.ndarray.__dlpack__ functions #1980
Reworked dpnp.linalg module backend implementation for BLAS routines to work with OneMKL Interfaces #1981
Reworked dpnp.ediff1d implementation to reduce code duplication #1983
dpnp can be used with any NumPy from 1.23 to 2.0 #1985
Reworked dpnp.unique implementation to properly handle NaN values #1972
Removed dpnp.issubsctype per NumPy 2.0 recommendation #1996
Reworked dpnp.unique implementation to align with NumPy 2.0 #1999
Reworked dpnp.linalg.solve backend implementation to work with OneMKL Interfaces #2001
Reworked dpnp.trapezoid implementation through existing functions instead of falling back on NumPy #2003
Added copy keyword to dpnp.array to align with NumPy 2.0 #2006
Extended dpnp.heaviside to support order and out keyword arguments by writing a dedicated kernel for it #2008
dpnp uses pybind11 2.13.5 #2010
Added COMPILER_VERSION_2025_OR_LATER flag to be able to run dpnp.fft module with both 2024.2 and 2025.0 versions of the compiler #2025
Cleaned up the implementation of dpnp.gradient by removing an obsolete TODO that will not be addressed #2032
Updated Array Manipulation Routines page in documentation to add missing functions and to remove duplicate entries #2033
dpnp uses pybind11 2.13.6 #2041
Updated dpnp.fft backend to depend on INTEL_MKL_VERSION flag to ensure that the appropriate code segment is executed based on the version of OneMKL #2035
Use dpctl::tensor::alloc_utils::sycl_free_noexcept instead of sycl::free in host_task tasks associated with life-time management of temporary USM allocations #2058
Improved implementation of dpnp.kron to avoid unnecessary copy for non-contiguous arrays #2059
Updated the test suite for dpnp.fft module #2071
Reworked dpnp.clip implementation to align with Python Array API 2023.12 specification #2048
Skipped outdated tests for dpnp.linalg.solve due to compatibility issues with NumPy 2.0 #2074
Updated installation instructions #2098
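Two of the changes listed above can be checked with a short sketch: dpnp.mod now being an alias for dpnp.remainder, and the copy keyword of dpnp.array. The alias `xp`, the NumPy fallback, and the sample values are assumptions made for illustration.

```python
try:
    import dpnp as xp
except ImportError:
    import numpy as xp  # assumption: fallback so the sketch runs without dpnp

a = xp.asarray([-7, -3, 4, 9], dtype=xp.int32)
b = xp.asarray([3, 3, 3, 3], dtype=xp.int32)

# mod and remainder are expected to give identical results.
same = bool(xp.array_equal(xp.mod(a, b), xp.remainder(a, b)))

# copy=True requests a new array, so writing to it leaves `a` untouched.
copied = xp.array(a, copy=True)
copied[0] = 100
original_first = int(a[0])
copied_first = int(copied[0])
```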

Fixed

Resolved an issue with dpnp.matmul when an f_contiguous out keyword is passed to the function #1872
Resolved a possible race condition in dpnp.inv #1940
Resolved an issue with failing tests for dpnp.append when running on a device without fp64 support #2034
Resolved an issue with input array of usm_ndarray passed into dpnp.ix_ #2047
Added a workaround to prevent crash in tests on Windows in internal CI/CD (when running on either Lunar Lake or Arrow Lake) #2062
Fixed a crash in dpnp.choose caused by missing control over the release of temporarily allocated device memory #2063
Resolved a compilation warning and an error when building in debug mode #2066
Fixed an issue with asynchronous execution in dpnp.fft module #2067

Full Changelog: 0.15.0...0.16.0
