Releases · KernelTuner/kernel_tuner

04 Apr 20:03

1c96693

Version 1.0 Latest

Finally, the Version 1.0 release is here! The software has been stable and ready for production use for quite some time, and after about half a year in beta, we are confident that the current version deserves to mark the first major release of Kernel Tuner.

Version 1.0 integrates a lot of new functionality, including blazing fast search space construction, support for tuning HIP kernels on AMD GPUs, new functionality for mixed precision and accuracy tuning, experimental support for tuning OpenACC programs, a conda package installer for Kernel Tuner, and many more changes and additions.

I would like to thank everyone involved in the development of Kernel Tuner over the past years! Special thanks to the Kernel Tuner developers team for their continued support of the project!

From the Changelog

HIP backend to support tuning HIP kernels on AMD GPUs
Experimental features for mixed-precision and accuracy tuning
Experimental features for OpenACC tuning
Major speedup due to new parser and using revamped python-constraint for searchspace building
Implemented ability to use PySMT and ATF for searchspace building
Added Poetry for dependency and build management
Switched from setup.py and setup.cfg to pyproject.toml for centralized metadata, added relevant tests
Updated GitHub Action workflows to use Poetry
Updated dependencies, most notably NumPy is no longer version-locked as scikit-opt is no longer a dependency
Documentation now uses pyproject.toml metadata, minor fixes and changes to be compatible with updated dependencies
Set up Nox for testing on all supported Python versions in isolated environments
Added linting information, VS Code settings and recommendations
Discontinued use of OrderedDict, as all dictionaries in the Python versions used are already ordered
Dropped Python 3.7 support
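
To make the search space items above concrete, here is a minimal sketch of what building a constrained search space involves. It is not Kernel Tuner's implementation: it uses brute-force enumeration with itertools rather than python-constraint, PySMT or ATF, and the parameter names, values, restrictions and the build_searchspace helper are all hypothetical. It also relies on plain dictionaries preserving insertion order, which is why OrderedDict is not required.

```python
from itertools import product

# Hypothetical tunable parameters; a plain dict keeps insertion order (Python 3.7+).
tune_params = {
    "block_size_x": [32, 64, 128, 256],
    "block_size_y": [1, 2, 4, 8],
    "tile_size": [1, 2],
}

# Hypothetical restrictions, each a predicate over one configuration.
restrictions = [
    lambda p: p["block_size_x"] * p["block_size_y"] <= 256,
    lambda p: p["tile_size"] <= p["block_size_y"],
]


def build_searchspace(params, restrictions):
    """Enumerate the Cartesian product and keep configurations that pass all restrictions."""
    names = list(params)
    valid = []
    for values in product(*params.values()):
        config = dict(zip(names, values))
        if all(check(config) for check in restrictions):
            valid.append(values)
    return valid


searchspace = build_searchspace(tune_params, restrictions)
total = len(list(product(*tune_params.values())))
print(f"{len(searchspace)} valid configurations out of {total}")
```

Brute-force enumeration visits every combination before filtering, so its cost grows with the full Cartesian product. A constraint solver can prune invalid combinations early, which is the kind of work the changelog entry on search space building refers to.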

Merged Pull Requests

HIP Backend by @MiloLurati in #199
Accuracy tuning by @stijnh in #189
Fix issue where HIP backend fails due to invalid arguments type by @stijnh in #216
Searchspace improvements and project meta modernization by @fjwillemsen in #214
Minor bugfix by @isazi in #219
OpenACC support by @isazi in #197
Fixed broken tests as per issue #217 by @fjwillemsen in #220
Fix snap_to_nearest on non-numeric parameters by @stijnh in #221
expand documentation on backends by @benvanwerkhoven in #213
Add support for passing cupy arrays to "C" lang by @bouweandela in #226
improve code quality of cache file related functions by @benvanwerkhoven in #240
New readme by @benvanwerkhoven in #231

New Contributors

@MiloLurati made their first contribution in #199
@dependabot made their first contribution in #222
@bouweandela made their first contribution in #226

Full Changelog: 0.4.5...1.0

Contributors

isazi, stijnh, and 5 other contributors

07 Dec 08:19

fjwillemsen

1.0.0b6

66428e3

Version 1.0.0b6 Pre-release

This is a beta release for early access to the new features. Not intended for production use.

The release contains:

Inclusion of tests in the source package, as requested in #225
Updated dependencies

01 Nov 14:11

fjwillemsen

1.0.0b5

08fb58e

Version 1.0.0b5 Pre-release

This is a beta release for early access to the new features. Not intended for production use.

The release contains:

Expanded documentation on backends by @benvanwerkhoven in #213
A fix for an issue that could cause incorrect conversion to Constraint
Extended tests to detect this
Bump urllib3 from 2.0.6 to 2.0.7 by @dependabot in #222
Updated dependencies

Full Changelog: 1.0.0b4...1.0.0b5

Contributors

benvanwerkhoven and dependabot

22 Oct 14:11

fjwillemsen

1.0.0b4

d36a5eb

Version 1.0.0b4 Pre-release

This is a beta release for early access to the new features. Not intended for production use.

This release contains several improvements:

nvidia-ml-py added to tutorial extra dependencies.
Additional checks for coherent Poetry configuration and warning in case of outdated development environment.
Updated dependencies.

12 Oct 13:02

fjwillemsen

1.0.0b3

e980b23

Version 1.0.0b3 Pre-release

This is a beta release for early access to the new features. Not intended for production use.

This version contains several bugfixes:

Fix snap_to_nearest on non-numeric parameters by @stijnh in #221
Fixed an issue where some restrictions would not be recognized by the old check_restrictions function.
Fixed an issue where bayes_opt would not handle pruned parameters correctly.

Full Changelog: 1.0.0b2...1.0.0b3

Contributors

stijnh

11 Oct 16:37

fjwillemsen

1.0.0b2

0e009fd

Version 1.0.0b2 Pre-release

This is a beta release for early access to the new features. Not intended for production use.

Full Changelog: 1.0.0b1...1.0.0b2

11 Oct 07:03

fjwillemsen

1.0.0b1

42d8bfe

Version 1.0.0 beta 1 Pre-release

This is a beta release for early access to the new features. Not intended for production use.

What's Changed

HIP Backend by @MiloLurati in #199
Accuracy tuning by @stijnh in #189
Fix issue where HIP backend fails due to invalid arguments type by @stijnh in #216
Searchspace improvements and project meta modernization by @fjwillemsen in #214
Minor bugfix by @isazi in #219
OpenACC support by @isazi in #197
Fixed broken tests as per issue #217 by @fjwillemsen in #220

New Contributors

@MiloLurati made their first contribution in #199

Full Changelog: 0.4.5...1.0.0b1

Contributors

isazi, stijnh, and 2 other contributors

01 Jun 20:11

benvanwerkhoven

0.4.5

b3ff4cd

Version 0.4.5

Version 0.4.5 adds support for using PMT in combination with Kernel Tuner, enabling power and energy measurements on a wide range of devices. In addition, we have worked extensively on the internals of Kernel Tuner and the interfaces of the separate components that together make up Kernel Tuner. The release also includes a few bugfixes and fixes for small errors in examples and documentation.

[0.4.5] - 2023-06-01

Added

PMTObserver to measure power and energy on various platforms

Changed

Improved functionality for storing output and metadata files
Updated PowerSensorObserver to support PowerSensor3
Refactored internal interfaces of runners and backends
Bugfix in interface to set objective and optimization direction

09 Mar 11:21

benvanwerkhoven

0.4.4

d0dc834

Version 0.4.4

Version 0.4.4 adds extended support for energy efficiency tuning, in particular the new capability to fit a performance model to the target GPU's power-frequency curve. How to use these features is demonstrated in:
https://github.com/KernelTuner/kernel_tuner/blob/master/examples/cuda/going_green_performance_model.py

And described in the paper:

Going green: optimizing GPUs for energy efficiency through model-steered auto-tuning
R. Schoonhoven, B. Veenboer, B. van Werkhoven, K. J. Batenburg
International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS) at Supercomputing (SC22) 2022
https://arxiv.org/abs/2211.07260

Other than that, we've implemented a new output and metadata JSON format that adheres to the 'T4' auto-tuning schema created by the auto-tuning community at the Lorentz Center workshop in March 2022.

From the changelog:

[0.4.4] - 2023-03-09

Added

Support for using time_limit in simulation mode
Helper functions for energy tuning
Example to show ridge frequency and power-frequency model
Functions to store tuning output and metadata

Changed

Changed what timings are stored in cache files
No longer inserting partial loop unrolling factor of 0 in CUDA

19 Oct 15:45

benvanwerkhoven

0.4.3

8792726

Version 0.4.3

The version 0.4.3 release consists of a large number of changes to the internals of Kernel Tuner, including the addition of a new backend based on Nvidia's official Python bindings for CUDA. Functionality for tuning energy efficiency has also improved a lot, for example through support for measuring core voltages, better power measurements, and an improved interface with NVML.

Some of the changes are also in the "externals" of Kernel Tuner, in the sense that we have migrated from https://github.com/benvanwerkhoven/ to https://github.com/KernelTuner. The goal of this move is to bring the collection of repositories belonging to the larger Kernel Tuner project under one organization.

From the Changelog:

[0.4.3] - 2022-10-19

Added

A new backend that uses Nvidia cuda-python
Support for locked clocks in NVMLObserver
Support for measuring core voltages using NVML
Support for custom preprocessor definitions
Support for boolean scalar arguments in PyCUDA backend

Changed

Migrated from github.com/benvanwerkhoven to github.com/KernelTuner
Significant update to the documentation pages
Unified benchmarking loops across backends
Backends are no longer context managers
Replaced the method for measuring power consumption using NVML
Improved NVML measurements of temperature and clock frequencies
Bugfix in parse_restrictions when using and/or in expressions
Bugfix in GreedyILS when using neighbor method "adjacent"
Bugfix in Bayesian Optimization for small problems
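
The parse_restrictions fix above concerns restriction expressions that combine conditions with and/or. As a rough illustration of what such an expression means, and not of how parse_restrictions itself works, the sketch below evaluates a hypothetical restriction string against a few configurations using Python's eval. The helper name satisfies, the parameter names and the expression are all assumptions made for this example.

```python
def satisfies(restriction: str, config: dict) -> bool:
    """Evaluate a restriction string with the configuration's values as variables."""
    # Builtins are disabled so only the parameter names are visible to the expression.
    return bool(eval(restriction, {"__builtins__": {}}, dict(config)))


# Hypothetical restriction mixing `and` and `or`; `and` binds tighter than `or`.
restriction = "block_size_x >= 64 and block_size_y <= 4 or tile_size == 1"

configs = [
    {"block_size_x": 128, "block_size_y": 2, "tile_size": 2},  # passes the `and` branch
    {"block_size_x": 32, "block_size_y": 8, "tile_size": 1},   # passes the `or` branch
    {"block_size_x": 32, "block_size_y": 8, "tile_size": 2},   # fails both
]

results = [satisfies(restriction, c) for c in configs]
print(results)
```

Because `and` binds tighter than `or`, the expression reads as (A and B) or C. Adding explicit parentheses to restriction strings makes the intended grouping unambiguous. Note that eval should only be used on expressions from a trusted source.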
