Releases: KernelTuner/kernel_tuner
Version 0.4.2
Version 0.4.2 includes a lot of work on the search space representation, the application of restrictions, and the optimization strategies. Besides several new optimization strategies, most existing strategies should see improved performance, both in the number of evaluated kernel configurations and in execution time.
Added
- new optimization strategies: dual annealing, greedy ILS, ordered greedy MLS, greedy MLS
- support for constant memory in cupy backend
- constraint solver to cut down time spent in creating search spaces
- support for custom tuning objectives
- support for max_fevals and time_limit in strategy_options of all strategies (see the sketch after this list)
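For example, the search-budget and custom-objective options might be combined as in the following minimal sketch. The vector_add kernel, the data sizes, and the GFLOP/s metric are hypothetical, and we assume the objective is selected through the objective and objective_higher_is_better arguments of tune_kernel:

```python
from collections import OrderedDict
import numpy
import kernel_tuner

# hypothetical vector-add kernel, used only for illustration
kernel_string = """
__global__ void vector_add(float *c, float *a, float *b, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        c[i] = a[i] + b[i];
    }
}
"""

size = 10_000_000
n = numpy.int32(size)
a = numpy.random.randn(size).astype(numpy.float32)
b = numpy.random.randn(size).astype(numpy.float32)
c = numpy.zeros_like(a)

tune_params = OrderedDict(block_size_x=[32, 64, 128, 256, 512, 1024])

# a user-defined metric to serve as custom tuning objective;
# the measured kernel time is reported in milliseconds under "time"
metrics = OrderedDict()
metrics["GFLOP/s"] = lambda p: (size / 1e9) / (p["time"] / 1e3)

# budget the search: at most 50 evaluated configurations or roughly 60 seconds
results, env = kernel_tuner.tune_kernel(
    "vector_add", kernel_string, size, [c, a, b, n], tune_params,
    metrics=metrics, objective="GFLOP/s", objective_higher_is_better=True,
    strategy="genetic_algorithm",
    strategy_options=dict(max_fevals=50, time_limit=60))
```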
Removed
- alternative Bayesian Optimization strategies that could not be used directly
- C++ wrapper module that was too specific and hardly used
Changed
- string-based restrictions are compiled into functions for improved performance
- genetic algorithm, MLS, ILS, random, and simulated annealing use new search space object
- differential evolution, firefly algorithm, and PSO are initialized using a population of all valid configurations
- all strategies except brute_force strictly adhere to max_fevals and time_limit
- simulated annealing adapts annealing schedule to max_fevals if supplied
- minimize, basinhopping, and dual annealing start from a random valid config
Version 0.4.1
This version adds a brand new Bayesian Optimization strategy, as well as some smaller features and fixes.
[0.4.1] - 2021-09-10
Added
- support for PyTorch Tensors as input data type for kernels
- support for smem_args in run_kernel
- support for a (lambda) function or string for dynamic shared memory size (see the sketch after this list)
- a new Bayesian Optimization strategy
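A minimal sketch of the shared memory support in run_kernel: the copy_shared kernel and the sizes are hypothetical, and the dynamic shared memory size in bytes is given as a (lambda) function of the tunable parameters through smem_args:

```python
import numpy
import kernel_tuner

# hypothetical kernel that copies through dynamically allocated shared memory
kernel_string = """
__global__ void copy_shared(float *dst, float *src, int n) {
    extern __shared__ float buffer[];
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    buffer[threadIdx.x] = (i < n) ? src[i] : 0.0f;
    __syncthreads();
    if (i < n) {
        dst[i] = buffer[threadIdx.x];
    }
}
"""

size = 1024 * 1024
n = numpy.int32(size)
src = numpy.random.randn(size).astype(numpy.float32)
dst = numpy.zeros_like(src)

params = dict(block_size_x=256)

# dynamic shared memory size in bytes, computed from the tunable parameters
smem_args = dict(size=lambda p: p["block_size_x"] * 4)

answer = kernel_tuner.run_kernel("copy_shared", kernel_string, size,
                                 [dst, src, n], params, smem_args=smem_args)
```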
Changed
- optionally store the kernel_string with store_results
- improved reporting of skipped configurations
Version 0.4.0
This version adds a great deal of new functionality, giving the user extra flexibility and more control over what is benchmarked and when. From the CHANGELOG:
Added
- support for (lambda) function instead of list of strings for restrictions
- support for (lambda) function instead of list for specifying grid divisors
- support for (lambda) function instead of tuple for specifying problem_size (see the sketch after this list)
- function to store the top tuning results
- function to create header file with device targets from stored results
- support for using tuning results in PythonKernel
- option to control measurements using observers
- support for NVML tunable parameters
- option to simulate auto-tuning searches from existing cache files (also covered in the sketch after this list)
- Cupy backend to support C++ templated CUDA kernels
- support for templated CUDA kernels using PyCUDA backend
- documentation on tunable parameter vocabulary
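To sketch how some of these features fit together (the strided_copy kernel, the sizes, and the cache file name are hypothetical): restrictions and problem_size can now be given as functions of the tunable parameters, and a search can later be replayed from a cache file in simulation mode:

```python
from collections import OrderedDict
import numpy
import kernel_tuner

# hypothetical strided-copy kernel; tile_size is inserted as a preprocessor define
kernel_string = """
__global__ void strided_copy(float *dst, float *src, int n) {
    int i = (blockIdx.x * blockDim.x + threadIdx.x) * tile_size;
    for (int j = 0; j < tile_size && i + j < n; j++) {
        dst[i + j] = src[i + j];
    }
}
"""

size = 1 << 22
n = numpy.int32(size)
src = numpy.random.randn(size).astype(numpy.float32)
dst = numpy.zeros_like(src)

tune_params = OrderedDict(block_size_x=[32, 64, 128, 256], tile_size=[1, 2, 4, 8])

# a function instead of a list of strings for restrictions
restrictions = lambda p: p["block_size_x"] * p["tile_size"] <= 1024

# a function instead of a tuple for problem_size
problem_size = lambda p: size // p["tile_size"]

results, env = kernel_tuner.tune_kernel(
    "strided_copy", kernel_string, problem_size, [dst, src, n], tune_params,
    restrictions=restrictions, cache="strided_copy_cache.json")

# later: replay the search from the cache file without running on a device
results, env = kernel_tuner.tune_kernel(
    "strided_copy", kernel_string, problem_size, [dst, src, n], tune_params,
    restrictions=restrictions, cache="strided_copy_cache.json",
    simulation_mode=True)
```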
Version 0.3.2
This version adds several new features. The most important is the ability to specify user-defined metrics that Kernel Tuner computes along with the benchmarking results. User-defined metrics are composable, so you can define metrics that build on other metrics. The documentation pages have been updated to cover this feature and other recent changes.
An important change that may influence the benchmark results reported by Kernel Tuner is that the runner now warms up the device using the first kernel in the parameter space. This removes any startup or cold-start delays that significantly slowed down the first benchmarked kernel on many devices.
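For example, composable metrics for a hypothetical vector-add-style kernel might look like the following minimal sketch; the names and numbers are assumptions, and the measured kernel time is available in milliseconds under the key "time":

```python
from collections import OrderedDict

size = 10_000_000  # hypothetical problem size for a vector-add-style kernel

# each metric maps a name to a function of the benchmark results
metrics = OrderedDict()
metrics["GFLOP/s"] = lambda p: (size / 1e9) / (p["time"] / 1e3)
metrics["GB/s"] = lambda p: (3 * 4 * size / 1e9) / (p["time"] / 1e3)
# metrics are composable: this one builds on the two metrics defined above
metrics["FLOP/byte"] = lambda p: p["GFLOP/s"] / p["GB/s"]

# passed along with the other arguments: kernel_tuner.tune_kernel(..., metrics=metrics)
```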
From the changelog:
[0.3.2] - 2020-11-04
Added
- support loop unrolling using params that start with loop_unroll_factor
- always insert "#define kernel_tuner 1" to allow preprocessor statements such as "#ifdef kernel_tuner"
- support for user-defined metrics
- support for choosing the optimization starting point x0 for most strategies
Changed
- more compact output is printed to the terminal
- sequential runner runs first kernel in the parameter space to warm up device
- updated tutorials to demonstrate use of user-defined metrics
Version 0.3.1
A small release with two new features and a bugfix for older GPUs.
[0.3.1] - 2020-06-11
Added
- kernelbuilder functionality for including kernels in Python applications
- smem_args option for dynamically allocated shared memory in CUDA kernels
Changed
- bugfix for NVML Error on Nvidia devices without internal current sensor
Version 0.3.0
We have done a lot of work on the internals of Kernel Tuner for this release. Version 0.3.0 fixes several issues, adds and extends features, and simplifies the user interface.
[0.3.0] - 2019-12-20
Changed
- fix for output checking; custom verify functions are now called just once
- benchmarking now returns multiple results, not only time
- more sophisticated implementation of the genetic algorithm strategy
- the "method" option is now passed via strategy_options
Added
- Bayesian Optimization strategy, use strategy="bayes_opt"
- support for kernels that use texture memory in CUDA
- support for measuring energy consumption of CUDA kernels
- option to set strategy_options to pass strategy-specific options
- option to cache tuned kernel configurations and restart from the cache file (see the sketch after this list)
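A minimal sketch of how these options might be combined; the kernel and the cache file name are hypothetical:

```python
from collections import OrderedDict
import numpy
import kernel_tuner

# the same hypothetical vector_add kernel as in the earlier sketches
kernel_string = """
__global__ void vector_add(float *c, float *a, float *b, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        c[i] = a[i] + b[i];
    }
}
"""

size = 1_000_000
n = numpy.int32(size)
a = numpy.random.randn(size).astype(numpy.float32)
b = numpy.random.randn(size).astype(numpy.float32)
c = numpy.zeros_like(a)

tune_params = OrderedDict(block_size_x=[32, 64, 128, 256, 512, 1024])

# strategy-specific options, such as the "method" of the minimize strategy,
# are now passed through strategy_options; strategy="bayes_opt" works the same way
results, env = kernel_tuner.tune_kernel(
    "vector_add", kernel_string, size, [c, a, b, n], tune_params,
    strategy="minimize", strategy_options=dict(method="L-BFGS-B"),
    cache="vector_add_cache.json")  # results are cached and the run can be restarted
```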
Removed
- Python 2 support; it may still work, but we no longer test for Python 2
- Noodles parallel runner
Version 0.2.0
Version 0.2.0 adds a large number of search optimization algorithms and basic support for testing and tuning Fortran kernels.
Changed
- no longer replacing kernel names with instance strings during tuning
- bugfix in tempfile creation that led to a "too many open files" error
Added
- A minimal Fortran example and basic Fortran support
- Particle Swarm Optimization strategy, use strategy="pso"
- Simulated Annealing strategy, use strategy="simulated_annealing"
- Firefly Algorithm strategy, use strategy="firefly_algorithm"
- Genetic Algorithm strategy, use strategy="genetic_algorithm"
Version 0.1.9
[0.1.9] - 2018-04-18
Changed
- bugfix for C backend for byte array arguments
- argument type mismatches now raise a warning instead of an exception
Added
- wrapper functionality to wrap C++ functions
- citation file and zenodo doi generation for releases
Version 0.1.8
Version 0.1.8 brings many improvements, mostly focused on user friendliness. The installation of optional dependencies has been simplified: you can now use extras with pip. For example, pip install kernel_tuner[cuda]
installs both Kernel Tuner and the optional dependency PyCUDA. In addition, version 0.1.8 introduces many more checks on the input you pass to tune_kernel and run_kernel. For example, the kernel source code is parsed to check whether the signature matches the argument list. These additional checks should make programs that use Kernel Tuner easier to write and debug. For a more detailed overview of the changes, see below:
[0.1.8] - 2017-11-23
Changed
- bugfix for when using iterations smaller than 3
- the install procedure now uses extras, e.g. [cuda,opencl]
- option quiet makes tune_kernel completely quiet
- extensive updates to documentation
Added
- type checking for kernel arguments and answers lists
- checks for reserved keywords in tunable parameters
- checks for whether thread block dimensions are specified
- printing units for measured time with CUDA and OpenCL
- option to print all measured execution times
Version 0.1.7
[0.1.7] - 2017-10-11
Changed
- bugfix for installation when scipy is not present
- bugfix for GPU cleanup when using Noodles runner
- reworked the way strings are handled internally
Added
- option to set the compiler name when using the C backend