Version 0.1.8
benvanwerkhoven released this 23 Nov 21:01
Version 0.1.8 brings many improvements, mostly focused on user friendliness. Installing optional dependencies is simpler because you can now use extras with pip. For example, pip install kernel_tuner[cuda] installs both Kernel Tuner and the optional dependency PyCuda. In addition, version 0.1.8 introduces many more checks on the input that you pass to tune_kernel and run_kernel. For example, the kernel source code is parsed to see if its signature matches the argument list. These additional checks should make it easier to write and debug programs that use Kernel Tuner. For a more detailed overview of the changes, see below:
[0.1.8] - 2017-11-23
Changed
- bugfix for when using fewer than 3 iterations
- the install procedure now uses extras, e.g. [cuda,opencl]
- option quiet makes tune_kernel completely quiet
- extensive updates to documentation
Added
- type checking for kernel arguments and answers lists
- checks for reserved keywords in tunable parameters
- checks for whether thread block dimensions are specified
- printing units for measured time with CUDA and OpenCL
- option to print all measured execution times
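
To give a feel for the kind of input check described in these notes, here is a minimal sketch of how a kernel signature could be compared against an argument list. It is not Kernel Tuner's actual implementation: the helper names kernel_parameters and check_argument_count are hypothetical, the parsing is a simple regular expression, and only the number of arguments is compared, whereas the release also checks argument types.

```python
import re


def kernel_parameters(kernel_name, kernel_source):
    """Return the parameter declarations of a C-like kernel, or None if not found.

    Hypothetical helper for illustration only; not part of Kernel Tuner's API.
    """
    # Find 'kernel_name(' and capture everything up to the first ')'.
    pattern = r"\b" + re.escape(kernel_name) + r"\s*\(([^)]*)\)"
    match = re.search(pattern, kernel_source)
    if match is None:
        return None
    params = [p.strip() for p in match.group(1).split(",")]
    # An empty list or a lone 'void' means the kernel takes no parameters.
    return [p for p in params if p and p != "void"]


def check_argument_count(kernel_name, kernel_source, arguments):
    """Raise an error if the argument list cannot match the kernel signature."""
    params = kernel_parameters(kernel_name, kernel_source)
    if params is None:
        raise ValueError("kernel '%s' not found in source" % kernel_name)
    if len(params) != len(arguments):
        raise TypeError(
            "kernel '%s' expects %d arguments, but %d were given"
            % (kernel_name, len(params), len(arguments))
        )
    return params


kernel_source = """
__global__ void vector_add(float *c, const float *a, const float *b, int n) {
    int i = blockIdx.x * block_size_x + threadIdx.x;
    if (i < n) {
        c[i] = a[i] + b[i];
    }
}
"""

# Placeholder values stand in for the real device arrays and scalars.
arguments = [None, None, None, 1000]
params = check_argument_count("vector_add", kernel_source, arguments)
```

Calling check_argument_count with one argument missing raises a TypeError before any kernel is compiled or launched, so the mistake surfaces as a readable message rather than as a failure inside the GPU runtime.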