
Version 0.4.0

@benvanwerkhoven benvanwerkhoven released this 09 Apr 11:50
· 1155 commits to master since this release

This version adds substantial new functionality and gives the user more flexibility and control over what is benchmarked and when. From the CHANGELOG:

Added

  • support for (lambda) function instead of list of strings for restrictions
  • support for (lambda) function instead of list for specifying grid divisors
  • support for (lambda) function instead of tuple for specifying problem_size
  • function to store the top tuning results
  • function to create header file with device targets from stored results
  • support for using tuning results in PythonKernel
  • option to control measurements using observers
  • support for NVML tunable parameters
  • option to simulate auto-tuning searches from existing cache files
  • CuPy backend to support C++ templated CUDA kernels
  • support for templated CUDA kernels using PyCUDA backend
  • documentation on tunable parameter vocabulary
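
To illustrate the first item in the list above, here is a minimal plain-Python sketch of how a restriction given as a (lambda) function can express the same constraint as a list of strings. It does not call the library itself: `is_allowed`, `search_space`, and the parameter names and values are hypothetical helpers made up for this example, and the assumption that the function receives a dictionary of parameter values should be checked against the project documentation.

```python
from itertools import product

# Hypothetical tunable parameters (names and values chosen for illustration).
tune_params = {
    "block_size_x": [16, 32, 64],
    "block_size_y": [1, 2, 4],
}


def is_allowed(config, restrictions):
    """Check one configuration against restrictions.

    `restrictions` is either a callable that takes the configuration dict,
    or a list of strings evaluated with the parameter names as variables.
    """
    if callable(restrictions):
        return bool(restrictions(config))
    # String form: every expression must evaluate to a truthy value.
    return all(eval(expr, {"__builtins__": {}}, dict(config)) for expr in restrictions)


def search_space(params, restrictions):
    """Enumerate all parameter combinations that satisfy the restrictions."""
    keys = list(params)
    configs = (dict(zip(keys, values)) for values in product(*params.values()))
    return [c for c in configs if is_allowed(c, restrictions)]


# The same constraint written both ways.
as_strings = [
    "block_size_x * block_size_y >= 64",
    "block_size_x * block_size_y <= 128",
]
as_function = lambda p: 64 <= p["block_size_x"] * p["block_size_y"] <= 128

space_from_strings = search_space(tune_params, as_strings)
space_from_function = search_space(tune_params, as_function)
```

The function form avoids evaluating strings and can contain arbitrary Python logic, which is harder to express as a list of independent string expressions.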