Halide 8.0.0
New features since last release include:
- Generate custom PyTorch ops from Halide pipelines
- Automatic differentiation of Halide pipelines
- A WebAssembly backend
- A Direct3D backend
- An opt-in caching allocator for CUDA to reduce the time spent in cuMemAlloc and cuMemFree
- float16 and bfloat16 support
- Faster compilation of very large pipelines
- New ways to assert properties of arguments, including unchecked assertions, and more aggressive simplifications that exploit these
- The ability to place Funcs in stack/heap/shared/register memory explicitly with store_in
- Runtime configuration of Generator inputs/outputs
- Support for DMA transfers on Hexagon
- Generate Python extension modules from Halide pipelines
- Lower overhead when calling realize repeatedly on small pipelines
- Optional strict floating point semantics for single expressions or entire pipelines
- Producer-consumer task parallelism with Func::async
- Numerous improvements to Halide::Runtime::Buffer. Consider replacing your custom halide_buffer_t wrappers with it.
- Many more small improvements and bug fixes (it has been a while since our last release)
Edit: This release was renamed to use the included LLVM version instead of the date. It was formerly named Halide 2019/08/27.
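
To illustrate why a caching allocator like the one listed above can cut time spent in cuMemAlloc and cuMemFree, here is a minimal sketch of the general technique in plain C++. It is not Halide's runtime code: the class name `CachingAllocator`, its methods, and the `backend_allocs` counter are hypothetical, and `std::malloc`/`std::free` stand in for the expensive driver calls. The idea is simply that released blocks are kept in per-size free lists and handed back on the next request of the same size.

```cpp
#include <cstddef>
#include <cstdlib>
#include <unordered_map>
#include <vector>

// Hypothetical illustration of a size-bucketed caching allocator.
// std::malloc / std::free stand in for cuMemAlloc / cuMemFree.
class CachingAllocator {
public:
    // Number of requests that had to go to the underlying allocator.
    std::size_t backend_allocs = 0;

    // Return a cached block of exactly this size if one is available,
    // otherwise fall through to the underlying allocator.
    void *allocate(std::size_t size) {
        auto &bucket = free_blocks_[size];
        if (!bucket.empty()) {
            void *p = bucket.back();
            bucket.pop_back();
            return p;
        }
        ++backend_allocs;
        return std::malloc(size);
    }

    // Keep the block for reuse instead of freeing it immediately.
    void release(void *ptr, std::size_t size) {
        free_blocks_[size].push_back(ptr);
    }

    // Hand every cached block back to the underlying allocator.
    void clear() {
        for (auto &entry : free_blocks_) {
            for (void *p : entry.second) {
                std::free(p);
            }
        }
        free_blocks_.clear();
    }

    ~CachingAllocator() { clear(); }

private:
    // Free lists keyed by block size in bytes.
    std::unordered_map<std::size_t, std::vector<void *>> free_blocks_;
};
```

A pipeline that is realized repeatedly tends to request the same buffer sizes on every run, so after the first run most requests in a scheme like this are served from the free lists. The trade-off is that cached memory stays reserved until `clear()` is called, which is one reason such an allocator is typically opt-in.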