What's Changed

General Notes

Support for the Vulkan API (with SPIR-V codegen)
Support for WebGPU (experimental)
Improved Halide IR HTML Visualization
Fixed a regression in the Adams2019 auto-scheduler that disabled sub-tiling
Added GPU auto-scheduler (Anderson2021)

Efficient Automatic Scheduling of Imaging and Vision Pipelines for the GPU
Luke Anderson, Andrew Adams, Karima Ma, Tzu-Mao Li, Tian Jin, Jonathan Ragan-Kelley
Proceedings of the ACM on Programming Languages (OOPSLA 2021)

Deprecations / Removals

OpenGLCompute has been deprecated
ParamMap has been deprecated
Deprecated HVX_shared_object feature has been removed
References to deprecated fixed-point operators have been removed
Deprecated halide_target_feature_disable_llvm_loop_opt has been removed
Deprecated MIPS device support has been removed

Notable Fixes & Changes

Generate dot() in the Metal backend by @vksnk in #7085
Add evaluate() and evaluate_may_gpu() to Python bindings by @steven-johnson in #7108
Add support for generating LLVM vector predication intrinsics. by @zvookin in #7111
RISC V vector predication intrinsics support by @zvookin in #7119
Add range-checking to Buffer objects in Python by @steven-johnson in #7128
Fix Python buffer handling by @steven-johnson in #7125
[WASM] Use rounding_mul_shift_right for q15mulr_sat_s pattern by @rootjalex in #7134
[x86] Generate AVX512 fixed-point instructions by @rootjalex in #7129
Fix readnone attribute for llvm 16 by @abadams in #7152
Call cache.clear between internal functions in CG_C by @steven-johnson in #7155
Add bfloat support to halide_type_to_string() by @steven-johnson in #7154
Factor simd_op_check into separate files by architecture. by @zvookin in #7163
Slightly improve error message for non-integer RDom min/extent by @abadams in #7151
Migrate from MCJIT to ORC JIT by @dkurt in #7166
Use n32:64 in RISC-V data layout by @dkurt in #7175
Don't attempt to use makecontext()/swapcontext() on Android by @steven-johnson in #7196
Add bridging for clang _Float16 type. by @zvookin in #7201
Fix issue with vector predicated comparison and select instructions. by @zvookin in #7205
Add RISC V zvl flag for LLVM version 16 or greater. by @zvookin in #7209
Extend LLVM IR type mangling to handle scalars. by @zvookin in #7212
Fix bitrot in PowerPC testing by @steven-johnson in #7211
Use aligned_alloc() as default allocator for HalideBuffer.h on most platforms by @steven-johnson in #7190
Tighten alignment promises for halide_malloc() by @steven-johnson in #7222
Fix some sources of signed integer overflow in the compiler by @abadams in #7231
Explicitly stage strided loads by @abadams in #7230
Remove deprecated halide_target_feature_disable_llvm_loop_opt by @steven-johnson in #7247
Conditional allocations shouldn't fail for size=0 in C++ backend (#7255) by @steven-johnson in #7256
Inline into extern function args during bounds inference by @abadams in #7261
Use ::aligned_alloc() instead of std::aligned_alloc() in HalideBuffer.h by @steven-johnson in #7268
Optimize Module::compile() for some edge cases by @steven-johnson in #7269
Drop support for MIPS (#7287) by @steven-johnson in #7289
Emit prototypes for destructor functions in C Backend by @steven-johnson in #7296
[HVX] Fix EliminateInterleaves by @rootjalex in #7279
Remove dependency on platform threads library by @alexreinking in #7297
Fix error of add_halide_generator in cross-compilation by @stevesuzuki-arm in #7283
Fix issue in add_halide_runtime in cross-compilation by @stevesuzuki-arm in #7284
Add workaround for the const-or-not user_context issue (#635) by @steven-johnson in #7291
[x86 & wasm] Split up double saturating-narrows from i32 by @rootjalex in #7280
Hoist vector slices using rewrite rules by @abadams in #7243
Improved halide_popcount by @Aelphy in #7225
halide_popcount<uint64_t> is broken by @steven-johnson in #7313
Fix segfault by nonconstant bound in Adams2019 by @stevesuzuki-arm in #7321
Make auto scheduler libs available in HalideHelpers package by @stevesuzuki-arm in #7285
Improve support for Arm baremetal compilation and runtime by @stevesuzuki-arm in #7286
Remove deprecated HVX_shared_object feature by @steven-johnson in #7331
Fix a subtle uninitialized-memory-read in Buffer::for_each_value() by @steven-johnson in #7330
Add a hook to Codegen_C::compile() by @steven-johnson in #7335
Tiny improvements in codegen in C backend by @steven-johnson in #7337
Devirtualize the protected compile() methods in Codegen_C by @steven-johnson in #7341
Fix tuple output bounds checks by @abadams in #7345
Change early-bound default args in Python bindings to late-bound by @steven-johnson in #7347
Fix Python error handling by @steven-johnson in #7352
Permit vectorization of non-recursive atomic operations by @abadams in #7346
Update WABT to 1.0.32; Increase stack size for WASM AOT apps by @steven-johnson in #7373
Bounds visitors for min/max were missing single_point mutated case by @abadams in #7377
Fix overflow in x86 absd lowering by @abadams in #7407
Add initial support for WebGPU by @jrprice in #6492
Use pmaddubsw for non-RDom horizontal widening adds by @abadams in #7440
Compute comparison masks in narrower types if possible by @abadams in #7392
Fix bugs in PyTorch codegen. by @Yongqi-Zhuo in #7443
Remove references to deprecated variants of fixed-point operators by @steven-johnson in #7457
Add GPU autoscheduler by @aekul in #6856
d3d12 runtime: replacing spinlocks by mutex objects by @slomp in #7489
Feature Enhancement: Halide IR HTML Visualization by @maaz139 in #7421
Deprecate ParamMap (#7121) by @steven-johnson in #7357
Forbid assigning to Buffer(Expr) by introducing an intermediate type. by @abadams in #7517
[vulkan phase2] Vulkan Runtime by @derek-gerstmann in #6924
Add libfuzzer compatible fuzz harness by @silvergasp in #7512
fuzz: Port correctness/cse fuzzer over to libfuzzer by @silvergasp in #7543
metal : replacing spinlock by mutex by @slomp in #7532
Fix save_tiff() PlanarConfig assignment for monochrome inputs by @philboske in #7568
Fix various compilation errors with AppleClang 14.0.3 by @steven-johnson in #7578
fuzz: Add libfuzzer compatible bounds fuzzer by @silvergasp in #7549
Significant change to RISC V and scalable vector code generation. by @zvookin in #7616
Fix inverted may_subtile checks by @abadams in #7626
Deprecate OpenGLCompute for Halide 16 by @shoaibkamil in #7627

New Contributors

@sashashura made their first contribution in #7136
@twesterhout made their first contribution in #7315
@terryheo made their first contribution in #7323
@adrian-lebioda made their first contribution in #7379
@Ttayu made their first contribution in #7402
@Yongqi-Zhuo made their first contribution in #7443
@aekul made their first contribution in #6856
@zhen8838 made their first contribution in #7494
@maaz139 made their first contribution in #7421
@silvergasp made their first contribution in #7512
@dbabokin made their first contribution in #7545
@philboske made their first contribution in #7568

Full Changelog: v15.0.1...v16.0.0
