Release Halide v15.0.1 · halide/Halide

What's Changed

The Python binding of compile_to_callable() was not properly copying from device to host for output buffers, so output was typically black (or garbage) when used with a GPU target. (#7213)
The bin directory was missing from the installs.
Upgraded LLVM to 15.0.7.
New in 15.0.0, but restated here for visibility: The target flag disable_llvm_loop_opt is deprecated, as it's now the default behavior. This means that LLVM's autovectorization and loop unrolling are now turned off by default. This should not affect any schedules with manually-specified vectorization and unrolling, other than trimming code size a little. However, schedules that do not vectorize or unroll may slow down because they were (intentionally or not) relying on LLVM to do it automatically. If you see a performance regression with Halide 15, try turning on the enable_llvm_loop_opt target flag.
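As a rough illustration of opting back in, the sketch below rewrites a Halide target string so that it carries enable_llvm_loop_opt and no longer carries the deprecated disable_llvm_loop_opt. The helper name with_llvm_loop_opt is hypothetical and not part of Halide; the sketch only assumes that target strings are hyphen-separated lists of features, and it uses plain string handling so it runs without Halide installed.

```python
import os

# Flag names as given in the release note above.
DEPRECATED_FLAG = "disable_llvm_loop_opt"
OPT_IN_FLAG = "enable_llvm_loop_opt"


def with_llvm_loop_opt(target: str) -> str:
    """Return `target` with the opt-in flag added and the deprecated flag removed.

    Hypothetical helper, not a Halide API. Assumes a hyphen-separated
    target string such as "host" or "host-cuda".
    """
    # Drop empty pieces and the deprecated flag, keep everything else in order.
    parts = [p for p in target.split("-") if p and p != DEPRECATED_FLAG]
    # Append the opt-in flag only if it is not already present.
    if OPT_IN_FLAG not in parts:
        parts.append(OPT_IN_FLAG)
    return "-".join(parts)


if __name__ == "__main__":
    # HL_JIT_TARGET is the environment variable Halide reads for the JIT target;
    # fall back to "host" when it is unset.
    base = os.environ.get("HL_JIT_TARGET", "host")
    print(with_llvm_loop_opt(base))
```

The resulting string could then be used wherever a target string is accepted, for example by exporting it as HL_JIT_TARGET before running a pipeline, or (assuming the Python bindings are imported as hl) by constructing hl.Target from it. Comparing timings with and without the flag is one way to check whether a slowdown comes from the changed default.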