v5 fp16 support & production ready!
Changelog:
- added fp16 support to vs-ov and vs-ort (the input model is still fp32; these filters will convert it to fp16 on the fly). Now all three backends support inference with fp16 (though using fp16 mainly benefits vs-ort's CUDA backend).
- fixed vs-ov spurious logging messages to stdout, which interfere with a `vspipe | x265` pipeline (requires patched openvino)
- changes to the vs-trt backend `vsmlrt.Backend.TRT()` of the `vsmlrt.py` wrapper:
  - `max_shapes` defaults to the tile size now (as tensorrt GPU memory usage is related to `max_shapes` rather than the actual shape used in inference, this should help save GPU memory)
  - the default `opt_shapes` is `None` now, which means it will be set to the actual `tilesize` in use: this is especially beneficial for large models like DPIR. If you prefer faster engine build times, set `opt_shapes=(64, 64)` to restore the previous behavior. This change also makes it easier to use the `tiles` parameter (as in this case, you generally don't know the exact inference shape).
  - changed the default cache & engine directory: the engine and cache files are first saved to the same directory as the onnx model; if that directory is not writable, the system temporary directory (on the same drive as the onnx model files) is used.
- fixed a bug when reusing the same backend variable for different filters
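The on-the-fly fp32-to-fp16 conversion mentioned above can be illustrated with a small numpy sketch. This is only an illustration of the idea; the function name is hypothetical and this is not vs-mlrt's actual implementation:

```python
import numpy as np

def load_weights_as_fp16(weights_fp32: np.ndarray) -> np.ndarray:
    # Hypothetical sketch: weights ship as fp32 and are cast to fp16
    # at load time. Casting halves memory use; values outside the
    # fp16 range saturate to +/-inf, which is why fp16 inference
    # mainly pays off on hardware with native fp16 support.
    return weights_fp32.astype(np.float16)

w32 = np.array([0.5, 1.25, -3.0], dtype=np.float32)
w16 = load_weights_as_fp16(w32)
print(w16.dtype)  # float16
```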
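The cache & engine directory fallback described above follows a simple pattern: prefer the onnx model's own directory, and fall back to a temporary directory when it is not writable. A minimal sketch (the function name and exact checks are illustrative, not vs-mlrt's code):

```python
import os
import tempfile

def choose_cache_dir(onnx_path: str) -> str:
    # First preference: the directory containing the onnx model,
    # so engines and caches live next to the model they belong to.
    model_dir = os.path.dirname(os.path.abspath(onnx_path))
    if os.access(model_dir, os.W_OK):
        return model_dir
    # Fallback: the system temporary directory. (vs-mlrt additionally
    # picks a temp dir on the same drive as the model; omitted here.)
    return tempfile.gettempdir()
```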
vsmlrt-cuda and model packages are identical to v4.
PS: we have successfully used both vs-ov and vs-trt in production anime encodings, so this release should be ready for production use. As always, issues and suggestions are welcome.
Update: turns out vs-ov is broken. The fix to openvino is not correctly picked up by the CI pipeline. Please use v6 for vs-ov.