v5 fp16 support & production ready!
Changelog:
- added fp16 support to vs-ov and vs-ort (the input model is still fp32; these filters will convert it to fp16 on the fly). Now all three backends support inference with fp16 (though using fp16 mainly benefits vs-ort's CUDA backend).
- fixed vs-ov spurious logging messages to stdout, which interfere with a `vspipe | x265` pipeline (requires patched openvino)
- changes to the vs-trt backend `vsmlrt.Backend.TRT()` of the `vsmlrt.py` wrapper:
  - `max_shapes` defaults to the tile size now (as tensorrt GPU memory usage is related to `max_shapes` rather than the actual shape used in inference, this should help save GPU memory)
  - the default `opt_shapes` is `None` now, which means it will be set to the actual `tilesize` in use: this is especially beneficial for large models like DPIR. If you prefer faster engine build times, set `opt_shapes=(64, 64)` to restore the previous behavior. This change also makes it easier to use the `tiles` parameter (as in this case, you generally don't know the exact inference shape).
  - changed the default cache & engine directory: the engine and cache files are first saved to the same directory as the onnx model; if that directory is not writable, the system temporary directory (on the same drive as the onnx model files) is used.
- fixed a bug when reusing the same backend variable for different filters
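The on-the-fly fp32-to-fp16 conversion mentioned above can be illustrated with a small numpy sketch. This is only an illustration of the idea; the function name is hypothetical and this is not vs-mlrt's actual implementation:

```python
import numpy as np

def load_weights_as_fp16(weights_fp32: np.ndarray) -> np.ndarray:
    # Hypothetical sketch: weights ship as fp32 and are cast to fp16
    # at load time. Casting halves memory use; values outside the
    # fp16 range saturate to +/-inf, which is why fp16 inference
    # mainly pays off on hardware with native fp16 support.
    return weights_fp32.astype(np.float16)

w32 = np.array([0.5, 1.25, -3.0], dtype=np.float32)
w16 = load_weights_as_fp16(w32)
print(w16.dtype)  # float16
```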
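The cache & engine directory fallback described above follows a simple pattern: prefer the onnx model's own directory, and fall back to a temporary directory when it is not writable. A minimal sketch (the function name and exact checks are illustrative, not vs-mlrt's code):

```python
import os
import tempfile

def choose_cache_dir(onnx_path: str) -> str:
    # First preference: the directory containing the onnx model,
    # so engines and caches live next to the model they belong to.
    model_dir = os.path.dirname(os.path.abspath(onnx_path))
    if os.access(model_dir, os.W_OK):
        return model_dir
    # Fallback: the system temporary directory. (vs-mlrt additionally
    # picks a temp dir on the same drive as the model; omitted here.)
    return tempfile.gettempdir()
```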
vsmlrt-cuda and model packages are identical to v4.
PS: we have successfully used both vs-ov and vs-trt in production anime encodings, so this release should be ready for production use. As always, issues and suggestions are welcome.
Update: turns out vs-ov is broken. The fix to openvino is not correctly picked up by the CI pipeline. Please use v6 for vs-ov.