Compared to v11, this release updated CUDA dependencies to CUDA 11.8.0, cuDNN 8.6.0 and TensorRT 8.5.1:

Added support for the NVIDIA 40 series GPUs.
Added support for RIFE on the trt backend.

Known issues

Performance of the OV_CPU and ORT_CUDA(fp16=True) backends for RIFE is lower than expected; this is under investigation. Please consider ORT_CPU or ORT_CUDA(fp16=False) for now.
The NCNN_VK backend does not support RIFE.

Installation Notes

For some advanced features, vsmlrt.py requires the numpy and onnx packages to be available. You might need to run pip install onnx numpy.
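
A quick way to check whether these optional packages are importable in the current Python environment is sketched below. The helper name missing_packages is made up for this illustration and is not part of vsmlrt.py.

```python
import importlib.util


def missing_packages(names=("numpy", "onnx")):
    """Return the top-level package names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        # Suggest the same command the note above mentions.
        print("Missing packages, try: pip install " + " ".join(missing))
    else:
        print("numpy and onnx are available")
```

The check only looks the packages up; it does not import them, so it stays cheap even when they are installed.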

Benchmark

previous benchmark

Configuration: NVIDIA RTX 3090, driver 526.47, Windows Server 2019, VapourSynth R60, Python 3.11.0, 1080p fp16

Backends: ort-cuda, trt from vs-mlrt v12.

For the trt backend, the engine is built without the CUDA_MODULE_LOADING=LAZY environment variable, but the variable is set during benchmarking to reduce device memory consumption.
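
One way to apply this setting from Python is to pass the variable to a child process, so that it is in place before CUDA initializes. This is only a sketch: lazy_loading_env is a hypothetical helper, not something vsmlrt.py provides, and the script name in the usage comment is a placeholder.

```python
import os


def lazy_loading_env(base=None):
    """Return a copy of the environment with CUDA lazy module loading enabled."""
    env = dict(os.environ if base is None else base)
    # CUDA reads this variable when it initializes, so it has to be set
    # before the benchmarking process starts using the GPU.
    env["CUDA_MODULE_LOADING"] = "LAZY"
    return env


# Usage (placeholder script name):
# import subprocess, sys
# subprocess.run([sys.executable, "benchmark.py"], env=lazy_loading_env(), check=True)
```

Copying the environment, rather than modifying os.environ in place, keeps the engine-building step in the parent process unaffected, which matches the setup described above.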

Data format: fps / GPU memory usage (MB)

rife(model=44, 1920x1088)

backend	1 stream	2 streams
ort-cuda	53.62/1771	83.34/2748
trt	71.30/ 626	107.3/ 962
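
To make the "fps / GPU memory usage" format concrete, the following sketch parses the rife cells above and compares the two backends. The function name parse_cell and the dictionary layout are illustrative choices, not part of vs-mlrt.

```python
def parse_cell(cell):
    """Split a 'fps/memory' cell such as '71.30/ 626' into (fps, memory_mb)."""
    fps, mem = cell.split("/")
    # float() and int() both tolerate the padding spaces used in the tables.
    return float(fps), int(mem)


# Values copied from the rife table (1 stream, 2 streams).
rife = {
    "ort-cuda": ["53.62/1771", "83.34/2748"],
    "trt": ["71.30/ 626", "107.3/ 962"],
}

for i, label in enumerate(["1 stream", "2 streams"]):
    ort_fps, ort_mem = parse_cell(rife["ort-cuda"][i])
    trt_fps, trt_mem = parse_cell(rife["trt"][i])
    print(f"{label}: trt reaches {trt_fps / ort_fps:.2f}x the fps "
          f"using {trt_mem / ort_mem:.0%} of the memory of ort-cuda")
```

The same parsing applies to the other tables, as long as both cells being compared are present.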

dpir color

backend	1 stream	2 streams
ort-cuda	4.64/3230
trt	10.32/1992	11.61/3475

waifu2x upconv_7

backend	1 stream	2 streams
ort-cuda	11.07/5916	15.04/10899
trt	18.38/2092	31.64/ 3848

waifu2x cunet

backend	1 stream	2 streams
ort-cuda	4.63/8541	5.32/16148
trt	11.44/4771	15.59/ 8972

realesrgan v2/v3

backend	1 stream	2 streams
ort-cuda	8.84/2283	11.10/4202
trt	14.59/1324	21.37/2174

v12: latest CUDA libraries