v12: latest CUDA libraries

@github-actions github-actions released this 01 Nov 10:57
· 240 commits to master since this release

Compared to v11, this release updated CUDA dependencies to CUDA 11.8.0, cuDNN 8.6.0 and TensorRT 8.5.1:

  • Added support for the NVIDIA 40 series GPUs.
  • Added support for RIFE on the trt backend.

Known issues

  • RIFE runs slower than expected on the OV_CPU and ORT_CUDA(fp16=True) backends. The cause is under investigation; please consider ORT_CPU or ORT_CUDA(fp16=False) for now.
  • The NCNN_VK backend does not support RIFE.

Installation Notes

Some advanced features of vsmlrt.py require the numpy and onnx packages. If they are not installed, run pip install onnx numpy.
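Whether the two packages are already importable can be checked before running pip. The snippet below is a minimal sketch that uses only the Python standard library; the helper name `missing_packages` is hypothetical and is not part of vsmlrt.py.

```python
import importlib.util


def missing_packages(names=("numpy", "onnx")):
    """Return the subset of top-level package names that cannot be found.

    `missing_packages` is a hypothetical helper for illustration only;
    it is not provided by vsmlrt.py.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        # Suggest the install command for only the packages that are absent.
        print("Missing packages; try: pip install " + " ".join(missing))
    else:
        print("numpy and onnx are both available.")
```

Run it with the same Python interpreter that VapourSynth uses, since packages installed for a different interpreter will not be visible.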

Benchmark

previous benchmark

Configuration: NVIDIA RTX 3090, driver 526.47, Windows Server 2019, VapourSynth R60, Python 3.11.0, 1080p fp16

Backends: ort-cuda, trt from vs-mlrt v12.

For the trt backend, the engine is built without the CUDA_MODULE_LOADING=LAZY environment variable. The variable is then set during benchmarking to reduce device memory consumption.
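One way to apply the variable from a Python script is sketched below. It assumes the variable has to be in the process environment before CUDA is initialized, so the call belongs at the very top of the script, ahead of any import that loads CUDA. The helper name `enable_lazy_module_loading` is hypothetical and is not part of vs-mlrt.

```python
import os


def enable_lazy_module_loading(env=None):
    """Set CUDA_MODULE_LOADING=LAZY in `env` (defaults to os.environ).

    Hypothetical helper for illustration. It only changes the environment
    mapping; it has no effect on a CUDA context that already exists.
    """
    target = os.environ if env is None else env
    target["CUDA_MODULE_LOADING"] = "LAZY"
    return target


# Call this before any CUDA-backed library is loaded in the process.
enable_lazy_module_loading()
```

Setting the variable in the shell that launches the benchmark has the same effect and avoids any ordering concerns inside the script.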

Data format: fps / GPU memory usage (MB)
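For readers who want to compare backends numerically, each cell can be split into its two parts. The sketch below parses the "fps/MB" format and compares the 1-stream rife results of the two backends from this benchmark; `Cell` and `parse_cell` are hypothetical names introduced only for illustration.

```python
from typing import NamedTuple


class Cell(NamedTuple):
    fps: float
    memory_mb: int


def parse_cell(text):
    """Parse one benchmark cell of the form 'fps/memory', e.g. '53.62/1771'."""
    fps_text, memory_text = text.split("/")
    # strip() tolerates whitespace padding around either number.
    return Cell(float(fps_text.strip()), int(memory_text.strip()))


# 1-stream rife results from this benchmark.
ort_cuda = parse_cell("53.62/1771")
trt = parse_cell("71.30/626")

speedup = trt.fps / ort_cuda.fps
memory_ratio = trt.memory_mb / ort_cuda.memory_mb
print(f"trt vs ort-cuda: {speedup:.2f}x fps, {memory_ratio:.0%} of the memory")
```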

rife(model=44, 1920x1088)

backend    1 stream      2 streams
ort-cuda   53.62/1771    83.34/2748
trt        71.30/626     107.3/962

dpir color

backend    1 stream      2 streams
ort-cuda   4.64/3230
trt        10.32/1992    11.61/3475

waifu2x upconv_7

backend    1 stream      2 streams
ort-cuda   11.07/5916    15.04/10899
trt        18.38/2092    31.64/3848

waifu2x cunet

backend    1 stream      2 streams
ort-cuda   4.63/8541     5.32/16148
trt        11.44/4771    15.59/8972

realesrgan v2/v3

backend    1 stream      2 streams
ort-cuda   8.84/2283     11.10/4202
trt        14.59/1324    21.37/2174