CUGAN
WolframRhodium edited this page Mar 15, 2023 · 19 revisions
Real-CUGAN is a super-resolution neural network for anime-style art. It is based on the waifu2x-cunet network and was trained by bilibili on millions of anime images with a RealESRGANv2-like approach.
Link:
- (stable) https://github.com/AmusementClub/vs-mlrt/releases/download/model-20211209/cugan_v2.7z
The models support upscaling by 2x/3x/4x and also denoising.
- `scale`: 2, 3, or 4
- `noise`: -1, 0, 1, 2, or 3 (like waifu2x); noise levels 1 and 2 are only supported by `scale=2`.
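The parameter constraints above can be sketched as a small pre-flight check. This is only an illustration: `check_cugan_params` is a hypothetical helper, not part of vsmlrt, and it encodes nothing beyond the rules listed on this page.

```python
# Hypothetical helper (not part of vsmlrt): validates the scale/noise
# combinations described on this page before calling CUGAN.
VALID_SCALES = (2, 3, 4)
VALID_NOISE = (-1, 0, 1, 2, 3)


def check_cugan_params(scale: int, noise: int) -> None:
    """Raise ValueError if the scale/noise combination is not listed as supported."""
    if scale not in VALID_SCALES:
        raise ValueError(f"scale must be one of {VALID_SCALES}, got {scale}")
    if noise not in VALID_NOISE:
        raise ValueError(f"noise must be one of {VALID_NOISE}, got {noise}")
    # Noise levels 1 and 2 are only available for the 2x models.
    if noise in (1, 2) and scale != 2:
        raise ValueError(f"noise={noise} is only supported by scale=2, got scale={scale}")


check_cugan_params(scale=2, noise=1)   # accepted
check_cugan_params(scale=4, noise=-1)  # accepted
```

Calling `check_cugan_params(scale=3, noise=1)` would raise `ValueError`, since noise level 1 is only listed for `scale=2`.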
To simplify usage, we provide a Python wrapper module, vsmlrt (release v7 or above).
import vapoursynth as vs
from vapoursynth import core

from vsmlrt import CUGAN, Backend

src = core.std.BlankClip(format=vs.RGBS) # CUGAN only supports RGBS input
# clamp src to [0, 1] to be safe, as out-of-range values will produce large negative output.
src = core.akarin.Expr(src, "x 0 1 clamp")
# backend could be:
# - CPU Backend.OV_CPU(): the recommended CPU backend; generally faster than ORT-CPU.
# - CPU Backend.ORT_CPU(num_streams=1, verbosity=2): vs-ort cpu backend.
# - GPU Backend.ORT_CUDA(device_id=0, cudnn_benchmark=True, num_streams=1, verbosity=2)
# - use device_id to select device
# - set cudnn_benchmark=False to reduce script reload latency when debugging, but with slight throughput performance penalty.
# - GPU Backend.TRT(fp16=True, device_id=0, num_streams=1): TensorRT runtime, the fastest NV GPU runtime.
flt = CUGAN(src, noise=-1, scale=2, backend=Backend.ORT_CUDA())
- Make sure your RGBS input to CUGAN is within [0,1] range. Out of range values will trip the NN into producing large negative values.
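As a plain-Python illustration of what the `"x 0 1 clamp"` expression does to each sample, the sketch below clamps a few values to [0, 1]. `clamp01` is a hypothetical name used only here. If the akarin plugin is not installed, the built-in `core.std.Expr(src, "x 0 max 1 min")` should give the same result; treat that as an assumption to verify against your VapourSynth version.

```python
def clamp01(value: float) -> float:
    """Clamp one sample to [0, 1], mirroring the per-pixel effect of "x 0 1 clamp"."""
    return min(max(value, 0.0), 1.0)


# Values below 0 or above 1 are the ones that would upset the network.
samples = [-0.25, 0.0, 0.5, 1.0, 1.3]
clamped = [clamp01(v) for v in samples]
print(clamped)  # [0.0, 0.0, 0.5, 1.0, 1.0]
```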
Measurements: FPS / Device Memory (MB)
Device memory:
- CPU: private memory including VapourSynth
- GPU: device memory including context
Software: VapourSynth R57, Windows 10 LTSC 2021, Graphics Driver 511.23.
Input size: 1920x1080
- [1] vs-mlrt v7
- [2] Real-CUGAN 7e77b85
- [3] vs-mlrt v8 (driver 511.79)
Model | [1] ort-cuda | [2] pytorch | [3] ort-cuda |
---|---|---|---|
2x | 3.30 / 10445 | 2.36 / 20076 | 3.24 / 10251 |
3x (540p patch) | 1.52 / 9978 | 0.77 / 19304 | |
4x | 1.96 / 18377 | 1.25 / 22353 | 1.93 / 18183 |
Model | [1] ort-cuda | [2] pytorch | [3] ort-cuda |
---|---|---|---|
2x | 4.27 / 10185 | 3.29 / 12258 | 4.40 / 9991 |
3x | 1.61 / 19007 | 1.55 / 21816 | 1.62 / 23442 |
4x | 2.30 / 10181 | 1.43 / 13616 | 2.40 / 9987 |
Software: VapourSynth R57-A4, Windows Server 2022, Graphics Driver 516.94.
Input size: 1920x1080
- [1] vs-mlrt v9
Model | [1] trt | [1] trt (2 streams) |
---|---|---|
2x | 19.4 / 4647 | 26.9 / 8558 |
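To read the trade-off in the table above, the cells can be split into FPS and device memory and compared. The sketch below is plain arithmetic on the two published cells; `parse_cell` is a hypothetical helper, not part of vs-mlrt.

```python
def parse_cell(cell: str) -> tuple[float, float]:
    """Split an "FPS / Device Memory (MB)" table cell into two numbers."""
    fps_text, mem_text = cell.split("/")
    return float(fps_text), float(mem_text)


one_fps, one_mem = parse_cell("19.4 / 4647")  # trt, single stream
two_fps, two_mem = parse_cell("26.9 / 8558")  # trt, 2 streams

speedup = two_fps / one_fps
mem_ratio = two_mem / one_mem
print(f"{speedup:.2f}x FPS for {mem_ratio:.2f}x device memory")
# prints "1.39x FPS for 1.84x device memory"
```

On this measurement, the second stream raises throughput by roughly 39% while using about 84% more device memory.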
Hardware: EPYC Milan 32C64T @2.55 GHz
Software: VapourSynth R57, Windows Server 2019.
Input size: 1920x1080
- [1] vs-mlrt v7
Model | [1] ov-cpu |
---|---|
2x | 0.20 / 22627 |
3x | 0.094 / 40358 |
4x | 0.18 / 53174 |