Hi @LukeLIN-web,
I was not able to reproduce this on an RTX 4090. That said, I would also expect it to work on a 2080 Ti, since that GPU is fully supported for 4-bit quantization with bitsandbytes.
I suspect your stack trace is not giving the full picture: we do not use cublasGemmEx in the 4-bit path, so the call likely comes from a PyTorch operation. You may get a clearer trace by setting CUDA_LAUNCH_BLOCKING=1 in your environment, for example as sketched below.
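A minimal sketch of one way to do this from a Python entry point (the variable must be set before the CUDA context is created, so before importing torch; setting it in the shell instead, e.g. `CUDA_LAUNCH_BLOCKING=1 python repro.py`, is equivalent):

```python
import os

# Force synchronous CUDA kernel launches so the stack trace points at the
# actual failing operation rather than a later, unrelated call. Must be set
# before torch initializes the CUDA context.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

import torch  # import only after the variable is set
```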
System Info
I am using CUDA 12.2, torch 2.1.0a0+29c30b1, bitsandbytes 0.43.3, and Python 3.10
Driver Version: 535.113.01
NVIDIA GeForce RTX 2080 Ti
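A quick way to confirm these versions and the device from Python (the commented outputs are what I would expect on this setup, assuming the build reports them as listed above):

```python
import torch
import bitsandbytes

# Versions and device info, matching the system info above.
print(torch.__version__)                    # 2.1.0a0+29c30b1
print(bitsandbytes.__version__)             # 0.43.3
print(torch.version.cuda)                   # 12.2
print(torch.cuda.get_device_name(0))        # NVIDIA GeForce RTX 2080 Ti
print(torch.cuda.get_device_capability(0))  # (7, 5) -- Turing
```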
Reproduction
Vchitect/Latte#125 (comment)
Expected behavior
https://huggingface.co/docs/bitsandbytes/v0.43.3/installation
What is the GPU requirement for 4-bit quantization? A minimal standalone check is sketched below.
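For testing 4-bit support in isolation from the linked reproduction, a minimal load-and-generate sketch (the model id here is just a small placeholder, not the model from the linked issue; substitute accordingly):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Placeholder model id for illustration only.
model_id = "facebook/opt-125m"

# Standard 4-bit NF4 quantization config for bitsandbytes.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

# A short generation exercises the 4-bit matmul kernels on the GPU.
tokenizer = AutoTokenizer.from_pretrained(model_id)
inputs = tokenizer("Hello", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=8)[0]))
```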