Out of memory #46
There isn't a lot of stack trace there. Do you have some more? Also it would help to know which version didn't have this issue. Normally it's not supposed to fill up system RAM at all, but recent NVIDIA drivers have a somewhat unreliable system memory swap feature that might have been doing that? Really hard to debug without more information.
I don't have much more information to provide, as when it loads and then fails, I only receive this error. I haven't used exui in a few months, but I recently updated everything, including every dependency. I'm also on Nvidia driver version 555.41, if that helps. I've experimented a bit with the system memory fallback in the control panel, but sadly to no avail.
Don't you get more text with the error message than that? There's usually a stack trace to show which function raised the exception.
The latest Windows update seems to have resolved the out-of-memory issue I was experiencing. However, it has also significantly slowed down inference: before the update I was getting 15-20 tokens per second, and after the update it's 1-2 tokens per second. Unfortunately, I don't have a stack trace available that I could provide for further analysis of the performance issue. The slowdown seems to be a direct consequence of the recent Windows update, as no other changes were made to the system configuration.
For reasons I can't pinpoint, server.py now only opens a blank command line after restarting my PC again. Despite reinstalling Python, Torch, and ExLlama, the issue persists. Interestingly, ComfyUI starts up without a hitch, but server.py within exui is problematic and remains inactive.
I've been trying to troubleshoot the server.py issue by following various steps, such as creating a new virtual environment, installing the required dependencies, and setting the CUDA_HOME environment variable. However, despite these efforts, the issue persists.
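As a quick sanity check of that setup, something like the following sketch can confirm the environment variable and whether the Torch build in the venv actually sees the GPU (the CUDA_PATH fallback and the nvcc.exe location are assumptions about a typical Windows CUDA install):
import os
import torch

# Check the environment variable mentioned above and whether it points at a real toolkit.
cuda_home = os.environ.get("CUDA_HOME") or os.environ.get("CUDA_PATH")
print("CUDA_HOME/CUDA_PATH:", cuda_home)
if cuda_home:
    print("nvcc found:", os.path.exists(os.path.join(cuda_home, "bin", "nvcc.exe")))

# Confirm that the Torch install in this venv can see the GPU at all.
print("torch:", torch.__version__, "built for CUDA", torch.version.cuda)
print("GPU visible:", torch.cuda.is_available())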
As far as I can tell this is an issue with the PyTorch extension build system. The best advice I can give is to clear out your ~/.cache/torch_extensions directory, delete any ./build/ folder in ExLlama's repo directory, and try again. There's one more thing that's a bit of a long shot, but I have had to do it when profiling in nsys, since it seems to get confused as to which venv to use: uninstall the … If none of this works, there are also the prebuilt wheels, of course.
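A minimal sketch of that cleanup, assuming the default cache location (TORCH_EXTENSIONS_DIR overrides it if set) and using a placeholder for the repo's ./build/ folder:
import os
import shutil

candidates = [
    os.environ.get("TORCH_EXTENSIONS_DIR", ""),
    os.path.expanduser("~/.cache/torch_extensions"),   # default extension cache on Linux
    os.path.join("path", "to", "exllamav2", "build"),   # placeholder for the repo's ./build/ folder
]

for path in candidates:
    if path and os.path.isdir(path):
        print("removing", path)
        shutil.rmtree(path)
    else:
        print("not found, skipping:", path or "(TORCH_EXTENSIONS_DIR unset)")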
Thanks for the suggestions! I've tried clearing out the ~/.cache/torch_extensions directory, but it turns out I don't have that folder on my system. I wasn't sure if I should create it manually, but I figured PyTorch would probably create it automatically if needed.
The file_baton stuff is a mechanism used by Torch during the extension build process. See here. It shouldn't have anything to wait for if there isn't a ~/.cache/torch_extensions folder at all, so it's very strange. Could you list the output of:
pip show torch ninja exllamav2
nvcc --version
gcc --version
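A rough Python equivalent of those package checks, in case the command-line output is awkward to capture (importlib.metadata is in the standard library on Python 3.8+):
from importlib import metadata
import torch

for pkg in ("torch", "ninja", "exllamav2"):
    try:
        print(pkg, metadata.version(pkg))
    except metadata.PackageNotFoundError:
        print(pkg, "not installed")

# CUDA version the Torch wheel was built against (not necessarily the installed nvcc).
print("torch CUDA build:", torch.version.cuda)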
C:\Users\timoe\exui>pip show torch ninja exllamav2
I had somehow assumed you were on Linux. In that case you want to look for the torch_extensions folder in … The exllamav2 version you have installed there is the JIT version, just to be clear. It shouldn't ever use the Torch extension build system if you use a prebuilt wheel instead. Have you tried this:
pip uninstall exllamav2
pip install https://github.com/turboderp/exllamav2/releases/download/v0.0.19/exllamav2-0.0.19+cu121-cp310-cp310-win_amd64.whl
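After reinstalling, a small sketch like this can confirm which exllamav2 build actually gets imported (the __version__ attribute is an assumption, hence the getattr fallback):
import exllamav2

# If a stray JIT install is shadowing the prebuilt wheel, the path will give it away.
print("exllamav2 loaded from:", exllamav2.__file__)
print("version:", getattr(exllamav2, "__version__", "unknown"))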
I've followed these steps and now get this error:
C:\Users\timoe\exui>python server.py
PyTorch 2.3 was released yesterday. I haven't tried it yet myself, but I do know that the Torch people like to completely break backwards compatibility with every new release. I'll be releasing 0.0.20 soon with prebuilt wheels compiled against Torch 2.3.0 (if that works, hard to predict), but in the meantime you'd probably have more luck downgrading to PyTorch 2.2.0.
I've downgraded to PyTorch 2.2.0 but the same error somehow persists:
Installing collected packages: torch
C:\Users\timoe\exui>python server.py
C:\Users\timoe\exui>
There's something really weird about your setup, I think. The failure happens here:
build_jit = False
try:
    import exllamav2_ext
except ModuleNotFoundError:
    build_jit = True
You're somehow getting an … I think the implication is that the prebuilt extension has been installed, but there's some conflict, perhaps with another library called …
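A minimal standalone probe of that distinction (this is a diagnostic sketch, not exui's actual code): only ModuleNotFoundError falls through to the JIT build, so any other ImportError from a broken or conflicting exllamav2_ext propagates instead.
import traceback

try:
    import exllamav2_ext
    print("prebuilt extension loaded from:", exllamav2_ext.__file__)
except ModuleNotFoundError:
    print("no prebuilt extension installed; the JIT build path would be taken")
except ImportError:
    # The module exists but failed to load, which matches the conflict described above.
    traceback.print_exc()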
I attempted the solution and it seems to have worked. I've tried reinstalling everything in my main path, but apparently there may have been some issues.
I have an issue where, in an older version, first the GPU VRAM would fill up and then the system RAM after that. Since I updated, both fill up at the same time, causing this error:
RuntimeError: CUDA error: out of memory
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.