Tesla P4 and M60 forced into low VRAM mode [Bug] #2661
Comments
As you can see in https://github.com/lllyasviel/Fooocus/blob/main/ldm_patched/modules/model_management.py#L429-L430, that check is the trigger for low VRAM mode.
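For reference, a hedged reconstruction of the kind of check being described (the exact code at those lines may differ; the helper below is illustrative only):

```python
# Rough sketch of the low-VRAM trigger: fall back to low-VRAM mode when the
# model would not fit into the currently free VRAM minus the memory reserved
# for inference. All sizes are byte counts. This mirrors the condition the
# reporter later patched (model_size > ...), not the exact upstream code.
def should_switch_to_lowvram(model_size: int, current_free_mem: int, inference_memory: int) -> bool:
    return model_size > (current_free_mem - inference_memory)
```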
Is there any way to simply force the model to load normally? I believe I have enough VRAM. I tried forcing normal and high VRAM modes, but they didn't work.
Please debug this yourself and provide further information.
I think I located the problem: the Tesla driver seems to limit VRAM usage to 8102 MiB instead of the card's 8192 MiB. I found this by disabling low VRAM mode, changing the condition to model_size > (99999999999999).
But I still don't understand why it works on my 2080 Max-Q: it only utilizes about 6785 MiB there and runs just fine.
It looks like I need to somehow shave about 100 MiB of VRAM off the program. Is there any way to run the GPT-2 part on the CPU?
In general yes, but please check first if it works with the Fooocus V2 style disabled.
OK, I'll try that.
I disabled the Fooocus V2 style, but the same error still occurred.
From my testing, I believe the Tesla drivers for the M60 and P4 limit the maximum VRAM to 8094 MiB instead of 8192 MiB.
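One way to check what the driver actually reports to PyTorch (standard torch calls, not Fooocus-specific; device index 0 assumed):

```python
import torch

# Print the total and currently free VRAM reported for device 0. If the driver
# reserves memory (e.g. for ECC), the total shown here will be below the
# nominal 8192 MiB.
props = torch.cuda.get_device_properties(0)
free_bytes, total_bytes = torch.cuda.mem_get_info(0)
print(f"{props.name}: total={total_bytes / 2**20:.0f} MiB, free={free_bytes / 2**20:.0f} MiB")
```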
So this issue can be closed, as this is a driver issue with your cards?
Is there a way to make the GPT-2 model run on the CPU or another GPU to limit VRAM usage? Also, it could still be a bug; I'm not sure, because the behavior is odd. It runs on my 2080 Max-Q without ever filling the GPU beyond 7 GiB, but on the Teslas it initially tries to fill the VRAM to 8 GiB, which fails, as I believe they are limited to about 8100 MiB.
You can force it to be on CPU by setting line 65 in e2f9bcb to torch.device("cpu"), or by adding a line that returns torch.device("cpu") in Fooocus/ldm_patched/modules/model_management.py (lines 526 to 537 in e2f9bcb).
But keep in mind that prompt expansion is only used when the Fooocus V2 style is selected, so this might not be the right place to begin with.
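For illustration only, a minimal sketch of the second suggestion, assuming the lines in question belong to a device-selection helper (the function name and the surrounding code in commit e2f9bcb are not verified here):

```python
import torch

# Hypothetical device-selection helper in ldm_patched/modules/model_management.py.
# Returning the CPU here keeps the GPT-2 prompt-expansion model off the GPU so
# it does not compete with the diffusion model for VRAM.
def text_encoder_device():
    return torch.device("cpu")
```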
This would only make the text model run on the CPU, not the image model, correct?
Yes.
I tried that, but it didn't really work. Now that you are aware of this problem, are there any plans to trim the VRAM requirements by about 200 MiB so that it can run on 8 GB Tesla GPUs?
No plans for in-depth testing on P4 and M60 cards; Fooocus works on 4 GB VRAM, so this must be an issue with your driver reporting wrong numbers.
Yeah, that sucks. But I did just buy a Tesla M40, which has 24 GB of VRAM, so hopefully that works. Last question: is there any possibility of adding multi-GPU support like Ollama has?
See #2292. What you can do instead is start multiple instances of Fooocus.
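A rough sketch of that workaround, pinning one Fooocus process per GPU via CUDA_VISIBLE_DEVICES (the --port flag and the port numbers are assumptions; only --listen appears elsewhere in this issue):

```python
import os
import subprocess

def launch_instance(gpu_index: int, port: int) -> subprocess.Popen:
    # Each Fooocus process only sees one GPU and serves its own port.
    env = os.environ.copy()
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_index)
    return subprocess.Popen(
        ["python", "entry_with_update.py", "--listen", "--port", str(port)],
        env=env,
    )

if __name__ == "__main__":
    procs = [launch_instance(0, 7865), launch_instance(1, 7866)]
    for p in procs:
        p.wait()
```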
This is a weird driver setting issue with P4s. By default, they run with ECC memory enabled. Disable ECC with "nvidia-smi -e 0"; that should release the full 8 GB of VRAM.
Checklist
What happened?
I have a Tesla M60 and a P4 running in a Linux VM (the same problem occurred on Windows). I've tried running them, but Fooocus always runs in low VRAM mode.
Steps to reproduce the problem
run conda activate fooocus
python entry_with_update.py --listen
What should have happened?
I think it shouldn't run in low VRAM mode (correct me if I'm wrong). It runs just fine on my 2080 Max-Q but has these low VRAM problems on the Tesla cards I have tested.
What browsers do you use to access Fooocus?
Mozilla Firefox
Where are you running Fooocus?
Locally with virtualization (e.g. Docker)
What operating system are you using?
Ubuntu 20.04 and Windows 10
Console logs
Additional information
Current driver version:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.147.05 Driver Version: 525.147.05 CUDA Version: 12.0 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 Tesla P4 Off | 00000000:0B:00.0 Off | Off |
| N/A 75C P0 41W / 75W | 6458MiB / 8192MiB | 47% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 2184 C python 6456MiB |
+-----------------------------------------------------------------------------+
I have also tried driver version 550.