[Bug]: Tesla P4 and M60 forced into low VRAM mode #2661

Closed · 5 tasks done
jhemley opened this issue Mar 29, 2024 · 21 comments
Labels: bug (Something isn't working), feedback pending (Waiting for further information)

Comments


jhemley commented Mar 29, 2024

Checklist

  • The issue has not been resolved by following the troubleshooting guide
  • The issue exists on a clean installation of Fooocus
  • The issue exists in the current version of Fooocus
  • The issue has not been reported before recently
  • The issue has been reported before but has not been fixed yet

What happened?

I have a Tesla M60 and a P4 running in a Linux VM (the same problem occurred on Windows). I've tried running them, but Fooocus always runs in low VRAM mode.

Steps to reproduce the problem

run conda activate fooocus
python entry_with_update.py --listen

What should have happened?

I think it shouldn't run in low VRAM mode (correct me if I'm wrong). It runs just fine on my 2080 Max-Q, but has these low VRAM problems on the Tesla cards I have tested with.

What browsers do you use to access Fooocus?

Mozilla Firefox

Where are you running Fooocus?

Locally with virtualization (e.g. Docker)

What operating system are you using?

Ubuntu 20.04 and Windows 10

Console logs

python entry_with_update.py --always-normal-vram --listen
Already up-to-date
Update succeeded.
[System ARGV] ['entry_with_update.py', '--always-normal-vram', '--listen']
Python 3.10.14 (main, Mar 21 2024, 16:24:04) [GCC 11.2.0]
Fooocus version: 2.3.1
[Cleanup] Attempting to delete content of temp dir /tmp/fooocus
[Cleanup] Cleanup successful
Total VRAM 8116 MB, total RAM 64308 MB
Set vram state to: NORMAL_VRAM
Always offload VRAM
Device: cuda:0 Tesla P4 : native
VAE dtype: torch.float32
Using pytorch cross attention
Refiner unloaded.
Running on local URL:  http://0.0.0.0:7865

To create a public link, set `share=True` in `launch()`.
model_type EPS
UNet ADM Dimension 2816
Using pytorch attention in VAE
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
Using pytorch attention in VAE
extra {'cond_stage_model.clip_l.text_projection', 'cond_stage_model.clip_l.logit_scale', 'cond_stage_model.clip_g.transformer.text_model.embeddings.position_ids'}
Base model loaded: /home/jared/Fooocus/models/checkpoints/juggernautXL_v8Rundiffusion.safetensors
Request to load LoRAs [['sd_xl_offset_example-lora_1.0.safetensors', 0.1], ['None', 1.0], ['None', 1.0], ['None', 1.0], ['None', 1.0]] for model [/home/jared/Fooocus/models/checkpoints/juggernautXL_v8Rundiffusion.safetensors].
Loaded LoRA [/home/jared/Fooocus/models/loras/sd_xl_offset_example-lora_1.0.safetensors] for UNet [/home/jared/Fooocus/models/checkpoints/juggernautXL_v8Rundiffusion.safetensors] with 788 keys at weight 0.1.
Fooocus V2 Expansion: Vocab with 642 words.
Fooocus Expansion engine loaded for cpu, use_fp16 = False.
Requested to load SDXLClipModel
Requested to load GPT2LMHeadModel
Loading 2 new models
Started worker with PID 2184
App started successful. Use the app with http://localhost:7865/ or 0.0.0.0:7865
[Parameters] Adaptive CFG = 7
[Parameters] Sharpness = 2
[Parameters] ControlNet Softness = 0.25
[Parameters] ADM Scale = 1.5 : 0.8 : 0.3
[Parameters] CFG = 4.0
[Parameters] Seed = 59858353226061117
[Parameters] Sampler = dpmpp_2m_sde_gpu - karras
[Parameters] Steps = 30 - 15
[Fooocus] Initializing ...
[Fooocus] Loading models ...
Refiner unloaded.
[Fooocus] Processing prompts ...
[Fooocus] Preparing Fooocus text #1 ...
[Prompt Expansion] make a picture of a lambo, cinematic, phenomenal, creative, dynamic, dramatic, thought, epic, elegant, intricate, detailed, extremely light, shining, complimentary colors, shiny, glowing, winning, grand elaborate complex, highly decorated, open flowing, deep color, very beautiful, symmetry, great composition, atmosphere, perfect, artistic, innocent, inspiring, unique
[Fooocus] Preparing Fooocus text #2 ...
[Prompt Expansion] make a picture of a lambo, detailed, elegant, holy, impressive, noble, gorgeous, amazing, fancy, dramatic, colorful, very inspirational, beautiful, illuminated background, epic composition, magical atmosphere, cinematic, symmetry, pure, solid colors, extremely, highly complex, determined, imposing, futuristic, professional, artistic, creative, vibrant, fine detail, color
[Fooocus] Encoding positive #1 ...
[Fooocus] Encoding positive #2 ...
[Fooocus] Encoding negative #1 ...
[Fooocus] Encoding negative #2 ...
[Parameters] Denoising Strength = 1.0
[Parameters] Initial Latent shape: Image Space (896, 1152)
Preparation time: 9.55 seconds
[Sampler] refiner_swap_method = joint
[Sampler] sigma_min = 0.0291671771556139, sigma_max = 14.614643096923828
Requested to load SDXL
Loading 1 new model
loading in lowvram mode 5363.8427734375
[Fooocus Model Management] Moving model(s) has taken 9.02 seconds
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 30/30 [04:40<00:00,  9.35s/it]
Requested to load AutoencoderKL
Loading 1 new model
[Fooocus Model Management] Moving model(s) has taken 1.39 seconds
Image generated with private log at: /home/jared/Fooocus/outputs/2024-03-29/log.html
Generating and saving time: 294.14 seconds
[Sampler] refiner_swap_method = joint
[Sampler] sigma_min = 0.0291671771556139, sigma_max = 14.614643096923828
Requested to load SDXL
Loading 1 new model
loading in lowvram mode 5331.772085189819
[Fooocus Model Management] Moving model(s) has taken 8.64 seconds
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 30/30 [04:44<00:00,  9.49s/it]
Requested to load AutoencoderKL
Loading 1 new model
[Fooocus Model Management] Moving model(s) has taken 1.33 seconds
Image generated with private log at: /home/jared/Fooocus/outputs/2024-03-29/log.html
Generating and saving time: 297.74 seconds
Total time: 601.53 seconds

Additional information

Current driver version:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.147.05 Driver Version: 525.147.05 CUDA Version: 12.0 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 Tesla P4 Off | 00000000:0B:00.0 Off | Off |
| N/A 75C P0 41W / 75W | 6458MiB / 8192MiB | 47% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 2184 C python 6456MiB |
+-----------------------------------------------------------------------------+
I have also tried driver version 550.

jhemley added the labels bug (Something isn't working) and triage (This needs an (initial) review) on Mar 29, 2024
mashb1t (Collaborator) commented Mar 31, 2024

As you can see in https://github.com/lllyasviel/Fooocus/blob/main/ldm_patched/modules/model_management.py#L429-L430, the trigger for low VRAM mode is model_size > (current_free_mem - inference_memory).
Please check the model size and debug the other parameters in the given code by adding a breakpoint and using the Python debugger, or by printing the values. Thanks!
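As a minimal debugging sketch (assuming the surrounding load_models_gpu code matches the linked revision; the exact variable names may differ in your checkout), you could print the three values right before the check:

    # inside load_models_gpu, just before the lowvram decision (all values in bytes)
    model_size = loaded_model.model_memory_required(torch_dev)
    current_free_mem = get_free_memory(torch_dev)
    print(f"[debug] model_size={model_size / 2**20:.0f} MiB, "
          f"current_free_mem={current_free_mem / 2**20:.0f} MiB, "
          f"inference_memory={inference_memory / 2**20:.0f} MiB")
    if model_size > (current_free_mem - inference_memory):
        vram_set_state = VRAMState.LOW_VRAM  # only switch to lowvram if really necessary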

mashb1t added the label feedback pending (Waiting for further information) and removed triage (This needs an (initial) review) on Mar 31, 2024
jhemley (Author) commented Mar 31, 2024

Is there any way to simply force the model to load fully? I believe I have enough VRAM. I tried the force normal and high VRAM flags, but they didn't work.

mashb1t (Collaborator) commented Mar 31, 2024

> Please check the model size and debug the other parameters in the given code by adding a breakpoint and using the Python debugger, or by printing the values. Thanks!

Please debug this yourself and provide further information.

jhemley (Author) commented Mar 31, 2024

I think I located the problem: the Tesla driver seems to limit VRAM usage to 8102 MiB instead of the full 8192 MiB on the card. I found this by disabling low VRAM mode, changing the comparison to model_size > (99999999999999), as sketched below.
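Sketched against the check linked above (variable names as in that file; treat this as an illustration of the edit, not a recommended patch):

    # was: if model_size > (current_free_mem - inference_memory):
    if model_size > (99999999999999):
        # condition is now never true, so the lowvram branch is never taken
        vram_set_state = VRAMState.LOW_VRAM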
With that change, it now outputs this:
python entry_with_update.py --listen
Already up-to-date
Update succeeded.
[System ARGV] ['entry_with_update.py', '--listen']
Python 3.10.14 (main, Mar 21 2024, 16:24:04) [GCC 11.2.0]
Fooocus version: 2.3.1
[Cleanup] Attempting to delete content of temp dir /tmp/fooocus
[Cleanup] Cleanup successful
Total VRAM 8123 MB, total RAM 32100 MB
Set vram state to: NORMAL_VRAM
Always offload VRAM
Device: cuda:0 Tesla M60 : native
VAE dtype: torch.float32
Using pytorch cross attention
Refiner unloaded.
Running on local URL: http://0.0.0.0:7865

Thanks for being a Gradio user! If you have questions or feedback, please join our Discord server and chat with us: https://discord.gg/feTf9x3ZSB

To create a public link, set share=True in launch().
model_type EPS
UNet ADM Dimension 2816
Using pytorch attention in VAE
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
Using pytorch attention in VAE
extra {'cond_stage_model.clip_l.text_projection', 'cond_stage_model.clip_g.transformer.text_model.embeddings.position_ids', 'cond_stage_model.clip_l.logit_scale'}
Base model loaded: /home/jared/Fooocus/models/checkpoints/juggernautXL_v8Rundiffusion.safetensors
Request to load LoRAs [['sd_xl_offset_example-lora_1.0.safetensors', 0.1], ['None', 1.0], ['None', 1.0], ['None', 1.0], ['None', 1.0]] for model [/home/jared/Fooocus/models/checkpoints/juggernautXL_v8Rundiffusion.safetensors].
Loaded LoRA [/home/jared/Fooocus/models/loras/sd_xl_offset_example-lora_1.0.safetensors] for UNet [/home/jared/Fooocus/models/checkpoints/juggernautXL_v8Rundiffusion.safetensors] with 788 keys at weight 0.1.
Fooocus V2 Expansion: Vocab with 642 words.
Fooocus Expansion engine loaded for cpu, use_fp16 = False.
Requested to load SDXLClipModel
Requested to load GPT2LMHeadModel
Loading 2 new models
Started worker with PID 1658
App started successful. Use the app with http://localhost:7865/ or 0.0.0.0:7865
[Parameters] Adaptive CFG = 7
[Parameters] Sharpness = 2
[Parameters] ControlNet Softness = 0.25
[Parameters] ADM Scale = 1.5 : 0.8 : 0.3
[Parameters] CFG = 4.0
[Parameters] Seed = 291429156536229784
[Parameters] Sampler = dpmpp_2m_sde_gpu - karras
[Parameters] Steps = 30 - 15
[Fooocus] Initializing ...
[Fooocus] Loading models ...
Refiner unloaded.
[Fooocus] Processing prompts ...
[Fooocus] Preparing Fooocus text #1 ...
[Prompt Expansion] make a car, bright colors, elegant, highly detailed, sharp focus, beautiful, intricate, cinematic, new classic, sunny, shining, deep aesthetic, appealing, artistic, fine detail, awesome color, dynamic light, great composition, clear professional background, creative, innocent, scenic, positive, unique, attractive, cute, perfect, focused, vibrant, epic, best
[Fooocus] Preparing Fooocus text #2 ...
[Prompt Expansion] make a car, expressive, dynamic composition, dramatic, elegant, highly detailed, sharp focus, beautiful, perfect light, attractive, innocent, divine, sublime, epic, stunning, inspired, vibrant, intricate, brilliant, thought, cinematic, background, illuminated, professional, best, creative, winning, romantic, fantastic, scenic, artistic, fabulous, bright, hopeful, cute
[Fooocus] Encoding positive #1 ...
[Fooocus] Encoding positive #2 ...
[Fooocus] Encoding negative #1 ...
[Fooocus] Encoding negative #2 ...
[Parameters] Denoising Strength = 1.0
[Parameters] Initial Latent shape: Image Space (896, 1152)
Preparation time: 12.90 seconds
[Sampler] refiner_swap_method = joint
[Sampler] sigma_min = 0.0291671771556139, sigma_max = 14.614643096923828
Requested to load SDXL
Loading 1 new model
ERROR diffusion_model.output_blocks.1.1.transformer_blocks.9.ff.net.0.proj.weight CUDA out of memory. Tried to allocate 50.00 MiB. GPU 0 has a total capacity of 7.93 GiB of which 17.62 MiB is free. Including non-PyTorch memory, this process has 7.91 GiB memory in use. Of the allocated memory 7.53 GiB is allocated by PyTorch, and 306.44 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
Traceback (most recent call last):
  File "/home/jared/Fooocus/modules/async_worker.py", line 913, in worker
    handler(task)
  File "/home/jared/anaconda3/envs/fooocus/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/jared/anaconda3/envs/fooocus/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/jared/Fooocus/modules/async_worker.py", line 816, in handler
    imgs = pipeline.process_diffusion(
  File "/home/jared/anaconda3/envs/fooocus/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/jared/anaconda3/envs/fooocus/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/jared/Fooocus/modules/default_pipeline.py", line 362, in process_diffusion
    sampled_latent = core.ksampler(
  File "/home/jared/anaconda3/envs/fooocus/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/jared/anaconda3/envs/fooocus/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/jared/Fooocus/modules/core.py", line 308, in ksampler
    samples = ldm_patched.modules.sample.sample(model,
  File "/home/jared/Fooocus/ldm_patched/modules/sample.py", line 93, in sample
    real_model, positive_copy, negative_copy, noise_mask, models = prepare_sampling(model, noise.shape, positive, negative, noise_mask)
  File "/home/jared/Fooocus/ldm_patched/modules/sample.py", line 86, in prepare_sampling
    ldm_patched.modules.model_management.load_models_gpu([model] + models, model.memory_required([noise_shape[0] * 2] + list(noise_shape[1:])) + inference_memory)
  File "/home/jared/Fooocus/modules/patch.py", line 447, in patched_load_models_gpu
    y = ldm_patched.modules.model_management.load_models_gpu_origin(*args, **kwargs)
  File "/home/jared/Fooocus/ldm_patched/modules/model_management.py", line 437, in load_models_gpu
    cur_loaded_model = loaded_model.model_load(lowvram_model_memory)
  File "/home/jared/Fooocus/ldm_patched/modules/model_management.py", line 304, in model_load
    raise e
  File "/home/jared/Fooocus/ldm_patched/modules/model_management.py", line 300, in model_load
    self.real_model = self.model.patch_model(device_to=patch_model_to) #TODO: do something with loras and offloading to CPU
  File "/home/jared/Fooocus/ldm_patched/modules/model_patcher.py", line 199, in patch_model
    temp_weight = ldm_patched.modules.model_management.cast_to_device(weight, device_to, torch.float32, copy=True)
  File "/home/jared/Fooocus/ldm_patched/modules/model_management.py", line 615, in cast_to_device
    return tensor.to(device, copy=copy, non_blocking=non_blocking).to(dtype, non_blocking=non_blocking)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 26.00 MiB. GPU 0 has a total capacity of 7.93 GiB of which 17.62 MiB is free. Including non-PyTorch memory, this process has 7.91 GiB memory in use. Of the allocated memory 7.53 GiB is allocated by PyTorch, and 306.79 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
Total time: 16.25 seconds
nvidia-smi shows this:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.147.05 Driver Version: 525.147.05 CUDA Version: 12.0 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 Tesla M60 Off | 00000000:0B:00.0 Off | Off |
| N/A 43C P0 39W / 150W | 8105MiB / 8192MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 1658 C python 8102MiB |
+-----------------------------------------------------------------------------+

jhemley (Author) commented Mar 31, 2024

But I still don't understand why it works on my 2080 Max-Q, because it only utilizes about 6785 MiB and runs just fine. Here's that report:
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 537.13 Driver Version: 537.13 CUDA Version: 12.2 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name TCC/WDDM | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 NVIDIA GeForce RTX 2080 ... WDDM | 00000000:01:00.0 On | N/A |
| N/A 52C P0 81W / 80W | 6785MiB / 8192MiB | 98% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+

Already up-to-date
Update succeeded.
[System ARGV] ['Fooocus\entry_with_update.py']
Python 3.10.9 (tags/v3.10.9:1dd9be6, Dec 6 2022, 20:01:21) [MSC v.1934 64 bit (AMD64)]
Fooocus version: 2.3.1
[Cleanup] Attempting to delete content of temp dir C:\Users\hemle\AppData\Local\Temp\fooocus
[Cleanup] Cleanup successful
Total VRAM 8192 MB, total RAM 65397 MB
Set vram state to: NORMAL_VRAM
Always offload VRAM
Device: cuda:0 NVIDIA GeForce RTX 2080 with Max-Q Design : native
VAE dtype: torch.float32
Using pytorch cross attention
Refiner unloaded.
Running on local URL: http://127.0.0.1:7865

To create a public link, set share=True in launch().
model_type EPS
UNet ADM Dimension 2816
Using pytorch attention in VAE
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
Using pytorch attention in VAE
extra {'cond_stage_model.clip_g.transformer.text_model.embeddings.position_ids', 'cond_stage_model.clip_l.logit_scale', 'cond_stage_model.clip_l.text_projection'}
Base model loaded: C:\Users\hemle\Downloads\Fooocus_win64_2-1-831\Fooocus\models\checkpoints\juggernautXL_v8Rundiffusion.safetensors
Request to load LoRAs [['sd_xl_offset_example-lora_1.0.safetensors', 0.1], ['None', 1.0], ['None', 1.0], ['None', 1.0], ['None', 1.0]] for model [C:\Users\hemle\Downloads\Fooocus_win64_2-1-831\Fooocus\models\checkpoints\juggernautXL_v8Rundiffusion.safetensors].
Loaded LoRA [C:\Users\hemle\Downloads\Fooocus_win64_2-1-831\Fooocus\models\loras\sd_xl_offset_example-lora_1.0.safetensors] for UNet [C:\Users\hemle\Downloads\Fooocus_win64_2-1-831\Fooocus\models\checkpoints\juggernautXL_v8Rundiffusion.safetensors] with 788 keys at weight 0.1.
Fooocus V2 Expansion: Vocab with 642 words.
Fooocus Expansion engine loaded for cuda:0, use_fp16 = True.
Requested to load SDXLClipModel
Requested to load GPT2LMHeadModel
Loading 2 new models
[Fooocus Model Management] Moving model(s) has taken 0.62 seconds
Started worker with PID 12820
App started successful. Use the app with http://127.0.0.1:7865/ or 127.0.0.1:7865
[Parameters] Adaptive CFG = 7
[Parameters] Sharpness = 2
[Parameters] ControlNet Softness = 0.25
[Parameters] ADM Scale = 1.5 : 0.8 : 0.3
[Parameters] CFG = 4.0
[Parameters] Seed = 3296201712917260942
[Parameters] Sampler = dpmpp_2m_sde_gpu - karras
[Parameters] Steps = 30 - 15
[Fooocus] Initializing ...
[Fooocus] Loading models ...
Refiner unloaded.
[Fooocus] Processing prompts ...
[Fooocus] Preparing Fooocus text #1 ...
[Prompt Expansion] make a car, cinematic, dynamic, dramatic ambient light, detailed, intricate, elegant, highly saturated colors, strong, epic, stunning, heroic, amazing detail, creative, positive, attractive, cute, beautiful, confident, inspired, pretty, perfect, coherent, trendy, best, awesome, futuristic, cool, inspirational, vibrant, loving, full, color, complex
[Fooocus] Preparing Fooocus text #2 ...
[Prompt Expansion] make a car, colorful, vivid, detailed, breathtaking, beautiful, emotional, shiny, shining, highly detail, amazing, flowing, light, complex, color, surreal, ambient, pristine, dynamic, symmetry, sharp focus, epic, fine, very strong, winning, perfect, artistic, innocent, confident, attractive, incredible, creative, positive, unique, loving
[Fooocus] Encoding positive #1 ...
[Fooocus Model Management] Moving model(s) has taken 0.15 seconds
[Fooocus] Encoding positive #2 ...
[Fooocus] Encoding negative #1 ...
[Fooocus] Encoding negative #2 ...
[Parameters] Denoising Strength = 1.0
[Parameters] Initial Latent shape: Image Space (896, 1152)
Preparation time: 3.90 seconds
[Sampler] refiner_swap_method = joint
[Sampler] sigma_min = 0.0291671771556139, sigma_max = 14.614643096923828
Requested to load SDXL
Loading 1 new model
[Fooocus Model Management] Moving model(s) has taken 2.06 seconds
100%|██████████████████████████████████████████████████████████████████████████████████| 30/30 [00:23<00:00, 1.30it/s]
Requested to load AutoencoderKL
Loading 1 new model
[Fooocus Model Management] Moving model(s) has taken 0.20 seconds
Image generated with private log at: C:\Users\hemle\Downloads\Fooocus_win64_2-1-831\Fooocus\outputs\2024-03-31\log.html
Generating and saving time: 27.49 seconds
[Sampler] refiner_swap_method = joint
[Sampler] sigma_min = 0.0291671771556139, sigma_max = 14.614643096923828
Requested to load SDXL
Loading 1 new model
[Fooocus Model Management] Moving model(s) has taken 1.37 seconds
100%|██████████████████████████████████████████████████████████████████████████████████| 30/30 [00:23<00:00, 1.30it/s]
Requested to load AutoencoderKL
Loading 1 new model
[Fooocus Model Management] Moving model(s) has taken 0.18 seconds
Image generated with private log at: C:\Users\hemle\Downloads\Fooocus_win64_2-1-831\Fooocus\outputs\2024-03-31\log.html
Generating and saving time: 26.77 seconds
Requested to load SDXLClipModel
Requested to load GPT2LMHeadModel
Loading 2 new models
Total time: 63.45 seconds
[Fooocus Model Management] Moving model(s) has taken 0.59 seconds
[Parameters] Adaptive CFG = 7
[Parameters] Sharpness = 2
[Parameters] ControlNet Softness = 0.25
[Parameters] ADM Scale = 1.5 : 0.8 : 0.3
[Parameters] CFG = 4.0
[Parameters] Seed = 208600173302938237
[Parameters] Sampler = dpmpp_2m_sde_gpu - karras
[Parameters] Steps = 30 - 15
[Fooocus] Initializing ...
[Fooocus] Loading models ...
Refiner unloaded.
[Fooocus] Processing prompts ...
[Fooocus] Preparing Fooocus text #1 ...
[Prompt Expansion] make a car, cool color, perfect shiny deep background, sharp focus, intricate, elegant, highly detailed, dramatic light, professional still, dynamic composition, ambient atmosphere, vivid colors, beautiful, epic, stunning, creative, cinematic, fine detail, full clear, great quality, attractive, cheerful, novel, romantic, scenic, rich, hopeful, cute, radiant, colorful
[Fooocus] Preparing Fooocus text #2 ...
[Prompt Expansion] make a car, colorful, shiny, vivid, detailed, amazing, flowing, infinite, light, color, epic, atmosphere, new, dynamic, ambient, cinematic, elegant, intricate, highly focused, creative, pure, artistic, romantic, sunny, beautiful, deep, unique, vibrant, coherent, colors, perfect, illuminated, pretty, clear, shining, flawless
[Fooocus] Encoding positive #1 ...
[Fooocus Model Management] Moving model(s) has taken 0.13 seconds
[Fooocus] Encoding positive #2 ...
[Fooocus] Encoding negative #1 ...
[Fooocus] Encoding negative #2 ...
[Parameters] Denoising Strength = 1.0
[Parameters] Initial Latent shape: Image Space (896, 1152)
Preparation time: 3.24 seconds
[Sampler] refiner_swap_method = joint
[Sampler] sigma_min = 0.0291671771556139, sigma_max = 14.614643096923828
Requested to load SDXL
Loading 1 new model
[Fooocus Model Management] Moving model(s) has taken 2.20 seconds
100%|██████████████████████████████████████████████████████████████████████████████████| 30/30 [00:22<00:00, 1.33it/s]
Requested to load AutoencoderKL
Loading 1 new model
[Fooocus Model Management] Moving model(s) has taken 0.23 seconds
Image generated with private log at: C:\Users\hemle\Downloads\Fooocus_win64_2-1-831\Fooocus\outputs\2024-03-31\log.html
Generating and saving time: 27.06 seconds
[Sampler] refiner_swap_method = joint
[Sampler] sigma_min = 0.0291671771556139, sigma_max = 14.614643096923828
Requested to load SDXL
Loading 1 new model
[Fooocus Model Management] Moving model(s) has taken 1.36 seconds
100%|██████████████████████████████████████████████████████████████████████████████████| 30/30 [00:23<00:00, 1.30it/s]
Requested to load AutoencoderKL
Loading 1 new model
[Fooocus Model Management] Moving model(s) has taken 0.18 seconds
Image generated with private log at: C:\Users\hemle\Downloads\Fooocus_win64_2-1-831\Fooocus\outputs\2024-03-31\log.html
Generating and saving time: 26.80 seconds
Requested to load SDXLClipModel
Requested to load GPT2LMHeadModel
Loading 2 new models
Total time: 57.17 seconds
[Fooocus Model Management] Moving model(s) has taken 0.58 seconds

jhemley (Author) commented Mar 31, 2024

It looks like I need to somehow shave 100 MiB of VRAM off the program. Is there any way to run the GPT-2 part on the CPU?

mashb1t (Collaborator) commented Mar 31, 2024

In general yes, but please first check whether it works with the Fooocus V2 style disabled.

jhemley (Author) commented Mar 31, 2024

OK, I'll try that.

jhemley (Author) commented Mar 31, 2024

I disabled the Fooocus V2 style, but the same error still occurred.

jhemley (Author) commented Mar 31, 2024

From my testing, I believe the Tesla drivers for the M60 and P4 limit the maximum VRAM to 8094 MiB instead of 8192 MiB.

mashb1t (Collaborator) commented Apr 1, 2024

So this issue can be closed as this is a driver issue with your cards?

jhemley (Author) commented Apr 1, 2024

Is there a way to make the GPT-2 model run on the CPU or another GPU to limit VRAM usage? Also, it could still be a bug, I am not sure, because the behavior is odd: it runs on my 2080 Max-Q without ever filling the GPU to more than 7 GiB, but on the Teslas it initially tries to fill the VRAM to 8 GiB, which fails, as I believe they are limited to 8100 MiB.

mashb1t (Collaborator) commented Apr 1, 2024

You can force it to be on CPU by setting

    load_device = model_management.text_encoder_device()

to torch.device("cpu"), or modify

    def text_encoder_device():
        if args.always_gpu:
            return get_torch_device()
        elif vram_state == VRAMState.HIGH_VRAM or vram_state == VRAMState.NORMAL_VRAM:
            if is_intel_xpu():
                return torch.device("cpu")
            if should_use_fp16(prioritize_performance=False):
                return get_torch_device()
            else:
                return torch.device("cpu")
        else:
            return torch.device("cpu")

to always return torch.device("cpu").
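For example, the simplest form of that change (a sketch, pinning the text encoder to CPU unconditionally) would be:

    def text_encoder_device():
        # always place the text encoder (CLIP / prompt expansion) on CPU,
        # regardless of VRAM state
        return torch.device("cpu")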

But keep in mind that prompt expansion is only used when setting style Fooocus V2, so this might not be the right place to begin with.

jhemley (Author) commented Apr 1, 2024

> ...prompt expansion is only used when setting style Fooocus V2, so this might not be the right place to begin with.

This would only make the text model run on the CPU, not the image model, correct?

mashb1t (Collaborator) commented Apr 1, 2024

yes

jhemley (Author) commented Apr 1, 2024

I tried that, but it didn't really work. Now that you are aware of this problem, are there any plans to trim the VRAM requirements by about 200 MiB so Fooocus can run on 8 GB Tesla GPUs?

mashb1t (Collaborator) commented Apr 1, 2024

There are no plans for in-depth testing on P4 and M60 cards; Fooocus works on 4 GB of VRAM, so this must be an issue with your driver reporting wrong numbers.

mashb1t closed this as not planned (won't fix, can't repro, duplicate, stale) on Apr 1, 2024
jhemley (Author) commented Apr 1, 2024

Yeah, that sucks. But I did just buy a Tesla M40, which has 24 GB of VRAM, so hopefully that works. Last question: is there any possibility of adding multi-GPU support like Ollama has?

mashb1t (Collaborator) commented Apr 1, 2024

See #2292

What you can do instead is start multiple instances of Fooocus.
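For example (a sketch; this assumes the standard --port flag and pins one GPU per instance via CUDA_VISIBLE_DEVICES):

    # one Fooocus instance per GPU, each on its own port
    CUDA_VISIBLE_DEVICES=0 python entry_with_update.py --listen --port 7865 &
    CUDA_VISIBLE_DEVICES=1 python entry_with_update.py --listen --port 7866 &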

CultusMechanicus commented

This is a weird driver setting issue with P4s: by default, they run with ECC memory enabled. Disable ECC with "nvidia-smi -e 0"; that should release the full 8 GB of VRAM.
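For example (a sketch; assumes GPU index 0, and disabling ECC requires root plus a reboot to take effect):

    sudo nvidia-smi -i 0 -e 0   # disable ECC on GPU 0
    sudo reboot                 # ECC change applies after reboot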
