Out of memory error on PC #780

Closed

jdmoser1 opened this issue Oct 24, 2023 · 7 comments

@jdmoser1

Hello. I just installed Fooocus, let it download the SDXL models, and did my first test run. It failed to complete the run with the message:
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 14.00 MiB. GPU 0 has a total capacty of 6.00 GiB of which 3.55 GiB is free. Of the allocated memory 1.36 GiB is allocated by PyTorch, and 77.12 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

I'm running on a desktop with 8 GB of RAM and an NVIDIA GeForce RTX 2060 with 6 GB of VRAM. I was able to get SDXL working in Automatic1111 on this machine before, though with a slightly different model version.
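
Following the hint in the error message itself, one thing worth trying is setting max_split_size_mb through the PYTORCH_CUDA_ALLOC_CONF environment variable before launching. This is a documented PyTorch setting; the value 512 below is only an arbitrary starting point, not a tested recommendation:

REM Assumption: 512 is an example split size in MiB, not a verified value for this GPU
set PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:512
.\python_embeded\python.exe -s Fooocus\entry_with_update.py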

@lllyasviel
Owner

Can you show the full log?

@jdmoser1
Author

Where should I look for the log?

@jdmoser1
Author

This is the output in the console:

.\python_embeded\python.exe -s Fooocus\entry_with_update.py
Already up-to-date
Update succeeded.
Python 3.10.9 (tags/v3.10.9:1dd9be6, Dec 6 2022, 20:01:21) [MSC v.1934 64 bit (AMD64)]
Fooocus version: 2.1.739
Running on local URL: http://127.0.0.1:7860

To create a public link, set share=True in launch().
Total VRAM 6144 MB, total RAM 8109 MB
Set vram state to: NORMAL_VRAM
Device: cuda:0 NVIDIA GeForce RTX 2060 : native
VAE dtype: torch.float32
Using pytorch cross attention
[Fooocus] Disabling smart memory
model_type EPS
adm 2560
Using pytorch attention in VAE
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
Using pytorch attention in VAE
missing {'cond_stage_model.clip_g.transformer.text_model.embeddings.position_ids'}
Refiner model loaded: C:\Users\justin\AppData\Roaming\Fooocus\Fooocus\models\checkpoints\sd_xl_refiner_1.0_0.9vae.safetensors
model_type EPS
adm 2816
Using pytorch attention in VAE
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
Using pytorch attention in VAE
missing {'cond_stage_model.clip_l.text_projection', 'cond_stage_model.clip_l.logit_scale', 'cond_stage_model.clip_g.transformer.text_model.embeddings.position_ids'}
Base model loaded: C:\Users\justin\AppData\Roaming\Fooocus\Fooocus\models\checkpoints\sd_xl_base_1.0_0.9vae.safetensors
LoRAs loaded: [('sd_xl_offset_example-lora_1.0.safetensors', 0.5), ('None', 0.5), ('None', 0.5), ('None', 0.5), ('None', 0.5)]
Fooocus Expansion engine loaded for cuda:0, use_fp16 = True.
Requested to load SDXLClipModel
Requested to load GPT2LMHeadModel
Loading 2 new models
[Fooocus Model Management] Moving model(s) has taken 1.20 seconds
App started successful. Use the app with http://127.0.0.1:7860/ or 127.0.0.1:7860
[Parameters] Adaptive CFG = 7
[Parameters] Sharpness = 2
[Parameters] ADM Scale = 1.5 : 0.8 : 0.3
[Parameters] CFG = 7.0
[Parameters] Seed = 8375762911681427531
[Parameters] Sampler = dpmpp_2m_sde_gpu - karras
[Parameters] Steps = 30 - 20
[Fooocus] Initializing ...
[Fooocus] Loading models ...
[Fooocus] Processing prompts ...
[Fooocus] Preparing Fooocus text #1 ...
[Prompt Expansion] New suffix: intricate, elegant, highly detailed, digital painting, artstation, concept art, smooth, sharp focus, illustration, art by artgerm and greg rutkowski and alphonse mucha and william - adolphe bouguereau
[Fooocus] Preparing Fooocus text #2 ...
[Prompt Expansion] New suffix: extremely detailed painting by dustin nguyen, akihiko yoshida, greg tocchini, greg rutkowski, cliff chiang, 4 k resolution, trending on artstation
[Fooocus] Encoding positive #1 ...
[Fooocus Model Management] Moving model(s) has taken 0.29 seconds
[Fooocus] Encoding positive #2 ...
[Fooocus] Encoding negative #1 ...
[Fooocus] Encoding negative #2 ...
Preparation time: 5.99 seconds
[Sampler] refiner_swap_method = joint
[Sampler] sigma_min = 0.02916753850877285, sigma_max = 14.614643096923828
Requested to load SDXL
Loading 1 new model
loading in lowvram mode 3124.6401739120483
Traceback (most recent call last):
File "C:\Users\justin\AppData\Roaming\Fooocus\Fooocus\modules\async_worker.py", line 584, in worker
handler(task)
File "C:\Users\justin\AppData\Roaming\Fooocus\python_embeded\lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "C:\Users\justin\AppData\Roaming\Fooocus\python_embeded\lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "C:\Users\justin\AppData\Roaming\Fooocus\Fooocus\modules\async_worker.py", line 517, in handler
imgs = pipeline.process_diffusion(
File "C:\Users\justin\AppData\Roaming\Fooocus\python_embeded\lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "C:\Users\justin\AppData\Roaming\Fooocus\python_embeded\lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "C:\Users\justin\AppData\Roaming\Fooocus\Fooocus\modules\default_pipeline.py", line 352, in process_diffusion
sampled_latent = core.ksampler(
File "C:\Users\justin\AppData\Roaming\Fooocus\python_embeded\lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "C:\Users\justin\AppData\Roaming\Fooocus\python_embeded\lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "C:\Users\justin\AppData\Roaming\Fooocus\Fooocus\modules\core.py", line 263, in ksampler
samples = fcbh.sample.sample(model, noise, steps, cfg, sampler_name, scheduler, positive, negative, latent_image,
File "C:\Users\justin\AppData\Roaming\Fooocus\Fooocus\backend\headless\fcbh\sample.py", line 90, in sample
real_model, positive_copy, negative_copy, noise_mask, models = prepare_sampling(model, noise.shape, positive, negative, noise_mask)
File "C:\Users\justin\AppData\Roaming\Fooocus\Fooocus\backend\headless\fcbh\sample.py", line 81, in prepare_sampling
fcbh.model_management.load_models_gpu([model] + models, fcbh.model_management.batch_area_memory(noise_shape[0] * noise_shape[2] * noise_shape[3]) + inference_memory)
File "C:\Users\justin\AppData\Roaming\Fooocus\Fooocus\modules\patch.py", line 464, in patched_load_models_gpu
y = fcbh.model_management.load_models_gpu_origin(*args, **kwargs)
File "C:\Users\justin\AppData\Roaming\Fooocus\Fooocus\backend\headless\fcbh\model_management.py", line 406, in load_models_gpu
cur_loaded_model = loaded_model.model_load(lowvram_model_memory)
File "C:\Users\justin\AppData\Roaming\Fooocus\Fooocus\backend\headless\fcbh\model_management.py", line 294, in model_load
accelerate.dispatch_model(self.real_model, device_map=device_map, main_device=self.device)
File "C:\Users\justin\AppData\Roaming\Fooocus\python_embeded\lib\site-packages\accelerate\big_modeling.py", line 371, in dispatch_model
attach_align_device_hook_on_blocks(
File "C:\Users\justin\AppData\Roaming\Fooocus\python_embeded\lib\site-packages\accelerate\hooks.py", line 536, in attach_align_device_hook_on_blocks
attach_align_device_hook_on_blocks(
File "C:\Users\justin\AppData\Roaming\Fooocus\python_embeded\lib\site-packages\accelerate\hooks.py", line 536, in attach_align_device_hook_on_blocks
attach_align_device_hook_on_blocks(
File "C:\Users\justin\AppData\Roaming\Fooocus\python_embeded\lib\site-packages\accelerate\hooks.py", line 506, in attach_align_device_hook_on_blocks
add_hook_to_module(module, hook)
File "C:\Users\justin\AppData\Roaming\Fooocus\python_embeded\lib\site-packages\accelerate\hooks.py", line 155, in add_hook_to_module
module = hook.init_hook(module)
File "C:\Users\justin\AppData\Roaming\Fooocus\python_embeded\lib\site-packages\accelerate\hooks.py", line 253, in init_hook
set_module_tensor_to_device(module, name, self.execution_device)
File "C:\Users\justin\AppData\Roaming\Fooocus\python_embeded\lib\site-packages\accelerate\utils\modeling.py", line 292, in set_module_tensor_to_device
new_value = old_value.to(device)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 20.00 MiB. GPU 0 has a total capacty of 6.00 GiB of which 2.62 GiB is free. Of the allocated memory 2.25 GiB is allocated by PyTorch, and 107.39 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
Total time: 168.76 seconds

@lllyasviel
Owner

Hello, this should be caused by the system RAM being too old, so it does not support the transformers accelerate hooks.
The 2060 is supported by Fooocus and well tested; it should work if the system RAM is in order.
If Automatic1111 works on this device, then the best thing to do is probably to keep using Automatic1111.
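
To check whether the system RAM really is in order, total and free physical memory can be read from the same console just before launching. This is a standard Windows command, nothing Fooocus-specific:

REM Total and currently free physical memory, reported in KB
wmic OS get TotalVisibleMemorySize,FreePhysicalMemory /Value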

@jdmoser1
Author

Okay, I may do that. Is there no available workaround for the RAM issue?

@FlatMapIO

FlatMapIO commented Oct 31, 2023

In my test (#792), starting on a system with 16 GB or less of CPU memory may be a problem.

@lllyasviel
Owner

lllyasviel commented Oct 31, 2023

Hi all, Fooocus can run in 12 GB RAM on Linux without swap; see also the Colab demo (Linux, 12 GB RAM, no swap, T4 GPU) in the README. But this may also be related to many complicated factors like CPU architecture, etc.
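
For anyone comparing a local setup against that Colab data point, the RAM/swap configuration can be confirmed with standard Linux tools (not part of Fooocus) before launching:

free -h   # total and available RAM, plus swap size
lscpu     # CPU architecture details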

@mashb1t closed this as completed Dec 31, 2023