
[BUG]Memory Issue when generating images for the second time #602

Closed
captainpd opened this issue Oct 9, 2023 · 5 comments

Comments

@captainpd

When I generate images the first time with one image prompt, everything works fine. However, on the second generation, the GPU runs out of memory.

Here is the error:

Preparation time: 19.46 seconds
loading new
ERROR diffusion_model.output_blocks.0.1.transformer_blocks.4.ff.net.0.proj.weight CUDA out of memory. Tried to allocate 50.00 MiB (GPU 0; 22.41 GiB total capacity; 21.52 GiB already allocated; 11.69 MiB free; 22.21 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
ERROR diffusion_model.output_blocks.0.1.transformer_blocks.5.attn1.to_v.weight CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 22.41 GiB total capacity; 21.56 GiB already allocated; 11.69 MiB free; 22.21 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
ERROR diffusion_model.output_blocks.0.1.transformer_blocks.5.attn1.to_out.0.weight CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 22.41 GiB total capacity; 21.56 GiB already allocated; 11.69 MiB free; 22.21 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
Traceback (most recent call last):
File "D:\Repos\Fooocus\modules\async_worker.py", line 551, in worker
handler(task)
File "D:\Repos\Fooocus\venv\Lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "D:\Repos\Fooocus\venv\Lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "D:\Repos\Fooocus\modules\async_worker.py", line 460, in handler
comfy.model_management.load_models_gpu([pipeline.final_unet])
File "D:\Repos\Fooocus\repositories\ComfyUI-from-StabilityAI-Official\comfy\model_management.py", line 397, in load_models_gpu
cur_loaded_model = loaded_model.model_load(lowvram_model_memory)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Repos\Fooocus\repositories\ComfyUI-from-StabilityAI-Official\comfy\model_management.py", line 286, in model_load
raise e
File "D:\Repos\Fooocus\repositories\ComfyUI-from-StabilityAI-Official\comfy\model_management.py", line 282, in model_load
self.real_model = self.model.patch_model(device_to=patch_model_to) #TODO: do something with loras and offloading to CPU
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Repos\Fooocus\repositories\ComfyUI-from-StabilityAI-Official\comfy\model_patcher.py", line 161, in patch_model
temp_weight = comfy.model_management.cast_to_device(weight, device_to, torch.float32, copy=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Repos\Fooocus\repositories\ComfyUI-from-StabilityAI-Official\comfy\model_management.py", line 498, in cast_to_device
return tensor.to(device, copy=copy).to(dtype)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 50.00 MiB (GPU 0; 22.41 GiB total capacity; 21.56 GiB already allocated; 11.69 MiB free; 22.21 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
Total time: 209.24 seconds
Keyboard interruption in main thread... closing server.

Process finished with exit code -1
On my second attempt to track this error, I added a breakpoint. When I clicked generate and execution hit the breakpoint, I found that six models had been loaded into memory. Maybe this is the issue?
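The suspicion above — that stale models pile up in VRAM until the next load tips it over — can be illustrated with a small, torch-free sketch of a bounded model cache that unloads the least-recently-used entries before admitting a new one. All names and sizes here are hypothetical, not Fooocus or ComfyUI APIs:

```python
from collections import OrderedDict

class BoundedModelCache:
    """Hypothetical sketch (not a Fooocus/ComfyUI API): unload the
    least-recently-used model before a new load would exceed the budget."""

    def __init__(self, budget_mib):
        self.budget_mib = budget_mib
        self.loaded = OrderedDict()  # model name -> size in MiB

    def used_mib(self):
        return sum(self.loaded.values())

    def load(self, name, size_mib):
        if size_mib > self.budget_mib:
            raise MemoryError(f"{name} ({size_mib} MiB) exceeds the budget")
        self.loaded.pop(name, None)  # reloading just refreshes recency
        # Unload the oldest models until the new one fits.
        while self.used_mib() + size_mib > self.budget_mib:
            evicted, freed_mib = self.loaded.popitem(last=False)
            print(f"unloading {evicted} (freed {freed_mib} MiB)")
        self.loaded[name] = size_mib

cache = BoundedModelCache(budget_mib=22000)
cache.load("base_unet", 10000)
cache.load("refiner_unet", 10000)
cache.load("clip", 2000)        # the budget is now exactly full
cache.load("controlnet", 5000)  # evicts base_unet instead of failing
```

Without the eviction loop, the fourth `load` in this toy model would fail the same way the traceback does: the budget is already full of models loaded by earlier requests.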

(screenshot showing the six models loaded in memory)

Thanks for the help!
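Separately, the OOM text itself points at one documented PyTorch knob: the caching allocator's `max_split_size_mb`, set through the `PYTORCH_CUDA_ALLOC_CONF` environment variable. It only mitigates fragmentation and would not fix a genuine leak, but it is cheap to try before launching Fooocus:

```shell
# Cap the CUDA caching allocator's split size to reduce fragmentation.
# (bash syntax; on Windows cmd use:  set PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:512)
export PYTORCH_CUDA_ALLOC_CONF="max_split_size_mb:512"
echo "$PYTORCH_CUDA_ALLOC_CONF"
```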

@lllyasviel (Owner)

Give full logs from the start.
Also, does this always happen or only rarely?

@markuskonojacki

markuskonojacki commented Oct 10, 2023

I do not know if the issues are related in the sense that it's a memory leak, but my second generation of pictures is significantly slower as well. Here is a full log of a first and second generation back to back:

D:\Fooocus>.\python_embeded\python.exe -s Fooocus\entry_with_update.py
Fast-forward merge
Update succeeded.
Python 3.10.9 (tags/v3.10.9:1dd9be6, Dec  6 2022, 20:01:21) [MSC v.1934 64 bit (AMD64)]
Fooocus version: 2.1.37
Inference Engine exists and URL is correct.
Inference Engine checkout finished for d1a0abd40b86f3f079b0cc71e49f9f4604831457.
Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.
Total VRAM 16376 MB, total RAM 64829 MB
xformers version: 0.0.20
Set vram state to: NORMAL_VRAM
Device: cuda:0 NVIDIA GeForce RTX 4080 : native
VAE dtype: torch.bfloat16
Using xformers cross attention
model_type EPS
adm 2560
Refiner model loaded: D:\Fooocus\Fooocus\models\checkpoints\sd_xl_refiner_1.0_0.9vae.safetensors
model_type EPS
adm 2816
making attention of type 'vanilla-xformers' with 512 in_channels
building MemoryEfficientAttnBlock with 512 in_channels...
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
making attention of type 'vanilla-xformers' with 512 in_channels
building MemoryEfficientAttnBlock with 512 in_channels...
missing {'cond_stage_model.clip_l.text_projection', 'cond_stage_model.clip_l.logit_scale', 'cond_stage_model.clip_g.transformer.text_model.embeddings.position_ids'}
Base model loaded: D:\Fooocus\Fooocus\models\checkpoints\sd_xl_base_1.0_0.9vae.safetensors
LoRAs loaded: [('sd_xl_offset_example-lora_1.0.safetensors', 0.5), ('None', 0.5), ('None', 0.5), ('None', 0.5), ('None', 0.5)]
Fooocus Expansion engine loaded for cuda:0, use_fp16 = True.
loading new
App started successful. Use the app with http://127.0.0.1:7860/ or 127.0.0.1:7860
[Parameters] Adaptive CFG = 7
[Parameters] Sharpness = 2
[Parameters] ADM Scale = 1.5 : 0.8 : 0.3
[Parameters] CFG = 7.0
[Parameters] Sampler = dpmpp_2m_sde_gpu - karras
[Parameters] Steps = 30 - 20
[Fooocus] Initializing ...
[Fooocus] Loading models ...
[Fooocus] Processing prompts ...
[Fooocus] Preparing Fooocus text #1 ...
[Prompt Expansion] New suffix: extremely detailed, photorealistic, octane render, 8k, unreal engine. art by artgerm and greg rutkowski and alphonse mucha
[Fooocus] Preparing Fooocus text #2 ...
[Prompt Expansion] New suffix: intricate, highly detailed, digital painting, artstation, concept art, smooth, sharp focus, illustration, Unreal Engine 5, 8K, art by artgerm and greg rutkowski and alphonse mucha
[Fooocus] Encoding positive #1 ...
[Fooocus] Encoding positive #2 ...
[Fooocus] Encoding negative #1 ...
[Fooocus] Encoding negative #2 ...
Preparation time: 1.03 seconds
loading new
Moving model to GPU: 1.72 seconds
  0%|                                                                                                                  | 0/30 [00:00<?, ?it/s]D:\Fooocus\python_embeded\lib\site-packages\torchsde\_brownian\brownian_interval.py:594: UserWarning: Should have tb<=t1 but got tb=14.614643096923828 and t1=14.614643.
  warnings.warn(f"Should have {tb_name}<=t1 but got {tb_name}={tb} and t1={self._end}.")
 67%|██████████████████████████████████████████████████████████████████████                                   | 20/30 [00:05<00:02,  3.72it/s]loading new
Refiner Swapped
 93%|██████████████████████████████████████████████████████████████████████████████████████████████████       | 28/30 [00:10<00:00,  2.74it/s]D:\Fooocus\python_embeded\lib\site-packages\torchsde\_brownian\brownian_interval.py:585: UserWarning: Should have ta>=t0 but got ta=0.02916753850877285 and t0=0.029168.
  warnings.warn(f"Should have ta>=t0 but got ta={ta} and t0={self._start}.")
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████| 30/30 [00:10<00:00,  2.79it/s]
Image generated with private log at: D:\Fooocus\Fooocus\outputs\2023-10-10\log.html
Generating and saving time: 12.48 seconds
loading new
 67%|██████████████████████████████████████████████████████████████████████                                   | 20/30 [00:05<00:02,  3.81it/s]Refiner Swapped
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████| 30/30 [00:08<00:00,  3.70it/s]
Image generated with private log at: D:\Fooocus\Fooocus\outputs\2023-10-10\log.html
Generating and saving time: 13.57 seconds
loading new
Total time: 30.27 seconds
[Parameters] Adaptive CFG = 7
[Parameters] Sharpness = 2
[Parameters] ADM Scale = 1.5 : 0.8 : 0.3
[Parameters] CFG = 7.0
[Parameters] Sampler = dpmpp_2m_sde_gpu - karras
[Parameters] Steps = 30 - 20
[Fooocus] Initializing ...
[Fooocus] Loading models ...
[Fooocus] Processing prompts ...
[Fooocus] Preparing Fooocus text #1 ...
[Prompt Expansion] New suffix: intricate, detailed, volumetric lighting, scenery, digital painting, highly detailed, artstation, sharp focus, illustration, concept art, ruan jia, steve mccurry
[Fooocus] Preparing Fooocus text #2 ...
[Prompt Expansion] New suffix: extremely detailed, photorealistic, octane render, 8k, unreal engine, art by Artgerm and Greg Rutkowski and Alphonse Mucha
[Fooocus] Encoding positive #1 ...
[Fooocus] Encoding positive #2 ...
[Fooocus] Encoding negative #1 ...
[Fooocus] Encoding negative #2 ...
Preparation time: 0.85 seconds
loading new
Moving model to GPU: 2.55 seconds
 67%|██████████████████████████████████████████████████████████████████████                                   | 20/30 [09:17<04:38, 27.89s/it]Refiner Swapped
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████| 30/30 [14:32<00:00, 29.07s/it]
Image generated with private log at: D:\Fooocus\Fooocus\outputs\2023-10-10\log.html
Generating and saving time: 874.75 seconds
loading new
 67%|██████████████████████████████████████████████████████████████████████████████████████████████████████████                                                     | 20/30 [00:05<00:02,  3.68it/s]Refiner Swapped
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 30/30 [00:08<00:00,  3.66it/s]
Image generated with private log at: D:\Fooocus\Fooocus\outputs\2023-10-10\log.html
Generating and saving time: 13.56 seconds
loading new
Total time: 893.02 seconds

For good measure I did it a second time: I just reopened run.bat and started with a different prompt:

D:\Fooocus>.\python_embeded\python.exe -s Fooocus\entry_with_update.py
Already up-to-date
Update succeeded.
Python 3.10.9 (tags/v3.10.9:1dd9be6, Dec  6 2022, 20:01:21) [MSC v.1934 64 bit (AMD64)]
Fooocus version: 2.1.37
Inference Engine exists and URL is correct.
Inference Engine checkout finished for d1a0abd40b86f3f079b0cc71e49f9f4604831457.
Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.
Total VRAM 16376 MB, total RAM 64829 MB
xformers version: 0.0.20
Set vram state to: NORMAL_VRAM
Device: cuda:0 NVIDIA GeForce RTX 4080 : native
VAE dtype: torch.bfloat16
Using xformers cross attention
model_type EPS
adm 2560
Refiner model loaded: D:\Fooocus\Fooocus\models\checkpoints\sd_xl_refiner_1.0_0.9vae.safetensors
model_type EPS
adm 2816
making attention of type 'vanilla-xformers' with 512 in_channels
building MemoryEfficientAttnBlock with 512 in_channels...
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
making attention of type 'vanilla-xformers' with 512 in_channels
building MemoryEfficientAttnBlock with 512 in_channels...
missing {'cond_stage_model.clip_l.text_projection', 'cond_stage_model.clip_l.logit_scale', 'cond_stage_model.clip_g.transformer.text_model.embeddings.position_ids'}
Base model loaded: D:\Fooocus\Fooocus\models\checkpoints\sd_xl_base_1.0_0.9vae.safetensors
LoRAs loaded: [('sd_xl_offset_example-lora_1.0.safetensors', 0.5), ('None', 0.5), ('None', 0.5), ('None', 0.5), ('None', 0.5)]
Fooocus Expansion engine loaded for cuda:0, use_fp16 = True.
loading new
App started successful. Use the app with http://127.0.0.1:7860/ or 127.0.0.1:7860
[Parameters] Adaptive CFG = 7
[Parameters] Sharpness = 2
[Parameters] ADM Scale = 1.5 : 0.8 : 0.3
[Parameters] CFG = 7.0
[Parameters] Sampler = dpmpp_2m_sde_gpu - karras
[Parameters] Steps = 30 - 20
[Fooocus] Initializing ...
[Fooocus] Loading models ...
[Fooocus] Processing prompts ...
[Fooocus] Preparing Fooocus text #1 ...
[Prompt Expansion] New suffix: extremely detailed, digital painting, in the style of Fenghua Zhong and Ruan Jia and jeremy lipking and Peter Mohrbacher, mystical colors, rim light, beautiful Lighting, 8k, stunning scene, raytracing, octane, trending on artstation
[Fooocus] Preparing Fooocus text #2 ...
[Prompt Expansion] New suffix: intricate, elegant, highly detailed, digital painting, artstation, concept art, addiction, chains, smooth, sharp focus, illustration, art by ilja repin
[Fooocus] Encoding positive #1 ...
[Fooocus] Encoding positive #2 ...
[Fooocus] Encoding negative #1 ...
[Fooocus] Encoding negative #2 ...
Preparation time: 1.03 seconds
loading new
Moving model to GPU: 1.53 seconds
  0%|                                                                                           | 0/30 [00:00<?, ?it/s]D:\Fooocus\python_embeded\lib\site-packages\torchsde\_brownian\brownian_interval.py:594: UserWarning: Should have tb<=t1 but got tb=14.614643096923828 and t1=14.614643.
  warnings.warn(f"Should have {tb_name}<=t1 but got {tb_name}={tb} and t1={self._end}.")
 67%|██████████████████████████████████████████████████████▋                           | 20/30 [00:05<00:02,  3.63it/s]loading new
Refiner Swapped
 93%|████████████████████████████████████████████████████████████████████████████▌     | 28/30 [00:34<00:07,  3.63s/it]D:\Fooocus\python_embeded\lib\site-packages\torchsde\_brownian\brownian_interval.py:585: UserWarning: Should have ta>=t0 but got ta=0.02916753850877285 and t0=0.029168.
  warnings.warn(f"Should have ta>=t0 but got ta={ta} and t0={self._start}.")
100%|██████████████████████████████████████████████████████████████████████████████████| 30/30 [00:42<00:00,  1.42s/it]
Image generated with private log at: D:\Fooocus\Fooocus\outputs\2023-10-10\log.html
Generating and saving time: 48.19 seconds
loading new
 67%|██████████████████████████████████████████████████████▋                           | 20/30 [00:05<00:02,  3.60it/s]Refiner Swapped
100%|██████████████████████████████████████████████████████████████████████████████████| 30/30 [00:08<00:00,  3.60it/s]
Image generated with private log at: D:\Fooocus\Fooocus\outputs\2023-10-10\log.html
Generating and saving time: 10.85 seconds
loading new
Total time: 62.95 seconds
[Parameters] Adaptive CFG = 7
[Parameters] Sharpness = 2
[Parameters] ADM Scale = 1.5 : 0.8 : 0.3
[Parameters] CFG = 7.0
[Parameters] Sampler = dpmpp_2m_sde_gpu - karras
[Parameters] Steps = 30 - 20
[Fooocus] Initializing ...
[Fooocus] Loading models ...
[Fooocus] Processing prompts ...
[Fooocus] Preparing Fooocus text #1 ...
[Prompt Expansion] New suffix: intricate, elegant, highly detailed, digital painting, artstation, concept art, addiction, chains, smooth, sharp focus, illustration, art by ilja repin
[Fooocus] Preparing Fooocus text #2 ...
[Prompt Expansion] New suffix: extremely detailed, digital painting, in the style of Fenghua Zhong and Ruan Jia and jeremy lipking and Peter Mohrbacher, mystical colors, rim light, beautiful Lighting, 8k, stunning scene, raytracing, octane, trending on artstation
[Fooocus] Encoding positive #1 ...
[Fooocus] Encoding positive #2 ...
[Fooocus] Encoding negative #1 ...
[Fooocus] Encoding negative #2 ...
Preparation time: 0.85 seconds
loading new
Moving model to GPU: 2.68 seconds
 67%|██████████████████████████████████████████████████████▋                           | 20/30 [07:38<02:48, 16.87s/it]Refiner Swapped
100%|██████████████████████████████████████████████████████████████████████████████████| 30/30 [11:38<00:00, 23.28s/it]
Image generated with private log at: D:\Fooocus\Fooocus\outputs\2023-10-10\log.html
Generating and saving time: 704.17 seconds
loading new
 67%|██████████████████████████████████████████████████████▋                           | 20/30 [00:05<00:02,  3.56it/s]Refiner Swapped
100%|██████████████████████████████████████████████████████████████████████████████████| 30/30 [00:08<00:00,  3.56it/s]
Image generated with private log at: D:\Fooocus\Fooocus\outputs\2023-10-10\log.html
Generating and saving time: 13.78 seconds
loading new
Total time: 722.79 seconds

GPU memory usage after this, with Fooocus closed: 3.8/16 GB
starting Fooocus: 6.5/16 GB
clicking generate: 15.5/16 GB
after generation stays at: 12.2/16 GB
clicking generate a second time only goes up to: 15/16 GB

This is reproducible for me.
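VRAM checkpoints like the ones above can be captured programmatically: `nvidia-smi` has a documented CSV query mode, and a small parser makes before/after comparisons scriptable. The sample line in the demo is illustrative, not a reading from this machine:

```python
import subprocess

def parse_vram(csv_line):
    """Parse one 'used, total' line from nvidia-smi's CSV output (nounits mode)."""
    used, total = (int(field.strip()) for field in csv_line.strip().split(","))
    return used, total

def query_vram():
    """Ask the driver for the current usage of GPU 0, in MiB."""
    out = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=memory.used,memory.total",
         "--format=csv,noheader,nounits"],
        text=True,
    )
    return parse_vram(out.splitlines()[0])

# Illustrative sample line (not a real reading):
used, total = parse_vram("12492, 16376")
print(f"{used}/{total} MiB used")
```

Logging `query_vram()` before and after each generate click would turn the manual readings above into a reproducible trace.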

@captainpd (Author)

> Give full logs from the start. Also, does this always happen or only rarely?

Here is the full log.
Python 3.11.4 (tags/v3.11.4:d2340ef, Jun 7 2023, 05:45:37) [MSC v.1934 64 bit (AMD64)]
Fooocus version: 2.1.37
Inference Engine exists and URL is correct.
Inference Engine checkout finished for d1a0abd40b86f3f079b0cc71e49f9f4604831457.
Running on local URL: http://127.0.0.1:7860

To create a public link, set share=True in launch().
Total VRAM 22945 MB, total RAM 16250 MB
xformers version: 0.0.20
Set vram state to: NORMAL_VRAM
Device: cuda:0 Tesla P40 : native
VAE dtype: torch.float32
Using xformers cross attention
model_type EPS
adm 2560
Refiner model loaded: D:\Repos\Fooocus\models\checkpoints\sd_xl_refiner_1.0_0.9vae.safetensors
model_type EPS
adm 2816
making attention of type 'vanilla-xformers' with 512 in_channels
building MemoryEfficientAttnBlock with 512 in_channels...
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
making attention of type 'vanilla-xformers' with 512 in_channels
building MemoryEfficientAttnBlock with 512 in_channels...
missing {'cond_stage_model.clip_l.text_projection', 'cond_stage_model.clip_g.transformer.text_model.embeddings.position_ids', 'cond_stage_model.clip_l.logit_scale'}
loaded straight to GPU
loading new
Base model loaded: D:\Repos\Fooocus\models\checkpoints\sd_xl_base_1.0_0.9vae.safetensors
LoRAs loaded: [('sd_xl_offset_example-lora_1.0.safetensors', 0.5), ('None', 0.5), ('None', 0.5), ('None', 0.5), ('None', 0.5)]
Fooocus Expansion engine loaded for cuda:0, use_fp16 = False.
loading new
App started successful. Use the app with http://127.0.0.1:7860/ or 127.0.0.1:7860
[Parameters] Adaptive CFG = 7
[Parameters] Sharpness = 2
[Parameters] ADM Scale = 1.5 : 0.8 : 0.3
[Parameters] CFG = 7.0
[Fooocus] Downloading control models ...
[Fooocus] Loading control models ...
missing clip vision: ['vision_model.embeddings.position_ids']
[Parameters] Sampler = dpmpp_2m_sde_gpu - karras
[Parameters] Steps = 30 - 20
[Fooocus] Initializing ...
[Fooocus] Loading models ...
[Fooocus] Processing prompts ...
[Fooocus] Preparing Fooocus text #1 ...
[Prompt Expansion] New suffix: extremely detailed oil painting, unreal 5 render, rhads, Bruce Pennington, Studio Ghibli, tim hildebrandt, digital art, octane render, beautiful composition, trending on artstation, award-winning photograph, masterpiece
[Fooocus] Preparing Fooocus text #2 ...
[Prompt Expansion] New suffix: intricate, elegant, highly detailed animal monster, digital painting, artstation, concept art, smooth, sharp focus, illustration, art by artgerm and greg rutkowski and alphonse mucha, 8 k
[Fooocus] Encoding positive #1 ...
[Fooocus] Encoding positive #2 ...
[Fooocus] Encoding negative #1 ...
[Fooocus] Encoding negative #2 ...
[Fooocus] Image processing ...
loading new
loading new
loading new
Preparation time: 12.02 seconds
loading new
unload clone 5
Moving model to GPU: 10.80 seconds
0%| | 0/30 [00:00<?, ?it/s]D:\Repos\Fooocus\venv\Lib\site-packages\torchsde_brownian\brownian_interval.py:594: UserWarning: Should have tb<=t1 but got tb=14.614643096923828 and t1=14.614643.
warnings.warn(f"Should have {tb_name}<=t1 but got {tb_name}={tb} and t1={self._end}.")
67%|██████████████████████████████████████████████████████▋ | 20/30 [00:44<00:22, 2.21s/it]loading new
Refiner Swapped
93%|████████████████████████████████████████████████████████████████████████████▌ | 28/30 [02:02<00:07, 3.85s/it]D:\Repos\Fooocus\venv\Lib\site-packages\torchsde_brownian\brownian_interval.py:585: UserWarning: Should have ta>=t0 but got ta=0.02916753850877285 and t0=0.029168.
warnings.warn(f"Should have ta>=t0 but got ta={ta} and t0={self._start}.")
100%|██████████████████████████████████████████████████████████████████████████████████| 30/30 [02:07<00:00, 4.26s/it]
Image generated with private log at: D:\Repos\Fooocus\outputs\2023-10-10\log.html
Generating and saving time: 131.54 seconds
loading new
0%| | 0/30 [00:00<?, ?it/s]D:\Repos\Fooocus\venv\Lib\site-packages\torchsde_brownian\brownian_interval.py:594: UserWarning: Should have tb<=t1 but got tb=14.614643096923828 and t1=14.614643.
warnings.warn(f"Should have {tb_name}<=t1 but got {tb_name}={tb} and t1={self._end}.")
67%|██████████████████████████████████████████████████████▋ | 20/30 [00:44<00:22, 2.21s/it]Refiner Swapped
93%|████████████████████████████████████████████████████████████████████████████▌ | 28/30 [01:03<00:04, 2.38s/it]D:\Repos\Fooocus\venv\Lib\site-packages\torchsde_brownian\brownian_interval.py:585: UserWarning: Should have ta>=t0 but got ta=0.02916753850877285 and t0=0.029168.
warnings.warn(f"Should have ta>=t0 but got ta={ta} and t0={self._start}.")
100%|██████████████████████████████████████████████████████████████████████████████████| 30/30 [01:08<00:00, 2.27s/it]
Image generated with private log at: D:\Repos\Fooocus\outputs\2023-10-10\log.html
Generating and saving time: 101.09 seconds
loading new
Total time: 257.71 seconds
[Parameters] Adaptive CFG = 7
[Parameters] Sharpness = 2
[Parameters] ADM Scale = 1.5 : 0.8 : 0.3
[Parameters] CFG = 7.0
[Fooocus] Downloading control models ...
[Fooocus] Loading control models ...
[Parameters] Sampler = dpmpp_2m_sde_gpu - karras
[Parameters] Steps = 30 - 20
[Fooocus] Initializing ...
[Fooocus] Loading models ...
[Fooocus] Processing prompts ...
[Fooocus] Preparing Fooocus text #1 ...
[Prompt Expansion] New suffix: intricate, elegant, highly detailed, digital painting, artstation, concept art, smooth, sharp focus, illustration, art by artgerm and greg rutkowski and alphonse mucha and william - adolphe bouguereau
[Fooocus] Preparing Fooocus text #2 ...
[Prompt Expansion] New suffix: extremely detailed digital painting, in the style of fenghua zhong and ruan jia and jeremy lipking and peter mohrbacher, mystical colors, rim light, beautiful lighting, 8 k, stunning scene, raytracing, octane, trending on artstation
[Fooocus] Encoding positive #1 ...
[Fooocus] Encoding positive #2 ...
[Fooocus] Encoding negative #1 ...
[Fooocus] Encoding negative #2 ...
[Fooocus] Image processing ...
loading new
loading new
loading new
Preparation time: 10.43 seconds
loading new
ERROR diffusion_model.output_blocks.0.1.transformer_blocks.4.ff.net.0.proj.weight CUDA out of memory. Tried to allocate 50.00 MiB (GPU 0; 22.41 GiB total capacity; 21.52 GiB already allocated; 11.69 MiB free; 22.21 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
ERROR diffusion_model.output_blocks.0.1.transformer_blocks.5.attn1.to_v.weight CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 22.41 GiB total capacity; 21.56 GiB already allocated; 11.69 MiB free; 22.21 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
ERROR diffusion_model.output_blocks.0.1.transformer_blocks.5.attn1.to_out.0.weight CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 22.41 GiB total capacity; 21.56 GiB already allocated; 11.69 MiB free; 22.21 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
Traceback (most recent call last):
File "D:\Repos\Fooocus\modules\async_worker.py", line 565, in worker
handler(task)
File "D:\Repos\Fooocus\venv\Lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "D:\Repos\Fooocus\venv\Lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "D:\Repos\Fooocus\modules\async_worker.py", line 470, in handler
comfy.model_management.load_models_gpu([pipeline.final_unet])
File "D:\Repos\Fooocus\repositories\ComfyUI-from-StabilityAI-Official\comfy\model_management.py", line 397, in load_models_gpu
cur_loaded_model = loaded_model.model_load(lowvram_model_memory)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Repos\Fooocus\repositories\ComfyUI-from-StabilityAI-Official\comfy\model_management.py", line 286, in model_load
raise e
File "D:\Repos\Fooocus\repositories\ComfyUI-from-StabilityAI-Official\comfy\model_management.py", line 282, in model_load
self.real_model = self.model.patch_model(device_to=patch_model_to) #TODO: do something with loras and offloading to CPU
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Repos\Fooocus\repositories\ComfyUI-from-StabilityAI-Official\comfy\model_patcher.py", line 161, in patch_model
temp_weight = comfy.model_management.cast_to_device(weight, device_to, torch.float32, copy=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Repos\Fooocus\repositories\ComfyUI-from-StabilityAI-Official\comfy\model_management.py", line 498, in cast_to_device
return tensor.to(device, copy=copy).to(dtype)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 50.00 MiB (GPU 0; 22.41 GiB total capacity; 21.56 GiB already allocated; 11.69 MiB free; 22.21 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
Total time: 20.41 seconds
This always happens when I click generate for the second time after opening Fooocus.
The first generation is okay, but the second one stops.
Thanks for the reply!
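Between a first and second generation, one common PyTorch housekeeping pattern is to drop your own references to the finished models and then flush the allocator's cache. This is a generic workaround people try in situations like this, not the actual Fooocus fix (which landed in a later release); the sketch below degrades gracefully when torch is absent:

```python
import gc

try:
    import torch
except ImportError:  # allow the sketch to run without PyTorch installed
    torch = None

def free_cached_vram():
    """Collect unreachable Python objects, then release cached CUDA blocks.

    Note: torch.cuda.empty_cache() only helps once your own references to
    the old model (pipelines, globals, closures) have been dropped first,
    e.g. with `model = None`.
    """
    gc.collect()
    if torch is not None and torch.cuda.is_available():
        torch.cuda.empty_cache()

free_cached_vram()
```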

@lllyasviel (Owner)

See if it's fixed in 2.1.38.

@captainpd (Author)

> See if it's fixed in 2.1.38.

Got fixed in 2.1.49
Cheers :)
