Is it possible to run with Image Prompt using 6GB memory? #700

Closed
youyegit opened this issue Oct 16, 2023 · 25 comments
Labels
enhancement New feature or request

Comments

@youyegit

When running with Image Prompt, generating the preview images always uses less than 4 GB, but VAE decoding runs out of memory.
Is it possible to do some optimization, or to use shared memory the way ComfyUI does?
Thanks for the great work, and I hope Fooocus keeps getting better.

Below is the running log (using an RTX 2060 6GB):

[Parameters] Adaptive CFG = 7
[Parameters] Sharpness = 2
[Parameters] ADM Scale = 1.5 : 0.8 : 0.3
[Parameters] CFG = 7.0
[Fooocus] Downloading control models ...
[Fooocus] Loading control models ...
[Parameters] Sampler = dpmpp_2m_sde_gpu - karras
[Parameters] Steps = 30 - 20
[Fooocus] Initializing ...
[Fooocus] Loading models ...
Refiner unloaded.
Requested to load SDXLClipModel
Requested to load GPT2LMHeadModel
Loading 2 new models
[Fooocus Model Management] Moving model(s) has taken 0.78 seconds
[Fooocus] Processing prompts ...
[Fooocus] Preparing Fooocus text #1 ...
[Prompt Expansion] New suffix: extremely clean, polished, artstation trend, human is depicted as a tree with apples or oranges, clean strokes
[Fooocus] Preparing Fooocus text #2 ...
[Prompt Expansion] New suffix: intricate, elegant, volumetric lighting, digital painting, highly detailed, artstation, sharp focus, illustration, concept art, ruan jia, steve mccurry
[Fooocus] Encoding positive #1 ...
[Fooocus Model Management] Moving model(s) has taken 0.13 seconds
[Fooocus] Encoding positive #2 ...
[Fooocus] Encoding negative #1 ...
[Fooocus] Encoding negative #2 ...
[Fooocus] Image processing ...
Requested to load CLIPVisionModelWithProjection
Loading 1 new model
[Fooocus Model Management] Moving model(s) has taken 1.01 seconds
Requested to load Resampler
Loading 1 new model
[Fooocus Model Management] Moving model(s) has taken 0.56 seconds
Requested to load To_KV
Loading 1 new model
[Fooocus Model Management] Moving model(s) has taken 0.24 seconds
Preparation time: 4.19 seconds
[Sampler] refiner_swap_method = joint
[Sampler] sigma_min = 0.02916753850877285, sigma_max = 14.614643096923828
Requested to load SDXL
Loading 1 new model
loading in lowvram mode 2750.0320749282837
[Fooocus Model Management] Moving model(s) has taken 9.90 seconds
[Sampler] Fooocus sampler is activated.
100%|██████████| 30/30 [01:15<00:00, 2.52s/it]
Warning: Ran out of memory when regular VAE decoding, retrying with tiled VAE decoding.
Traceback (most recent call last):
File "C:\AI\Z_fc_2023-10-15\Fooocus\backend\headless\fcbh\sd.py", line 205, in decode
pixel_samples[x:x+batch_number] = torch.clamp((self.first_stage_model.decode(samples).cpu().float() + 1.0) / 2.0, min=0.0, max=1.0)
File "C:\AI\Z_fc_2023-10-15\Fooocus\backend\headless\fcbh\ldm\models\autoencoder.py", line 94, in decode
dec = self.decoder(z)
File "C:\AI\Z_fc_2023-10-15\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "C:\AI\Z_fc_2023-10-15\Fooocus\backend\headless\fcbh\ldm\modules\diffusionmodules\model.py", line 713, in forward
h = self.up[i_level].upsample(h)
File "C:\AI\Z_fc_2023-10-15\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "C:\AI\Z_fc_2023-10-15\Fooocus\backend\headless\fcbh\ldm\modules\diffusionmodules\model.py", line 71, in forward
x = self.conv(x)
File "C:\AI\Z_fc_2023-10-15\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "C:\AI\Z_fc_2023-10-15\python_embeded\lib\site-packages\torch\nn\modules\conv.py", line 463, in forward
return self._conv_forward(input, self.weight, self.bias)
File "C:\AI\Z_fc_2023-10-15\python_embeded\lib\site-packages\torch\nn\modules\conv.py", line 459, in _conv_forward
return F.conv2d(input, weight, bias, self.stride,
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 968.00 MiB (GPU 0; 6.00 GiB total capacity; 4.28 GiB already allocated; 118.80 MiB free; 4.72 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "C:\AI\Z_fc_2023-10-15\Fooocus\modules\async_worker.py", line 576, in worker
handler(task)
File "C:\AI\Z_fc_2023-10-15\python_embeded\lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "C:\AI\Z_fc_2023-10-15\python_embeded\lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "C:\AI\Z_fc_2023-10-15\Fooocus\modules\async_worker.py", line 509, in handler
imgs = pipeline.process_diffusion(
File "C:\AI\Z_fc_2023-10-15\python_embeded\lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "C:\AI\Z_fc_2023-10-15\python_embeded\lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "C:\AI\Z_fc_2023-10-15\Fooocus\modules\default_pipeline.py", line 371, in process_diffusion
decoded_latent = core.decode_vae(vae=final_vae, latent_image=sampled_latent, tiled=tiled)
File "C:\AI\Z_fc_2023-10-15\python_embeded\lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "C:\AI\Z_fc_2023-10-15\python_embeded\lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "C:\AI\Z_fc_2023-10-15\Fooocus\modules\core.py", line 118, in decode_vae
return opVAEDecode.decode(samples=latent_image, vae=vae)[0]
File "C:\AI\Z_fc_2023-10-15\Fooocus\backend\headless\nodes.py", line 267, in decode
return (vae.decode(samples["samples"]), )
File "C:\AI\Z_fc_2023-10-15\Fooocus\backend\headless\fcbh\sd.py", line 208, in decode
pixel_samples = self.decode_tiled_(samples_in)
File "C:\AI\Z_fc_2023-10-15\Fooocus\backend\headless\fcbh\sd.py", line 175, in decode_tiled_
fcbh.utils.tiled_scale(samples, decode_fn, tile_x * 2, tile_y // 2, overlap, upscale_amount = 8, pbar = pbar) +
File "C:\AI\Z_fc_2023-10-15\python_embeded\lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "C:\AI\Z_fc_2023-10-15\Fooocus\backend\headless\fcbh\utils.py", line 395, in tiled_scale
ps = function(s_in).cpu()
File "C:\AI\Z_fc_2023-10-15\Fooocus\backend\headless\fcbh\sd.py", line 172, in
decode_fn = lambda a: (self.first_stage_model.decode(a.to(self.vae_dtype).to(self.device)) + 1.0).float()
File "C:\AI\Z_fc_2023-10-15\Fooocus\backend\headless\fcbh\ldm\models\autoencoder.py", line 94, in decode
dec = self.decoder(z)
File "C:\AI\Z_fc_2023-10-15\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "C:\AI\Z_fc_2023-10-15\Fooocus\backend\headless\fcbh\ldm\modules\diffusionmodules\model.py", line 713, in forward
h = self.up[i_level].upsample(h)
File "C:\AI\Z_fc_2023-10-15\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "C:\AI\Z_fc_2023-10-15\Fooocus\backend\headless\fcbh\ldm\modules\diffusionmodules\model.py", line 71, in forward
x = self.conv(x)
File "C:\AI\Z_fc_2023-10-15\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "C:\AI\Z_fc_2023-10-15\python_embeded\lib\site-packages\torch\nn\modules\conv.py", line 463, in forward
return self._conv_forward(input, self.weight, self.bias)
File "C:\AI\Z_fc_2023-10-15\python_embeded\lib\site-packages\torch\nn\modules\conv.py", line 459, in _conv_forward
return F.conv2d(input, weight, bias, self.stride,
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 256.00 MiB (GPU 0; 6.00 GiB total capacity; 4.59 GiB already allocated; 0 bytes free; 5.14 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
Total time: 96.47 seconds
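
Note: the OOM messages above suggest setting max_split_size_mb through PYTORCH_CUDA_ALLOC_CONF. A minimal sketch of one way to apply it, assuming the variable is set before torch first initializes CUDA; the 512 MB split size is an illustrative value, not something tested in this thread:

import os

# Must be set before CUDA is initialized; 512 is illustrative only.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:512"

import torch  # imported after the env var so the caching allocator picks it up

print(torch.cuda.get_device_name(0))

In the embedded Windows setup used here, the equivalent would be a "set PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:512" line in the bat file before the python_embeded call.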

@lllyasviel
Owner

Show full logs.

I know that the 3060 6GB works. Some 2060 cards do not support float16, so it may behave differently.

@youyegit
Author

Python 3.10.9 (tags/v3.10.9:1dd9be6, Dec 6 2022, 20:01:21) [MSC v.1934 64 bit (AMD64)]
Fooocus version: 2.1.675
Running on local URL: http://127.0.0.1:7860

To create a public link, set share=True in launch().
Total VRAM 6144 MB, total RAM 32489 MB
xformers version: 0.0.20
Set vram state to: NORMAL_VRAM
Device: cuda:0 NVIDIA GeForce RTX 2060 : native
VAE dtype: torch.float32
Using xformers cross attention
[Fooocus Smart Memory] Disabling smart memory, vram_inadequate = True, is_old_gpu_arch = False.
model_type EPS
adm 2560
making attention of type 'vanilla-xformers' with 512 in_channels
building MemoryEfficientAttnBlock with 512 in_channels...
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
making attention of type 'vanilla-xformers' with 512 in_channels
building MemoryEfficientAttnBlock with 512 in_channels...
missing {'cond_stage_model.clip_g.transformer.text_model.embeddings.position_ids'}
Refiner model loaded: C:\AI\Z_fc_2023-10-15\Fooocus\models\checkpoints\sd_xl_refiner_1.0_0.9vae.safetensors
model_type EPS
adm 2816
making attention of type 'vanilla-xformers' with 512 in_channels
building MemoryEfficientAttnBlock with 512 in_channels...
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
making attention of type 'vanilla-xformers' with 512 in_channels
building MemoryEfficientAttnBlock with 512 in_channels...
missing {'cond_stage_model.clip_l.logit_scale', 'cond_stage_model.clip_g.transformer.text_model.embeddings.position_ids', 'cond_stage_model.clip_l.text_projection'}
Base model loaded: C:\AI\Z_fc_2023-10-15\Fooocus\models\checkpoints\sd_xl_base_1.0_0.9vae.safetensors
LoRAs loaded: [('sd_xl_offset_example-lora_1.0.safetensors', 0.5), ('None', 0.5), ('None', 0.5), ('None', 0.5), ('None', 0.5)]
Fooocus Expansion engine loaded for cuda:0, use_fp16 = True.
Requested to load SDXLClipModel
Requested to load GPT2LMHeadModel
Loading 2 new models
[Fooocus Model Management] Moving model(s) has taken 0.79 seconds
App started successful. Use the app with http://127.0.0.1:7860/ or 127.0.0.1:7860
[Parameters] Adaptive CFG = 7
[Parameters] Sharpness = 2
[Parameters] ADM Scale = 1.5 : 0.8 : 0.3
[Parameters] CFG = 7.0
[Fooocus] Downloading control models ...
[Fooocus] Loading control models ...
[Parameters] Sampler = dpmpp_2m_sde_gpu - karras
[Parameters] Steps = 30 - 20
[Fooocus] Initializing ...
[Fooocus] Loading models ...
[Fooocus] Processing prompts ...
[Fooocus] Preparing Fooocus text #1 ...
[Prompt Expansion] New suffix: extremely detailed oil painting, unreal 5 render, rhads, Bruce Pennington, Studio Ghibli, tim hildebrandt, digital art, octane render, beautiful composition, trending on artstation, award-winning photograph, masterpiece
[Fooocus] Preparing Fooocus text #2 ...
[Prompt Expansion] New suffix: intricate, elegant, volumetric lighting, digital painting, highly detailed, artstation, sharp focus, illustration, concept art,ruan jia, steve mccurry and zdislav beksinski
[Fooocus] Encoding positive #1 ...
[Fooocus] Encoding positive #2 ...
[Fooocus] Encoding negative #1 ...
[Fooocus] Encoding negative #2 ...
[Fooocus] Image processing ...
Preparation time: 2.43 seconds
[Sampler] refiner_swap_method = joint
[Sampler] sigma_min = 0.02916753850877285, sigma_max = 14.614643096923828
Requested to load SDXL
Loading 1 new model
loading in lowvram mode 3026.0247888565063
[Fooocus Model Management] Moving model(s) has taken 10.45 seconds
[Sampler] Fooocus sampler is activated.
67%|██████▋   | 20/30 [00:47<00:21, 2.12s/it]
Requested to load SDXLRefiner
Loading 1 new model
loading in lowvram mode 851.8969345092773
[Fooocus Model Management] Moving model(s) has taken 0.64 seconds
Refiner Swapped
100%|██████████| 30/30 [01:18<00:00, 2.61s/it]
Warning: Ran out of memory when regular VAE decoding, retrying with tiled VAE decoding.
Traceback (most recent call last):
File "C:\AI\Z_fc_2023-10-15\Fooocus\backend\headless\fcbh\sd.py", line 205, in decode
pixel_samples[x:x+batch_number] = torch.clamp((self.first_stage_model.decode(samples).cpu().float() + 1.0) / 2.0, min=0.0, max=1.0)
File "C:\AI\Z_fc_2023-10-15\Fooocus\backend\headless\fcbh\ldm\models\autoencoder.py", line 94, in decode
dec = self.decoder(z)
File "C:\AI\Z_fc_2023-10-15\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "C:\AI\Z_fc_2023-10-15\Fooocus\backend\headless\fcbh\ldm\modules\diffusionmodules\model.py", line 713, in forward
h = self.up[i_level].upsample(h)
File "C:\AI\Z_fc_2023-10-15\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "C:\AI\Z_fc_2023-10-15\Fooocus\backend\headless\fcbh\ldm\modules\diffusionmodules\model.py", line 71, in forward
x = self.conv(x)
File "C:\AI\Z_fc_2023-10-15\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "C:\AI\Z_fc_2023-10-15\python_embeded\lib\site-packages\torch\nn\modules\conv.py", line 463, in forward
return self._conv_forward(input, self.weight, self.bias)
File "C:\AI\Z_fc_2023-10-15\python_embeded\lib\site-packages\torch\nn\modules\conv.py", line 459, in _conv_forward
return F.conv2d(input, weight, bias, self.stride,
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 1008.00 MiB (GPU 0; 6.00 GiB total capacity; 4.30 GiB already allocated; 0 bytes free; 5.03 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "C:\AI\Z_fc_2023-10-15\Fooocus\modules\async_worker.py", line 576, in worker
handler(task)
File "C:\AI\Z_fc_2023-10-15\python_embeded\lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "C:\AI\Z_fc_2023-10-15\python_embeded\lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "C:\AI\Z_fc_2023-10-15\Fooocus\modules\async_worker.py", line 509, in handler
imgs = pipeline.process_diffusion(
File "C:\AI\Z_fc_2023-10-15\python_embeded\lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "C:\AI\Z_fc_2023-10-15\python_embeded\lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "C:\AI\Z_fc_2023-10-15\Fooocus\modules\default_pipeline.py", line 371, in process_diffusion
decoded_latent = core.decode_vae(vae=final_vae, latent_image=sampled_latent, tiled=tiled)
File "C:\AI\Z_fc_2023-10-15\python_embeded\lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "C:\AI\Z_fc_2023-10-15\python_embeded\lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "C:\AI\Z_fc_2023-10-15\Fooocus\modules\core.py", line 118, in decode_vae
return opVAEDecode.decode(samples=latent_image, vae=vae)[0]
File "C:\AI\Z_fc_2023-10-15\Fooocus\backend\headless\nodes.py", line 267, in decode
return (vae.decode(samples["samples"]), )
File "C:\AI\Z_fc_2023-10-15\Fooocus\backend\headless\fcbh\sd.py", line 208, in decode
pixel_samples = self.decode_tiled_(samples_in)
File "C:\AI\Z_fc_2023-10-15\Fooocus\backend\headless\fcbh\sd.py", line 174, in decode_tiled_
(fcbh.utils.tiled_scale(samples, decode_fn, tile_x // 2, tile_y * 2, overlap, upscale_amount = 8, pbar = pbar) +
File "C:\AI\Z_fc_2023-10-15\python_embeded\lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "C:\AI\Z_fc_2023-10-15\Fooocus\backend\headless\fcbh\utils.py", line 395, in tiled_scale
ps = function(s_in).cpu()
File "C:\AI\Z_fc_2023-10-15\Fooocus\backend\headless\fcbh\sd.py", line 172, in
decode_fn = lambda a: (self.first_stage_model.decode(a.to(self.vae_dtype).to(self.device)) + 1.0).float()
File "C:\AI\Z_fc_2023-10-15\Fooocus\backend\headless\fcbh\ldm\models\autoencoder.py", line 94, in decode
dec = self.decoder(z)
File "C:\AI\Z_fc_2023-10-15\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "C:\AI\Z_fc_2023-10-15\Fooocus\backend\headless\fcbh\ldm\modules\diffusionmodules\model.py", line 713, in forward
h = self.up[i_level].upsample(h)
File "C:\AI\Z_fc_2023-10-15\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "C:\AI\Z_fc_2023-10-15\Fooocus\backend\headless\fcbh\ldm\modules\diffusionmodules\model.py", line 71, in forward
x = self.conv(x)
File "C:\AI\Z_fc_2023-10-15\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "C:\AI\Z_fc_2023-10-15\python_embeded\lib\site-packages\torch\nn\modules\conv.py", line 463, in forward
return self._conv_forward(input, self.weight, self.bias)
File "C:\AI\Z_fc_2023-10-15\python_embeded\lib\site-packages\torch\nn\modules\conv.py", line 459, in _conv_forward
return F.conv2d(input, weight, bias, self.stride,
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 224.00 MiB (GPU 0; 6.00 GiB total capacity; 4.58 GiB already allocated; 0 bytes free; 5.03 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
Total time: 94.94 seconds
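
Note: the "retrying with tiled VAE decoding" fallback seen in these tracebacks splits the latent into overlapping tiles, decodes each tile separately, and stitches the pixels back together, so peak VRAM scales with the tile size instead of the full image. A minimal sketch of the idea, assuming a decode_fn that maps a latent tile to pixels at 8x resolution; unlike the real fcbh/ComfyUI implementation, this version simply overwrites overlaps instead of feather-blending them:

import torch

def tiled_decode(latent, decode_fn, tile=64, overlap=16, scale=8):
    b, _, h, w = latent.shape
    out = torch.zeros(b, 3, h * scale, w * scale)
    step = tile - overlap
    for y in range(0, h, step):
        for x in range(0, w, step):
            # Decode one tile on the GPU, then move it off immediately.
            px = decode_fn(latent[:, :, y:y + tile, x:x + tile]).cpu()
            out[:, :, y * scale:y * scale + px.shape[2],
                      x * scale:x * scale + px.shape[3]] = px
    return out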

@youyegit
Author

youyegit commented Oct 16, 2023

[screenshot attached]
Other settings are all default.
Before starting Fooocus, the GPU memory already in use is 0.3 GB.
Other hardware/software details: Windows 10 x64, 32 GB RAM, RTX 2060 6GB.

@lllyasviel
Owner

Does text-to-image without an image prompt work?

  1. If you are not using CUDA 12, try downloading the latest release of Fooocus again (you can ignore this if you already downloaded it).
  2. Try adding '--use-split-cross-attention' to the bat file, like .\python_embeded\python.exe -s Fooocus\entry_with_update.py --use-split-cross-attention
  3. If that still does not work, try --use-quad-cross-attention (a rough sketch of what these attention flags do is shown after this list).
  4. If it still does not work, you will need to wait for us to solve this.
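
Note: both attention flags trade speed for memory by never materializing the full query-by-key attention matrix at once. A rough sketch of the split idea, with an illustrative fixed chunk size (real implementations pick the chunk size from the VRAM that is actually free):

import torch

def chunked_attention(q, k, v, chunk=1024):
    # q: (batch, q_len, dim); k, v: (batch, k_len, dim)
    scale = q.shape[-1] ** -0.5
    out = torch.empty_like(q)
    for i in range(0, q.shape[1], chunk):
        # Only a (chunk x k_len) score matrix exists at any one time.
        attn = torch.softmax(q[:, i:i + chunk] @ k.transpose(-2, -1) * scale, dim=-1)
        out[:, i:i + chunk] = attn @ v
    return out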

@youyegit
Author

Text-to-image without an image prompt works well.
I will try the methods and report back.

@lllyasviel
Owner

I also added previous_old_xformers_env.7z if you want to try it:

https://github.com/lllyasviel/Fooocus/releases/tag/release

It can patch your env to CUDA 11.8 and PyTorch 2.0 with an old version of xformers. I do not think users really need it, but if it works then I will change my mind.

@youyegit
Author

Adding --use-split-cross-attention or --use-quad-cross-attention did not work.
I will test previous_old_xformers_env.7z.

@youyegit
Author

I have tested previous_old_xformers_env.7z and it produces the same result.
But I may have found the cause: the problem comes from using PyraCanny or CPDS, because it works when using Image Prompt only.

@youyegit
Author

Besides, I have tried the same folders with another GPU, an RTX 3080 12GB; everything works well in the same env with the same Fooocus.

@lllyasviel added the enhancement (New feature or request) label Oct 16, 2023
@deadpipe

deadpipe commented Oct 17, 2023

I have the same problem on my GTX 1060 6GB graphics card. When I use the Image prompt with PyraCanny and the image processing reaches 100%, I get an out-of-memory error during VAE decoding, and the image doesn't save.

My PC config is:
i7 7700K
GTX 1060 6GB
32 GB System Memory

Graphics driver version is 531.18, as mentioned on the main repo page
CUDA version is 12.1

Error:

[Fooocus Model Management] Moving model(s) has taken 7.65 seconds
[Sampler] Fooocus sampler is activated.
67%|██████▋   | 40/60 [05:50<02:23, 7.19s/it]
Requested to load SDXLRefiner
Loading 1 new model
loading in lowvram mode 1027.2795219421387
[Fooocus Model Management] Moving model(s) has taken 0.72 seconds
Refiner Swapped
100%|██████████████████████████████████████████████████████████████████████████████████| 60/60 [09:05<00:00, 9.09s/it]
Warning: Ran out of memory when regular VAE decoding, retrying with tiled VAE decoding.
Traceback (most recent call last):
File "G:\Games\Foocus\Fooocus\backend\headless\fcbh\sd.py", line 205, in decode
pixel_samples[x:x+batch_number] = torch.clamp((self.first_stage_model.decode(samples).cpu().float() + 1.0) / 2.0, min=0.0, max=1.0)
File "G:\Games\Foocus\Fooocus\backend\headless\fcbh\ldm\models\autoencoder.py", line 94, in decode
dec = self.decoder(z)
File "G:\Games\Foocus\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "G:\Games\Foocus\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "G:\Games\Foocus\Fooocus\backend\headless\fcbh\ldm\modules\diffusionmodules\model.py", line 635, in forward
h = self.up[i_level].upsample(h)
File "G:\Games\Foocus\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "G:\Games\Foocus\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "G:\Games\Foocus\Fooocus\backend\headless\fcbh\ldm\modules\diffusionmodules\model.py", line 71, in forward
x = self.conv(x)
File "G:\Games\Foocus\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "G:\Games\Foocus\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "G:\Games\Foocus\python_embeded\lib\site-packages\torch\nn\modules\conv.py", line 460, in forward
return self._conv_forward(input, self.weight, self.bias)
File "G:\Games\Foocus\python_embeded\lib\site-packages\torch\nn\modules\conv.py", line 456, in _conv_forward
return F.conv2d(input, weight, bias, self.stride,
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 968.00 MiB. GPU 0 has a total capacty of 6.00 GiB of which 57.00 MiB is free. Of the allocated memory 4.25 GiB is allocated by PyTorch, and 773.37 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "G:\Games\Foocus\Fooocus\backend\headless\fcbh\ldm\modules\diffusionmodules\model.py", line 59, in forward
x = torch.nn.functional.interpolate(x, scale_factor=2.0, mode="nearest")
File "G:\Games\Foocus\python_embeded\lib\site-packages\torch\nn\functional.py", line 3983, in interpolate
return torch._C._nn.upsample_nearest2d(input, output_size, scale_factors)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 256.00 MiB. GPU 0 has a total capacty of 6.00 GiB of which 0 bytes is free. Of the allocated memory 4.32 GiB is allocated by PyTorch, and 837.25 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "G:\Games\Foocus\Fooocus\modules\async_worker.py", line 584, in worker
handler(task)
File "G:\Games\Foocus\python_embeded\lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "G:\Games\Foocus\python_embeded\lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "G:\Games\Foocus\Fooocus\modules\async_worker.py", line 517, in handler
imgs = pipeline.process_diffusion(
File "G:\Games\Foocus\python_embeded\lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "G:\Games\Foocus\python_embeded\lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "G:\Games\Foocus\Fooocus\modules\default_pipeline.py", line 371, in process_diffusion
decoded_latent = core.decode_vae(vae=final_vae, latent_image=sampled_latent, tiled=tiled)
File "G:\Games\Foocus\python_embeded\lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "G:\Games\Foocus\python_embeded\lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "G:\Games\Foocus\Fooocus\modules\core.py", line 118, in decode_vae
return opVAEDecode.decode(samples=latent_image, vae=vae)[0]
File "G:\Games\Foocus\Fooocus\backend\headless\nodes.py", line 267, in decode
return (vae.decode(samples["samples"]), )
File "G:\Games\Foocus\Fooocus\backend\headless\fcbh\sd.py", line 208, in decode
pixel_samples = self.decode_tiled_(samples_in)
File "G:\Games\Foocus\Fooocus\backend\headless\fcbh\sd.py", line 174, in decode_tiled_
(fcbh.utils.tiled_scale(samples, decode_fn, tile_x // 2, tile_y * 2, overlap, upscale_amount = 8, pbar = pbar) +
File "G:\Games\Foocus\python_embeded\lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "G:\Games\Foocus\Fooocus\backend\headless\fcbh\utils.py", line 395, in tiled_scale
ps = function(s_in).cpu()
File "G:\Games\Foocus\Fooocus\backend\headless\fcbh\sd.py", line 172, in
decode_fn = lambda a: (self.first_stage_model.decode(a.to(self.vae_dtype).to(self.device)) + 1.0).float()
File "G:\Games\Foocus\Fooocus\backend\headless\fcbh\ldm\models\autoencoder.py", line 94, in decode
dec = self.decoder(z)
File "G:\Games\Foocus\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "G:\Games\Foocus\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "G:\Games\Foocus\Fooocus\backend\headless\fcbh\ldm\modules\diffusionmodules\model.py", line 635, in forward
h = self.up[i_level].upsample(h)
File "G:\Games\Foocus\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "G:\Games\Foocus\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "G:\Games\Foocus\Fooocus\backend\headless\fcbh\ldm\modules\diffusionmodules\model.py", line 62, in forward
out = torch.empty((b, c, h*2, w*2), dtype=x.dtype, layout=x.layout, device=x.device)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 256.00 MiB. GPU 0 has a total capacty of 6.00 GiB of which 0 bytes is free. Of the allocated memory 4.32 GiB is allocated by PyTorch, and 837.25 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
Total time: 578.97 seconds
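
Note: the nested exceptions here show the decoder's nearest-neighbor upsample running out of memory even inside the tiled fallback, with the code at model.py line 62 retrying by allocating the output buffer and filling it piecewise. A hedged reconstruction of that pattern; slicing over channels is an assumption for illustration, not taken from the actual source:

import torch
import torch.nn.functional as F

def upsample_nearest_2x(x):
    try:
        return F.interpolate(x, scale_factor=2.0, mode="nearest")
    except torch.cuda.OutOfMemoryError:
        # Fallback: allocate the output, then interpolate a few channels at a time.
        b, c, h, w = x.shape
        out = torch.empty((b, c, h * 2, w * 2), dtype=x.dtype, layout=x.layout, device=x.device)
        for i in range(0, c, 16):
            out[:, i:i + 16] = F.interpolate(x[:, i:i + 16], scale_factor=2.0, mode="nearest")
        return out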

@youyegit
Author

I have tested SDXL in ComfyUI with the RTX 2060 6GB:
when I use "sai_xl_canny_128lora.safetensors" or "sai_xl_depth_128lora.safetensors", it shows "Ran out of memory when regular VAE decoding, retrying with tiled VAE decoding.",
but when I use "diffusers_xl_canny_small.safetensors" or "diffusers_xl_depth_small.safetensors", it works well.

When I use clip_vision, it works well too.

So the problem may come from ComfyUI's nodes.

@lllyasviel
Owner

In Fooocus, you should also see "Ran out of memory when regular VAE decoding, retrying with tiled VAE decoding" and then still get the image?

@youyegit
Author

In Fooocus I got the same "Warning: Ran out of memory when regular VAE decoding, retrying with tiled VAE decoding" and then I did not get any image.

@lllyasviel
Owner

Any log? This is different from your previous report, I think.

@lllyasviel
Owner

Ah, I see the warning in the log. Will take a look.

@youyegit
Author

This is the log of ComfyUI when using "sai_xl_canny_128lora.safetensors":

Now the UI folder is E:\Z_comfyui_2023-10-17
This integrated installation package is packaged by @BEIMINGYOUYU
Python 3.10.6
pip 23.1.2 from E:\Z_comfyui_2023-10-17\python_embeded\lib\site-packages\pip (python 3.10)
** ComfyUI start up time: 2023-10-18 13:03:59.815647

Prestartup times for custom nodes:
0.0 seconds: E:\Z_comfyui_2023-10-17\ComfyUI\custom_nodes\ComfyUI-Manager

Total VRAM 6144 MB, total RAM 32489 MB
xformers version: 0.0.20
Forcing FP16.
Set vram state to: NORMAL_VRAM
Device: cuda:0 NVIDIA GeForce RTX 2060 : cudaMallocAsync
VAE dtype: torch.float32
disabling upcasting of attention
Using xformers cross attention
Setting temp directory to: E:\Z_comfyui_2023-10-17\temp\temp
Adding extra search path checkpoints E:/AI/SD/sd-webui/stable-diffusion-webui/models/Stable-diffusion
Adding extra search path configs E:/AI/SD/sd-webui/stable-diffusion-webui/models/Stable-diffusion
Adding extra search path vae E:/AI/SD/sd-webui/stable-diffusion-webui/models/VAE
Adding extra search path loras E:/AI/SD/sd-webui/stable-diffusion-webui/models/Lora
Adding extra search path upscale_models E:/AI/SD/sd-webui/stable-diffusion-webui/models/ESRGAN
Adding extra search path upscale_models E:/AI/SD/sd-webui/stable-diffusion-webui/models/SwinIR
Adding extra search path embeddings E:/AI/SD/sd-webui/stable-diffusion-webui/embeddings
Adding extra search path hypernetworks E:/AI/SD/sd-webui/stable-diffusion-webui/models/hypernetworks
Adding extra search path controlnet E:/AI/SD/sd-webui/stable-diffusion-webui/models/ControlNet
Adding extra search path checkpoints ../extra_model/models/checkpoints
Adding extra search path configs ../extra_model/models/configs
Adding extra search path vae ../extra_model/models/vae
Adding extra search path loras ../extra_model/models/loras
Adding extra search path clip_vision ../extra_model/models/clip_vision
Adding extra search path controlnet ../extra_model/models/controlnet
Adding extra search path custom_nodes ../extra_model/custom_nodes

Loading: ComfyUI-Manager (V0.30.3)

ComfyUI Revision: 1479 [f8032cdf]

Registered sys.path: ['E:\Z_comfyui_2023-10-17\ComfyUI\custom_nodes\comfyui_controlnet_aux\src\init.py', 'E:\Z_comfyui_2023-10-17\ComfyUI\custom_nodes\comfyui_controlnet_aux\src\custom_pycocotools', 'E:\Z_comfyui_2023-10-17\ComfyUI\custom_nodes\comfyui_controlnet_aux\src\custom_oneformer', 'E:\Z_comfyui_2023-10-17\ComfyUI\custom_nodes\comfyui_controlnet_aux\src\custom_mmpkg', 'E:\Z_comfyui_2023-10-17\ComfyUI\custom_nodes\comfyui_controlnet_aux\src\custom_midas_repo', 'E:\Z_comfyui_2023-10-17\ComfyUI\custom_nodes\comfyui_controlnet_aux\src\custom_detectron2', 'E:\Z_comfyui_2023-10-17\ComfyUI\custom_nodes\comfyui_controlnet_aux\src\controlnet_aux', 'E:\Z_comfyui_2023-10-17\ComfyUI\custom_nodes\comfyui_controlnet_aux\src', 'E:\Z_comfyui_2023-10-17\ComfyUI\comfy', 'E:\Z_comfyui_2023-10-17\python_embeded\lib\site-packages\git\ext\gitdb', 'E:\Z_comfyui_2023-10-17\ComfyUI', 'E:\Z_comfyui_2023-10-17\python_embeded\python310.zip', 'E:\Z_comfyui_2023-10-17\python_embeded\DLLs', 'E:\Z_comfyui_2023-10-17\python_embeded\lib', 'E:\Z_comfyui_2023-10-17\python_embeded', 'E:\Z_comfyui_2023-10-17\python_embeded\lib\site-packages', 'E:\Z_comfyui_2023-10-17\python_embeded\lib\site-packages\win32', 'E:\Z_comfyui_2023-10-17\python_embeded\lib\site-packages\win32\lib', 'E:\Z_comfyui_2023-10-17\python_embeded\lib\site-packages\Pythonwin', '../..']
Registered sys.path: ['E:\Z_comfyui_2023-10-17\ComfyUI\custom_nodes\comfyui_controlnet_aux\src\init.py', 'E:\Z_comfyui_2023-10-17\ComfyUI\custom_nodes\comfyui_controlnet_aux\src\custom_pycocotools', 'E:\Z_comfyui_2023-10-17\ComfyUI\custom_nodes\comfyui_controlnet_aux\src\custom_oneformer', 'E:\Z_comfyui_2023-10-17\ComfyUI\custom_nodes\comfyui_controlnet_aux\src\custom_mmpkg', 'E:\Z_comfyui_2023-10-17\ComfyUI\custom_nodes\comfyui_controlnet_aux\src\custom_midas_repo', 'E:\Z_comfyui_2023-10-17\ComfyUI\custom_nodes\comfyui_controlnet_aux\src\custom_detectron2', 'E:\Z_comfyui_2023-10-17\ComfyUI\custom_nodes\comfyui_controlnet_aux\src\controlnet_aux', 'E:\Z_comfyui_2023-10-17\ComfyUI\custom_nodes\comfyui_controlnet_aux\src', 'E:\Z_comfyui_2023-10-17\ComfyUI\custom_nodes\comfyui_controlnet_aux\src\init.py', 'E:\Z_comfyui_2023-10-17\ComfyUI\custom_nodes\comfyui_controlnet_aux\src\custom_pycocotools', 'E:\Z_comfyui_2023-10-17\ComfyUI\custom_nodes\comfyui_controlnet_aux\src\custom_oneformer', 'E:\Z_comfyui_2023-10-17\ComfyUI\custom_nodes\comfyui_controlnet_aux\src\custom_mmpkg', 'E:\Z_comfyui_2023-10-17\ComfyUI\custom_nodes\comfyui_controlnet_aux\src\custom_midas_repo', 'E:\Z_comfyui_2023-10-17\ComfyUI\custom_nodes\comfyui_controlnet_aux\src\custom_detectron2', 'E:\Z_comfyui_2023-10-17\ComfyUI\custom_nodes\comfyui_controlnet_aux\src\controlnet_aux', 'E:\Z_comfyui_2023-10-17\ComfyUI\custom_nodes\comfyui_controlnet_aux\src', 'E:\Z_comfyui_2023-10-17\ComfyUI\comfy', 'E:\Z_comfyui_2023-10-17\python_embeded\lib\site-packages\git\ext\gitdb', 'E:\Z_comfyui_2023-10-17\ComfyUI', 'E:\Z_comfyui_2023-10-17\python_embeded\python310.zip', 'E:\Z_comfyui_2023-10-17\python_embeded\DLLs', 'E:\Z_comfyui_2023-10-17\python_embeded\lib', 'E:\Z_comfyui_2023-10-17\python_embeded', 'E:\Z_comfyui_2023-10-17\python_embeded\lib\site-packages', 'E:\Z_comfyui_2023-10-17\python_embeded\lib\site-packages\win32', 'E:\Z_comfyui_2023-10-17\python_embeded\lib\site-packages\win32\lib', 'E:\Z_comfyui_2023-10-17\python_embeded\lib\site-packages\Pythonwin', '../..', 'E:\Z_comfyui_2023-10-17\ComfyUI\custom_nodes\tdxh_node_comfyui']
Total VRAM 6144 MB, total RAM 32489 MB
xformers version: 0.0.20
Forcing FP16.
Set vram state to: NORMAL_VRAM
Device: cuda:0 NVIDIA GeForce RTX 2060 : cudaMallocAsync
VAE dtype: torch.float32
WARNING: Ignoring invalid distribution -pencv-python (e:\z_comfyui_2023-10-17\python_embeded\lib\site-packages)
WAS Node Suite: OpenCV Python FFMPEG support is enabled
WAS Node Suite: ffmpeg_bin_path is set to: \ffmpeg-6.0-essentials_build\bin
WARNING: Ignoring invalid distribution -pencv-python (e:\z_comfyui_2023-10-17\python_embeded\lib\site-packages)
WAS Node Suite: Finished. Loaded 193 nodes successfully.

    "Believe you deserve it and the universe will serve it." - Unknown

Import times for custom nodes:
0.0 seconds: E:\Z_comfyui_2023-10-17\ComfyUI\custom_nodes\reference_only.py
0.0 seconds: E:\Z_comfyui_2023-10-17\ComfyUI\custom_nodes\ControlNet-LLLite-ComfyUI
0.0 seconds: E:\Z_comfyui_2023-10-17\ComfyUI\custom_nodes\sdxl_prompt_styler
0.0 seconds: E:\Z_comfyui_2023-10-17\ComfyUI\custom_nodes\ComfyUI-Custom-Scripts
0.0 seconds: E:\Z_comfyui_2023-10-17\ComfyUI\custom_nodes\Derfuu_ComfyUI_ModdedNodes
0.0 seconds: E:\Z_comfyui_2023-10-17\ComfyUI\custom_nodes\tdxh_node_comfyui
0.1 seconds: E:\Z_comfyui_2023-10-17\ComfyUI\custom_nodes\comfyui_controlnet_aux
0.4 seconds: E:\Z_comfyui_2023-10-17\ComfyUI\custom_nodes\ComfyUI-Manager
1.8 seconds: E:\Z_comfyui_2023-10-17\ComfyUI\custom_nodes\was_node_suite_comfyui

Setting output directory to: E:\Z_comfyui_2023-10-17\output
Setting input directory to: E:\Z_comfyui_2023-10-17\input
Starting server

To see the GUI go to: http://0.0.0.0:8188
FETCH DATA from: E:\Z_comfyui_2023-10-17\ComfyUI\custom_nodes\ComfyUI-Manager\extension-node-map.json
got prompt
style: Default (Slightly Cinematic)
text_positive: school gate
text_negative:
text_positive_styled: cinematic still school gate . emotional, harmonious, vignette, highly detailed, high budget, bokeh, cinemascope, moody, epic, gorgeous, film grain, grainy
text_negative_styled: anime, cartoon, graphic, text, painting, crayon, graphite, abstract, glitch, deformed, mutated, ugly, disfigured
model_type EPS
adm 2560
making attention of type 'vanilla-xformers' with 512 in_channels
building MemoryEfficientAttnBlock with 512 in_channels...
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
making attention of type 'vanilla-xformers' with 512 in_channels
building MemoryEfficientAttnBlock with 512 in_channels...
missing {'cond_stage_model.clip_g.transformer.text_model.embeddings.position_ids'}
loading new
making attention of type 'vanilla-xformers' with 512 in_channels
building MemoryEfficientAttnBlock with 512 in_channels...
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
making attention of type 'vanilla-xformers' with 512 in_channels
building MemoryEfficientAttnBlock with 512 in_channels...
Reshaping decoder.mid.attn_1.k.weight for SD format
Reshaping decoder.mid.attn_1.proj_out.weight for SD format
Reshaping decoder.mid.attn_1.q.weight for SD format
Reshaping decoder.mid.attn_1.v.weight for SD format
Reshaping encoder.mid.attn_1.k.weight for SD format
Reshaping encoder.mid.attn_1.proj_out.weight for SD format
Reshaping encoder.mid.attn_1.q.weight for SD format
Reshaping encoder.mid.attn_1.v.weight for SD format
model_type EPS
adm 2816
making attention of type 'vanilla-xformers' with 512 in_channels
building MemoryEfficientAttnBlock with 512 in_channels...
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
making attention of type 'vanilla-xformers' with 512 in_channels
building MemoryEfficientAttnBlock with 512 in_channels...
missing {'cond_stage_model.clip_l.logit_scale', 'cond_stage_model.clip_l.text_projection', 'cond_stage_model.clip_g.transformer.text_model.embeddings.position_ids'}
loading new
loading new
loading in lowvram mode 3017.959864616394
100%|██████████| 25/25 [01:50<00:00, 4.42s/it]
loading new
loading in lowvram mode 880.3931360244751
100%|██████████| 5/5 [00:23<00:00, 4.66s/it]
Warning: Ran out of memory when regular VAE decoding, retrying with tiled VAE decoding.
!!! Exception during processing !!!
Traceback (most recent call last):
File "E:\Z_comfyui_2023-10-17\ComfyUI\comfy\sd.py", line 203, in decode
pixel_samples[x:x+batch_number] = torch.clamp((self.first_stage_model.decode(samples) + 1.0) / 2.0, min=0.0, max=1.0).cpu().float()
File "E:\Z_comfyui_2023-10-17\ComfyUI\comfy\ldm\models\autoencoder.py", line 94, in decode
dec = self.decoder(z)
File "E:\Z_comfyui_2023-10-17\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "E:\Z_comfyui_2023-10-17\ComfyUI\comfy\ldm\modules\diffusionmodules\model.py", line 726, in forward
h = self.up[i_level].upsample(h)
File "E:\Z_comfyui_2023-10-17\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "E:\Z_comfyui_2023-10-17\ComfyUI\comfy\ldm\modules\diffusionmodules\model.py", line 72, in forward
x = self.conv(x)
File "E:\Z_comfyui_2023-10-17\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "E:\Z_comfyui_2023-10-17\python_embeded\lib\site-packages\torch\nn\modules\conv.py", line 463, in forward
return self._conv_forward(input, self.weight, self.bias)
File "E:\Z_comfyui_2023-10-17\python_embeded\lib\site-packages\torch\nn\modules\conv.py", line 459, in _conv_forward
return F.conv2d(input, weight, bias, self.stride,
torch.cuda.OutOfMemoryError: Allocation on device 0 would exceed allowed memory. (out of memory)
Currently allocated : 4.27 GiB
Requested : 1012.00 MiB
Device limit : 6.00 GiB
Free (according to CUDA): 0 bytes
PyTorch limit (set by user-supplied memory fraction): 17179869184.00 GiB

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "E:\Z_comfyui_2023-10-17\ComfyUI\execution.py", line 152, in recursive_execute
output_data, output_ui = get_output_data(obj, input_data_all)
File "E:\Z_comfyui_2023-10-17\ComfyUI\execution.py", line 82, in get_output_data
return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
File "E:\Z_comfyui_2023-10-17\ComfyUI\execution.py", line 75, in map_node_over_list
results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
File "E:\Z_comfyui_2023-10-17\ComfyUI\nodes.py", line 267, in decode
return (vae.decode(samples["samples"]), )
File "E:\Z_comfyui_2023-10-17\ComfyUI\comfy\sd.py", line 206, in decode
pixel_samples = self.decode_tiled_(samples_in)
File "E:\Z_comfyui_2023-10-17\ComfyUI\comfy\sd.py", line 173, in decode_tiled_
comfy.utils.tiled_scale(samples, decode_fn, tile_x * 2, tile_y // 2, overlap, upscale_amount = 8, pbar = pbar) +
File "E:\Z_comfyui_2023-10-17\python_embeded\lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "E:\Z_comfyui_2023-10-17\ComfyUI\comfy\utils.py", line 395, in tiled_scale
ps = function(s_in).cpu()
File "E:\Z_comfyui_2023-10-17\ComfyUI\comfy\sd.py", line 170, in
decode_fn = lambda a: (self.first_stage_model.decode(a.to(self.vae_dtype).to(self.device)) + 1.0).float()
File "E:\Z_comfyui_2023-10-17\ComfyUI\comfy\ldm\models\autoencoder.py", line 94, in decode
dec = self.decoder(z)
File "E:\Z_comfyui_2023-10-17\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "E:\Z_comfyui_2023-10-17\ComfyUI\comfy\ldm\modules\diffusionmodules\model.py", line 726, in forward
h = self.up[i_level].upsample(h)
File "E:\Z_comfyui_2023-10-17\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "E:\Z_comfyui_2023-10-17\ComfyUI\comfy\ldm\modules\diffusionmodules\model.py", line 72, in forward
x = self.conv(x)
File "E:\Z_comfyui_2023-10-17\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "E:\Z_comfyui_2023-10-17\python_embeded\lib\site-packages\torch\nn\modules\conv.py", line 463, in forward
return self._conv_forward(input, self.weight, self.bias)
File "E:\Z_comfyui_2023-10-17\python_embeded\lib\site-packages\torch\nn\modules\conv.py", line 459, in _conv_forward
return F.conv2d(input, weight, bias, self.stride,
torch.cuda.OutOfMemoryError: Allocation on device 0 would exceed allowed memory. (out of memory)
Currently allocated : 4.58 GiB
Requested : 256.00 MiB
Device limit : 6.00 GiB
Free (according to CUDA): 0 bytes
PyTorch limit (set by user-supplied memory fraction): 17179869184.00 GiB

Prompt executed in 203.19 seconds

@youyegit
Author

And in the ComfyUI user interface, it shows this error:

Error occurred when executing VAEDecode:

Allocation on device 0 would exceed allowed memory. (out of memory)
Currently allocated : 4.58 GiB
Requested : 256.00 MiB
Device limit : 6.00 GiB
Free (according to CUDA): 0 bytes
PyTorch limit (set by user-supplied memory fraction): 17179869184.00 GiB

File "E:\Z_comfyui_2023-10-17\ComfyUI\execution.py", line 152, in recursive_execute
output_data, output_ui = get_output_data(obj, input_data_all)
File "E:\Z_comfyui_2023-10-17\ComfyUI\execution.py", line 82, in get_output_data
return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
File "E:\Z_comfyui_2023-10-17\ComfyUI\execution.py", line 75, in map_node_over_list
results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
File "E:\Z_comfyui_2023-10-17\ComfyUI\nodes.py", line 267, in decode
return (vae.decode(samples["samples"]), )
File "E:\Z_comfyui_2023-10-17\ComfyUI\comfy\sd.py", line 206, in decode
pixel_samples = self.decode_tiled_(samples_in)
File "E:\Z_comfyui_2023-10-17\ComfyUI\comfy\sd.py", line 173, in decode_tiled_
comfy.utils.tiled_scale(samples, decode_fn, tile_x * 2, tile_y // 2, overlap, upscale_amount = 8, pbar = pbar) +
File "E:\Z_comfyui_2023-10-17\python_embeded\lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "E:\Z_comfyui_2023-10-17\ComfyUI\comfy\utils.py", line 395, in tiled_scale
ps = function(s_in).cpu()
File "E:\Z_comfyui_2023-10-17\ComfyUI\comfy\sd.py", line 170, in
decode_fn = lambda a: (self.first_stage_model.decode(a.to(self.vae_dtype).to(self.device)) + 1.0).float()
File "E:\Z_comfyui_2023-10-17\ComfyUI\comfy\ldm\models\autoencoder.py", line 94, in decode
dec = self.decoder(z)
File "E:\Z_comfyui_2023-10-17\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "E:\Z_comfyui_2023-10-17\ComfyUI\comfy\ldm\modules\diffusionmodules\model.py", line 726, in forward
h = self.up[i_level].upsample(h)
File "E:\Z_comfyui_2023-10-17\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "E:\Z_comfyui_2023-10-17\ComfyUI\comfy\ldm\modules\diffusionmodules\model.py", line 72, in forward
x = self.conv(x)
File "E:\Z_comfyui_2023-10-17\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "E:\Z_comfyui_2023-10-17\python_embeded\lib\site-packages\torch\nn\modules\conv.py", line 463, in forward
return self._conv_forward(input, self.weight, self.bias)
File "E:\Z_comfyui_2023-10-17\python_embeded\lib\site-packages\torch\nn\modules\conv.py", line 459, in _conv_forward
return F.conv2d(input, weight, bias, self.stride,
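
Note: the "PyTorch limit (set by user-supplied memory fraction)" line in this error refers to torch.cuda.set_per_process_memory_fraction; when no fraction has been set, the limit prints as an effectively infinite value, as seen above. For illustration only; capping the fraction does not free any memory, it just makes allocations fail earlier:

import torch

torch.cuda.set_per_process_memory_fraction(0.9, device=0)  # cap this process at 90% of VRAM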

@lllyasviel
Owner

So do you get an image from ComfyUI?

@youyegit
Author

youyegit commented Oct 18, 2023

No image in ComfyUI.

@lllyasviel
Owner

lllyasviel commented Oct 18, 2023

Interesting.
So if Fooocus fixes this, then Fooocus will be the only way to use Control-LoRA on a 2060?
Wait, I think someone reported that WebUI Control-LoRA worked on a 2060, so perhaps Fooocus is the second.

@lllyasviel
Owner

working on it

@lllyasviel
Owner

Try 2.1.695 and see if it's fixed.

@youyegit
Author

It all works well now!
Godlike!

@lllyasviel
Owner

Hi @youyegit, Fooocus 2.1.700 uses another method. Feel free to try again and let us know if it works.

@youyegit
Author

I tried Fooocus 2.1.703 and it works well.
