
CUDA out of memory #1114

Closed
Darkweasam opened this issue Dec 1, 2023 · 4 comments
Labels: bug (Something isn't working), help wanted (Extra attention is needed)

Comments

@Darkweasam

Describe the problem
I get a crash whenever I try to generate something with 6GB GPU and 16GB RAM. Why is that?
Nvidia GeForce GTX 1060 6GB

Full Console Log

X:\AI\Fooocus_win64_2-1-791>.\python_embeded\python.exe -s Fooocus\entry_with_update.py --preset realistic
Already up-to-date
Update succeeded.
[System ARGV] ['Fooocus\entry_with_update.py', '--preset', 'realistic']
Loaded preset: X:\AI\Fooocus_win64_2-1-791\Fooocus\presets\realistic.json
Python 3.10.9 (tags/v3.10.9:1dd9be6, Dec 6 2022, 20:01:21) [MSC v.1934 64 bit (AMD64)]
Fooocus version: 2.1.824
Running on local URL: http://127.0.0.1:7865

To create a public link, set share=True in launch().
Total VRAM 6144 MB, total RAM 16326 MB
Set vram state to: NORMAL_VRAM
Disabling smart memory management
Device: cuda:0 NVIDIA GeForce GTX 1060 6GB : native
VAE dtype: torch.float32
Using pytorch cross attention
Refiner unloaded.
model_type EPS
adm 2816
Using pytorch attention in VAE
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
Using pytorch attention in VAE
extra keys {'cond_stage_model.clip_l.logit_scale', 'cond_stage_model.clip_g.transformer.text_model.embeddings.position_ids', 'cond_stage_model.clip_l.text_projection'}
Base model loaded: X:\AI\Fooocus_win64_2-1-791\Fooocus\models\checkpoints\realisticStockPhoto_v10.safetensors
Request to load LoRAs [['SDXL_FILM_PHOTOGRAPHY_STYLE_BetaV0.4.safetensors', 0.25], ['None', 1.0], ['None', 1.0], ['None', 1.0], ['None', 1.0]] for model [X:\AI\Fooocus_win64_2-1-791\Fooocus\models\checkpoints\realisticStockPhoto_v10.safetensors].
Loaded LoRA [X:\AI\Fooocus_win64_2-1-791\Fooocus\models\loras\SDXL_FILM_PHOTOGRAPHY_STYLE_BetaV0.4.safetensors] for UNet [X:\AI\Fooocus_win64_2-1-791\Fooocus\models\checkpoints\realisticStockPhoto_v10.safetensors] with 788 keys at weight 0.25.
Loaded LoRA [X:\AI\Fooocus_win64_2-1-791\Fooocus\models\loras\SDXL_FILM_PHOTOGRAPHY_STYLE_BetaV0.4.safetensors] for CLIP [X:\AI\Fooocus_win64_2-1-791\Fooocus\models\checkpoints\realisticStockPhoto_v10.safetensors] with 264 keys at weight 0.25.
Fooocus V2 Expansion: Vocab with 642 words.
Fooocus Expansion engine loaded for cuda:0, use_fp16 = False.
Requested to load SDXLClipModel
Requested to load GPT2LMHeadModel
Loading 2 new models
[Fooocus Model Management] Moving model(s) has taken 1.40 seconds
App started successful. Use the app with http://127.0.0.1:7865/ or 127.0.0.1:7865
[Parameters] Adaptive CFG = 7
[Parameters] Sharpness = 2
[Parameters] ADM Scale = 1.5 : 0.8 : 0.3
[Parameters] CFG = 3.0
[Parameters] Seed = 3598152127187655813
[Fooocus] Downloading control models ...
[Fooocus] Loading control models ...
extra keys clip vision: ['vision_model.embeddings.position_ids']
[Parameters] Sampler = dpmpp_2m_sde_gpu - karras
[Parameters] Steps = 30 - 15
[Fooocus] Initializing ...
[Fooocus] Loading models ...
Refiner unloaded.
[Fooocus] Processing prompts ...
[Fooocus] Preparing Fooocus text #1 ...
[Prompt Expansion] dog, intricate, elegant, highly detailed, extremely beautiful,, symmetry, sharp focus, inspired, charismatic, very coherent, cute, innocent, fine detail, full color, cinematic, winning, artistic, smart, joyful, attractive, pretty, illuminated, colorful, light, cozy, novel, epic, dramatic ambient background, determined, focused, quality, atmosphere
[Fooocus] Preparing Fooocus text #2 ...
[Prompt Expansion] dog, intricate, elegant, highly detailed, sharp focus, candid, sublime, dramatic, thought, cinematic, new classic, best, attractive, unique, beautiful, creative, positive, cute, smart, agile, passionate, cheerful, pretty, inspired, color, spread light, magic, cool, friendly, extremely detail, lovely, amazing, flowing, complex
[Fooocus] Encoding positive #1 ...
[Fooocus Model Management] Moving model(s) has taken 0.11 seconds
[Fooocus] Encoding positive #2 ...
[Fooocus] Encoding negative #1 ...
[Fooocus] Encoding negative #2 ...
[Fooocus] Image processing ...
Detected 1 faces
Requested to load CLIPVisionModelWithProjection
Loading 1 new model
[Fooocus Model Management] Moving model(s) has taken 0.50 seconds
Requested to load Resampler
Loading 1 new model
[Fooocus Model Management] Moving model(s) has taken 0.46 seconds
Requested to load To_KV
Loading 1 new model
[Fooocus Model Management] Moving model(s) has taken 0.32 seconds
[Parameters] Denoising Strength = 1.0
[Parameters] Initial Latent shape: Image Space (1152, 896)
Preparation time: 14.29 seconds
[Sampler] refiner_swap_method = joint
[Sampler] sigma_min = 0.0291671771556139, sigma_max = 14.614643096923828
Requested to load SDXL
Loading 1 new model
loading in lowvram mode 3194.4163160324097
Traceback (most recent call last):
File "X:\AI\Fooocus_win64_2-1-791\Fooocus\modules\async_worker.py", line 803, in worker
handler(task)
File "X:\AI\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "X:\AI\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "X:\AI\Fooocus_win64_2-1-791\Fooocus\modules\async_worker.py", line 735, in handler
imgs = pipeline.process_diffusion(
File "X:\AI\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "X:\AI\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "X:\AI\Fooocus_win64_2-1-791\Fooocus\modules\default_pipeline.py", line 361, in process_diffusion
sampled_latent = core.ksampler(
File "X:\AI\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "X:\AI\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "X:\AI\Fooocus_win64_2-1-791\Fooocus\modules\core.py", line 315, in ksampler
samples = fcbh.sample.sample(model, noise, steps, cfg, sampler_name, scheduler, positive, negative, latent_image,
File "X:\AI\Fooocus_win64_2-1-791\Fooocus\backend\headless\fcbh\sample.py", line 93, in sample
real_model, positive_copy, negative_copy, noise_mask, models = prepare_sampling(model, noise.shape, positive, negative, noise_mask)
File "X:\AI\Fooocus_win64_2-1-791\Fooocus\backend\headless\fcbh\sample.py", line 86, in prepare_sampling
fcbh.model_management.load_models_gpu([model] + models, model.memory_required(noise_shape) + inference_memory)
File "X:\AI\Fooocus_win64_2-1-791\Fooocus\modules\patch.py", line 494, in patched_load_models_gpu
y = fcbh.model_management.load_models_gpu_origin(*args, **kwargs)
File "X:\AI\Fooocus_win64_2-1-791\Fooocus\backend\headless\fcbh\model_management.py", line 410, in load_models_gpu
cur_loaded_model = loaded_model.model_load(lowvram_model_memory)
File "X:\AI\Fooocus_win64_2-1-791\Fooocus\backend\headless\fcbh\model_management.py", line 298, in model_load
accelerate.dispatch_model(self.real_model, device_map=device_map, main_device=self.device)
File "X:\AI\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\accelerate\big_modeling.py", line 371, in dispatch_model
attach_align_device_hook_on_blocks(
File "X:\AI\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\accelerate\hooks.py", line 536, in attach_align_device_hook_on_blocks
attach_align_device_hook_on_blocks(
File "X:\AI\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\accelerate\hooks.py", line 536, in attach_align_device_hook_on_blocks
attach_align_device_hook_on_blocks(
File "X:\AI\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\accelerate\hooks.py", line 506, in attach_align_device_hook_on_blocks
add_hook_to_module(module, hook)
File "X:\AI\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\accelerate\hooks.py", line 155, in add_hook_to_module
module = hook.init_hook(module)
File "X:\AI\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\accelerate\hooks.py", line 253, in init_hook
set_module_tensor_to_device(module, name, self.execution_device)
File "X:\AI\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\accelerate\utils\modeling.py", line 292, in set_module_tensor_to_device
new_value = old_value.to(device)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 30.00 MiB. GPU 0 has a total capacty of 6.00 GiB of which 4.15 GiB is free. Of the allocated memory 881.50 MiB is allocated by PyTorch, and 54.50 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
Requested to load SDXLClipModel
Requested to load GPT2LMHeadModel
Loading 2 new models
Total time: 55.23 seconds
ERROR clip_g.transformer.text_model.encoder.layers.0.mlp.fc1.weight CUDA out of memory. Tried to allocate 26.00 MiB. GPU 0 has a total capacty of 6.00 GiB of which 4.10 GiB is free. Of the allocated memory 938.78 MiB is allocated by PyTorch, and 49.22 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
Exception in thread Thread-2 (worker):
Traceback (most recent call last):
File "threading.py", line 1016, in _bootstrap_inner
File "threading.py", line 953, in run
File "X:\AI\Fooocus_win64_2-1-791\Fooocus\modules\async_worker.py", line 809, in worker
pipeline.prepare_text_encoder(async_call=True)
File "X:\AI\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "X:\AI\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "X:\AI\Fooocus_win64_2-1-791\Fooocus\modules\default_pipeline.py", line 211, in prepare_text_encoder
fcbh.model_management.load_models_gpu([final_clip.patcher, final_expansion.patcher])
File "X:\AI\Fooocus_win64_2-1-791\Fooocus\modules\patch.py", line 494, in patched_load_models_gpu
y = fcbh.model_management.load_models_gpu_origin(*args, **kwargs)
File "X:\AI\Fooocus_win64_2-1-791\Fooocus\backend\headless\fcbh\model_management.py", line 410, in load_models_gpu
cur_loaded_model = loaded_model.model_load(lowvram_model_memory)
File "X:\AI\Fooocus_win64_2-1-791\Fooocus\backend\headless\fcbh\model_management.py", line 293, in model_load
raise e
File "X:\AI\Fooocus_win64_2-1-791\Fooocus\backend\headless\fcbh\model_management.py", line 289, in model_load
self.real_model = self.model.patch_model(device_to=patch_model_to) #TODO: do something with loras and offloading to CPU
File "X:\AI\Fooocus_win64_2-1-791\Fooocus\backend\headless\fcbh\model_patcher.py", line 191, in patch_model
temp_weight = fcbh.model_management.cast_to_device(weight, device_to, torch.float32, copy=True)
File "X:\AI\Fooocus_win64_2-1-791\Fooocus\backend\headless\fcbh\model_management.py", line 532, in cast_to_device
return tensor.to(device, copy=copy).to(dtype)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 26.00 MiB. GPU 0 has a total capacty of 6.00 GiB of which 4.10 GiB is free. Of the allocated memory 915.50 MiB is allocated by PyTorch, and 72.50 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
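Both OOM messages above suggest setting `max_split_size_mb` via `PYTORCH_CUDA_ALLOC_CONF` if reserved-but-unallocated memory indicates fragmentation. A minimal sketch of setting it before launch — the value 128 is an illustrative assumption, not a recommendation from this thread, and the variable must be set before PyTorch initializes the CUDA allocator:

```python
import os

# PYTORCH_CUDA_ALLOC_CONF is read when PyTorch's caching allocator starts up,
# so it must be set before the first CUDA allocation (in practice: before
# importing/using torch). max_split_size_mb caps the size of blocks the
# allocator will split, which can reduce fragmentation. 128 is an example
# value only; tune it for your workload.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

print(os.environ["PYTORCH_CUDA_ALLOC_CONF"])
```

On Windows the same effect can be had in the console before starting Fooocus, e.g. `set PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128` followed by the usual `entry_with_update.py` command.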

@acvcleitao

+1

Repository owner deleted a comment from stubkan Dec 12, 2023
Repository owner deleted a comment from stubkan Dec 12, 2023
Repository owner deleted a comment from Darkweasam Dec 12, 2023
@mashb1t
Collaborator

mashb1t commented Dec 30, 2023

@lllyasviel could you find out whether the NVIDIA GeForce GTX 1060, or generally speaking any 10XX card with 6GB, can be used, or whether this is a bug? (https://github.com/lllyasviel/Fooocus/blob/main/troubleshoot.md#i-am-using-nvidia-with-6gb-vram-i-get-cuda-out-of-memory)

@mashb1t mashb1t added bug Something isn't working help wanted Extra attention is needed labels Dec 30, 2023
@mashb1t
Collaborator

mashb1t commented Feb 22, 2024

Closing as stale, feel free to provide more information to reopen.

@mashb1t mashb1t closed this as not planned Feb 22, 2024
@Darkweasam
Author

> Closing as stale, feel free to provide more information to reopen.

If you are asking me, I am unsure what other information I could provide to help fix this...
