
Image not generating on AMD windows #1078

Closed
napstar-420 opened this issue Nov 30, 2023 · 35 comments
Labels: bug (Something isn't working), question (Further information is requested)

Comments

@napstar-420

Describe the problem
After adding the model to the checkpoints folder and replacing the contents of the .bat file with

.\python_embeded\python.exe -m pip uninstall torch torchvision torchaudio torchtext functorch xformers -y
.\python_embeded\python.exe -m pip install torch-directml
.\python_embeded\python.exe -s Fooocus\entry_with_update.py --directml
pause

(since I have an AMD GPU), the app started successfully, but after I enter a prompt it starts loading and no image is ever produced. The full console log is below; please let me know what I am doing wrong.

Full Console Log

D:\Fooocus_win64_2-1-791>.\python_embeded\python.exe -m pip uninstall torch torchvision torchaudio torchtext functorch xformers -y
Found existing installation: torch 2.0.0
Uninstalling torch-2.0.0:
  Successfully uninstalled torch-2.0.0
Found existing installation: torchvision 0.15.1
Uninstalling torchvision-0.15.1:
  Successfully uninstalled torchvision-0.15.1
WARNING: Skipping torchaudio as it is not installed.
WARNING: Skipping torchtext as it is not installed.
WARNING: Skipping functorch as it is not installed.
WARNING: Skipping xformers as it is not installed.

D:\Fooocus_win64_2-1-791>.\python_embeded\python.exe -m pip install torch-directml
Requirement already satisfied: torch-directml in d:\fooocus_win64_2-1-791\python_embeded\lib\site-packages (0.2.0.dev230426)
Collecting torch==2.0.0 (from torch-directml)
  Using cached torch-2.0.0-cp310-cp310-win_amd64.whl (172.3 MB)
Collecting torchvision==0.15.1 (from torch-directml)
  Using cached torchvision-0.15.1-cp310-cp310-win_amd64.whl (1.2 MB)
Requirement already satisfied: filelock in d:\fooocus_win64_2-1-791\python_embeded\lib\site-packages (from torch==2.0.0->torch-directml) (3.12.2)
Requirement already satisfied: typing-extensions in d:\fooocus_win64_2-1-791\python_embeded\lib\site-packages (from torch==2.0.0->torch-directml) (4.7.1)
Requirement already satisfied: sympy in d:\fooocus_win64_2-1-791\python_embeded\lib\site-packages (from torch==2.0.0->torch-directml) (1.12)
Requirement already satisfied: networkx in d:\fooocus_win64_2-1-791\python_embeded\lib\site-packages (from torch==2.0.0->torch-directml) (3.1)
Requirement already satisfied: jinja2 in d:\fooocus_win64_2-1-791\python_embeded\lib\site-packages (from torch==2.0.0->torch-directml) (3.1.2)
Requirement already satisfied: numpy in d:\fooocus_win64_2-1-791\python_embeded\lib\site-packages (from torchvision==0.15.1->torch-directml) (1.23.5)
Requirement already satisfied: requests in d:\fooocus_win64_2-1-791\python_embeded\lib\site-packages (from torchvision==0.15.1->torch-directml) (2.31.0)
Requirement already satisfied: pillow!=8.3.*,>=5.3.0 in d:\fooocus_win64_2-1-791\python_embeded\lib\site-packages (from torchvision==0.15.1->torch-directml) (9.2.0)
Requirement already satisfied: MarkupSafe>=2.0 in d:\fooocus_win64_2-1-791\python_embeded\lib\site-packages (from jinja2->torch==2.0.0->torch-directml) (2.1.3)
Requirement already satisfied: charset-normalizer<4,>=2 in d:\fooocus_win64_2-1-791\python_embeded\lib\site-packages (from requests->torchvision==0.15.1->torch-directml) (3.1.0)
Requirement already satisfied: idna<4,>=2.5 in d:\fooocus_win64_2-1-791\python_embeded\lib\site-packages (from requests->torchvision==0.15.1->torch-directml) (3.4)
Requirement already satisfied: urllib3<3,>=1.21.1 in d:\fooocus_win64_2-1-791\python_embeded\lib\site-packages (from requests->torchvision==0.15.1->torch-directml) (2.0.3)
Requirement already satisfied: certifi>=2017.4.17 in d:\fooocus_win64_2-1-791\python_embeded\lib\site-packages (from requests->torchvision==0.15.1->torch-directml) (2023.5.7)
Requirement already satisfied: mpmath>=0.19 in d:\fooocus_win64_2-1-791\python_embeded\lib\site-packages (from sympy->torch==2.0.0->torch-directml) (1.3.0)
DEPRECATION: torchsde 0.2.5 has a non-standard dependency specifier numpy>=1.19.*; python_version >= "3.7". pip 23.3 will enforce this behaviour change. A possible replacement is to upgrade to a newer version of torchsde or contact the author to suggest that they release a version with a conforming dependency specifiers. Discussion can be found at https://github.com/pypa/pip/issues/12063
Installing collected packages: torch, torchvision
  WARNING: The scripts convert-caffe2-to-onnx.exe, convert-onnx-to-caffe2.exe and torchrun.exe are installed in 'D:\Fooocus_win64_2-1-791\python_embeded\Scripts' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
Successfully installed torch-2.0.0 torchvision-0.15.1

[notice] A new release of pip is available: 23.2.1 -> 23.3.1
[notice] To update, run: D:\Fooocus_win64_2-1-791\python_embeded\python.exe -m pip install --upgrade pip

D:\Fooocus_win64_2-1-791>.\python_embeded\python.exe -s Fooocus\entry_with_update.py --directml
Already up-to-date
Update succeeded.
[System ARGV] ['Fooocus\\entry_with_update.py', '--directml']
Python 3.10.9 (tags/v3.10.9:1dd9be6, Dec  6 2022, 20:01:21) [MSC v.1934 64 bit (AMD64)]
Fooocus version: 2.1.824
Running on local URL:  http://127.0.0.1:7865

To create a public link, set `share=True` in `launch()`.
Using directml with device:
Total VRAM 1024 MB, total RAM 16328 MB
Set vram state to: NORMAL_VRAM
Disabling smart memory management
Device: privateuseone
VAE dtype: torch.float32
Using sub quadratic optimization for cross attention, if you have memory or speed issues try using: --use-split-cross-attention
Refiner unloaded.
model_type EPS
adm 2816
Using split attention in VAE
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
Using split attention in VAE
extra keys {'cond_stage_model.clip_g.transformer.text_model.embeddings.position_ids', 'cond_stage_model.clip_l.logit_scale', 'cond_stage_model.clip_l.text_projection'}
Base model loaded: D:\Fooocus_win64_2-1-791\Fooocus\models\checkpoints\juggernautXL_version6Rundiffusion.safetensors
Request to load LoRAs [['sd_xl_offset_example-lora_1.0.safetensors', 0.1], ['None', 1.0], ['None', 1.0], ['None', 1.0], ['None', 1.0]] for model [D:\Fooocus_win64_2-1-791\Fooocus\models\checkpoints\juggernautXL_version6Rundiffusion.safetensors].
Loaded LoRA [D:\Fooocus_win64_2-1-791\Fooocus\models\loras\sd_xl_offset_example-lora_1.0.safetensors] for UNet [D:\Fooocus_win64_2-1-791\Fooocus\models\checkpoints\juggernautXL_version6Rundiffusion.safetensors] with 788 keys at weight 0.1.
Fooocus V2 Expansion: Vocab with 642 words.
Fooocus Expansion engine loaded for cpu, use_fp16 = False.
Requested to load SDXLClipModel
Requested to load GPT2LMHeadModel
Loading 2 new models
App started successful. Use the app with http://127.0.0.1:7865/ or 127.0.0.1:7865
[Parameters] Adaptive CFG = 7
[Parameters] Sharpness = 2
[Parameters] ADM Scale = 1.5 : 0.8 : 0.3
[Parameters] CFG = 4.0
[Parameters] Seed = 6746146692746138911
[Parameters] Sampler = dpmpp_2m_sde_gpu - karras
[Parameters] Steps = 30 - 15
[Fooocus] Initializing ...
[Fooocus] Loading models ...
Refiner unloaded.
[Fooocus] Processing prompts ...
[Fooocus] Preparing Fooocus text #1 ...
[Prompt Expansion] a beautiful cat, focus, cinematic, intricate, elegant, highly detailed, extremely artistic, placed perfect, professional romantic, cute, creative, winning, best detail, dramatic, attractive, adorable, awesome, inspired, pretty, futuristic background, ambient light, sharp, magic, novel, frank, real, full color, very inspirational, bright, illuminated, deep, great composition
[Fooocus] Preparing Fooocus text #2 ...
[Prompt Expansion] a beautiful cat, focus, highly detailed, cinematic, intricate, elegant, professional spiritual illuminated, original light, fine detail, full color, shiny, attractive, pretty, inspired, amazing, symmetry, creative, perfect, colorful, best, new, great, romantic, thought, fancy, marvelous, fabulous, pure, wonderful, coherent, cute, lovely, enormous, bright
[Fooocus] Encoding positive #1 ...
[Fooocus] Encoding positive #2 ...
[Fooocus] Encoding negative #1 ...
[Fooocus] Encoding negative #2 ...
[Parameters] Denoising Strength = 1.0
[Parameters] Initial Latent shape: Image Space (896, 1152)
Preparation time: 13.76 seconds
[Sampler] refiner_swap_method = joint
[Sampler] sigma_min = 0.0291671771556139, sigma_max = 14.614643096923828
Traceback (most recent call last):
  File "D:\Fooocus_win64_2-1-791\Fooocus\modules\async_worker.py", line 803, in worker
    handler(task)
  File "D:\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "D:\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "D:\Fooocus_win64_2-1-791\Fooocus\modules\async_worker.py", line 735, in handler
    imgs = pipeline.process_diffusion(
  File "D:\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "D:\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "D:\Fooocus_win64_2-1-791\Fooocus\modules\default_pipeline.py", line 354, in process_diffusion
    modules.patch.BrownianTreeNoiseSamplerPatched.global_init(
  File "D:\Fooocus_win64_2-1-791\Fooocus\modules\patch.py", line 173, in global_init
    BrownianTreeNoiseSamplerPatched.tree = BatchedBrownianTree(x, t0, t1, seed, cpu=cpu)
  File "D:\Fooocus_win64_2-1-791\Fooocus\backend\headless\fcbh\k_diffusion\sampling.py", line 85, in __init__
    self.trees = [torchsde.BrownianTree(t0, w0, t1, entropy=s, **kwargs) for s in seed]
  File "D:\Fooocus_win64_2-1-791\Fooocus\backend\headless\fcbh\k_diffusion\sampling.py", line 85, in <listcomp>
    self.trees = [torchsde.BrownianTree(t0, w0, t1, entropy=s, **kwargs) for s in seed]
  File "D:\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\torchsde\_brownian\derived.py", line 155, in __init__
    self._interval = brownian_interval.BrownianInterval(t0=t0,
  File "D:\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\torchsde\_brownian\brownian_interval.py", line 540, in __init__
    W = self._randn(initial_W_seed) * math.sqrt(t1 - t0)
  File "D:\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\torchsde\_brownian\brownian_interval.py", line 234, in _randn
    return _randn(size, self._top._dtype, self._top._device, seed)
  File "D:\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\torchsde\_brownian\brownian_interval.py", line 32, in _randn
    generator = torch.Generator(device).manual_seed(int(seed))
RuntimeError: Device type privateuseone is not supported for torch.Generator() api.
Total time: 14.13 seconds
@grzybon123

The exact same thing happened to me

@superpoussin22

check #624

@napstar-420
Author

This #624 (comment) solved my issue, but now I'm getting this error:

RuntimeError: Could not allocate tensor with 52428800 bytes. There is not enough GPU video memory available!
Total time: 71.20 seconds
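
For anyone hitting the same torch.Generator error: the fix in #624 boils down to not creating the RNG on the DirectML device. A minimal sketch of the edit (same file and line as in the traceback above; the exact line number may differ between torchsde versions):

# python_embeded\Lib\site-packages\torchsde\_brownian\brownian_interval.py (line 32)
# before -- fails because the "privateuseone" (DirectML) device has no
# torch.Generator support:
generator = torch.Generator(device).manual_seed(int(seed))
# after -- create the generator on the CPU instead; the sampled noise can
# still be moved to the DirectML device afterwards:
generator = torch.Generator().manual_seed(int(seed))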

napstar-420 changed the title from "Nothing happening" to "Image not generating on AMD windows" Nov 30, 2023
@napstar-420
Author

New log

Running on local URL:  http://127.0.0.1:7865

To create a public link, set `share=True` in `launch()`.
Using directml with device:
Total VRAM 1024 MB, total RAM 16328 MB
Set vram state to: NORMAL_VRAM
Disabling smart memory management
Device: privateuseone
VAE dtype: torch.float32
Using sub quadratic optimization for cross attention, if you have memory or speed issues try using: --use-split-cross-attention
Refiner unloaded.
model_type EPS
adm 2816
Using split attention in VAE
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
Using split attention in VAE
extra keys {'cond_stage_model.clip_l.text_projection', 'cond_stage_model.clip_g.transformer.text_model.embeddings.position_ids', 'cond_stage_model.clip_l.logit_scale'}
Base model loaded: D:\Fooocus_win64_2-1-791\Fooocus\models\checkpoints\juggernautXL_version6Rundiffusion.safetensors
Request to load LoRAs [['sd_xl_offset_example-lora_1.0.safetensors', 0.1], ['None', 1.0], ['None', 1.0], ['None', 1.0], ['None', 1.0]] for model [D:\Fooocus_win64_2-1-791\Fooocus\models\checkpoints\juggernautXL_version6Rundiffusion.safetensors].
Loaded LoRA [D:\Fooocus_win64_2-1-791\Fooocus\models\loras\sd_xl_offset_example-lora_1.0.safetensors] for UNet [D:\Fooocus_win64_2-1-791\Fooocus\models\checkpoints\juggernautXL_version6Rundiffusion.safetensors] with 788 keys at weight 0.1.
Fooocus V2 Expansion: Vocab with 642 words.
Fooocus Expansion engine loaded for cpu, use_fp16 = False.
Requested to load SDXLClipModel
Requested to load GPT2LMHeadModel
Loading 2 new models
App started successful. Use the app with http://127.0.0.1:7865/ or 127.0.0.1:7865
[Parameters] Adaptive CFG = 7
[Parameters] Sharpness = 2
[Parameters] ADM Scale = 1.5 : 0.8 : 0.3
[Parameters] CFG = 4.0
[Parameters] Seed = 1887213433053101641
[Parameters] Sampler = dpmpp_2m_sde_gpu - karras
[Parameters] Steps = 30 - 15
[Fooocus] Initializing ...
[Fooocus] Loading models ...
Refiner unloaded.
[Fooocus] Processing prompts ...
[Fooocus] Preparing Fooocus text #1 ...
[Prompt Expansion] A cat, colorful, magic, vivid colors, elegant, highly detailed, extremely professional, cinematic, sincere, dramatic, artistic, passionate, color full, intricate, beautiful, attractive, enhanced, rich, surreal, sharp focus, elaborate, complex, amazing composition, fancy background, open flowing, lovely, epic, coherent, awesome, brilliant, creative, positive, wonderful
[Fooocus] Preparing Fooocus text #2 ...
[Prompt Expansion] A cat, full focus, professional, emotional, cute, elegant, intricate, highly detailed, cool light, sharp background, elaborate, amazing composition, colorful, romantic, symmetry, illuminated, shiny, enhanced, brilliant, epic, joyful, pure, focused, creative, awesome, dramatic ambient, cinematic, artistic, extremely beautiful, stunning, gorgeous, breathtaking, scenic, thought
[Fooocus] Encoding positive #1 ...
[Fooocus] Encoding positive #2 ...
[Fooocus] Encoding negative #1 ...
[Fooocus] Encoding negative #2 ...
[Parameters] Denoising Strength = 1.0
[Parameters] Initial Latent shape: Image Space (896, 1152)
Preparation time: 13.94 seconds
[Sampler] refiner_swap_method = joint
[Sampler] sigma_min = 0.0291671771556139, sigma_max = 14.614643096923828
Requested to load SDXL
Loading 1 new model
ERROR diffusion_model.output_blocks.1.1.transformer_blocks.2.ff.net.0.proj.weight Could not allocate tensor with 52428800 bytes. There is not enough GPU video memory available!
ERROR diffusion_model.output_blocks.1.1.transformer_blocks.2.ff.net.2.weight Could not allocate tensor with 26214400 bytes. There is not enough GPU video memory available!
Traceback (most recent call last):
  File "D:\Fooocus_win64_2-1-791\Fooocus\modules\async_worker.py", line 803, in worker
    handler(task)
  File "D:\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "D:\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "D:\Fooocus_win64_2-1-791\Fooocus\modules\async_worker.py", line 735, in handler
    imgs = pipeline.process_diffusion(
  File "D:\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "D:\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "D:\Fooocus_win64_2-1-791\Fooocus\modules\default_pipeline.py", line 361, in process_diffusion
    sampled_latent = core.ksampler(
  File "D:\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "D:\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "D:\Fooocus_win64_2-1-791\Fooocus\modules\core.py", line 315, in ksampler
    samples = fcbh.sample.sample(model, noise, steps, cfg, sampler_name, scheduler, positive, negative, latent_image,
  File "D:\Fooocus_win64_2-1-791\Fooocus\backend\headless\fcbh\sample.py", line 93, in sample
    real_model, positive_copy, negative_copy, noise_mask, models = prepare_sampling(model, noise.shape, positive, negative, noise_mask)
  File "D:\Fooocus_win64_2-1-791\Fooocus\backend\headless\fcbh\sample.py", line 86, in prepare_sampling
    fcbh.model_management.load_models_gpu([model] + models, model.memory_required(noise_shape) + inference_memory)
  File "D:\Fooocus_win64_2-1-791\Fooocus\modules\patch.py", line 494, in patched_load_models_gpu
    y = fcbh.model_management.load_models_gpu_origin(*args, **kwargs)
  File "D:\Fooocus_win64_2-1-791\Fooocus\backend\headless\fcbh\model_management.py", line 410, in load_models_gpu
    cur_loaded_model = loaded_model.model_load(lowvram_model_memory)
  File "D:\Fooocus_win64_2-1-791\Fooocus\backend\headless\fcbh\model_management.py", line 293, in model_load
    raise e
  File "D:\Fooocus_win64_2-1-791\Fooocus\backend\headless\fcbh\model_management.py", line 289, in model_load
    self.real_model = self.model.patch_model(device_to=patch_model_to) #TODO: do something with loras and offloading to CPU
  File "D:\Fooocus_win64_2-1-791\Fooocus\backend\headless\fcbh\model_patcher.py", line 191, in patch_model
    temp_weight = fcbh.model_management.cast_to_device(weight, device_to, torch.float32, copy=True)
  File "D:\Fooocus_win64_2-1-791\Fooocus\backend\headless\fcbh\model_management.py", line 532, in cast_to_device
    return tensor.to(device, copy=copy).to(dtype)
RuntimeError: Could not allocate tensor with 52428800 bytes. There is not enough GPU video memory available!
Total time: 67.32 seconds

@asakurato

Same for me, RX 6800, Windows 11

@lowlyphe

Memory allocation issue here as well. RTX 4060 Laptop, 8 GB

@asakurato

(quoting the "New log" from @napstar-420 above)

I have also tried editing model_management.py in .\Fooocus\backend\headless\fcbh, where I changed the mem_total line to mem_total = 12000 * 1024 * 1024 #TODO (i.e. raised the default to 12000). I'm now seeing 12000 MB in the console, but it's still very slow and crashes with an out-of-memory error.
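
(Side note: that mem_total value looks like a hard-coded placeholder, presumably because DirectML gives the backend no way to query the adapter's memory, which would also explain why everyone sees "Total VRAM 1024 MB" regardless of GPU. To confirm torch-directml at least sees your card, here is a minimal check, assuming the torch-directml 0.2.x build from the install logs above; the script name is mine:)

# dml_check.py -- run with: .\python_embeded\python.exe dml_check.py
import torch
import torch_directml

print("adapters found:", torch_directml.device_count())
print("adapter 0 name:", torch_directml.device_name(0))  # should print your AMD GPU
dml = torch_directml.device()        # the "privateuseone" device from the logs
x = torch.ones(2, 2, device=dml)     # tiny allocation on the DirectML device
print((x + x).cpu())                 # move back to CPU to display the result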

@MrDakCol

Same issue here: Windows 11 on an AMD 7900 XT with 20 GB, but on startup it only "sees" 1024 MB.

@Arakis14

Hi all, I have the same problem as described in this issue, meaning that I am on Windows with an AMD GPU (RX 5700 XT). I edited the .bat file as described in the readme.md section Windows (AMD GPUs) and applied the fix from #624 (comment).

Currently getting "Could not allocate tensor with 52428800 bytes. There is not enough GPU video memory available!" error after entering a prompt.
From my observation seems like it's not the lack of GPU memory as it may seem at the first glance. Looks like it is allocating memory but for some reason doing it over and over again until GPU runs out of VRAM.
Last non error log is: "Loading 1 new model" which from what I saw in positive scenarios takes like few seconds max,
I even changed my mem_total to 8GB to see if it will result in anything different, but it didn't.
mem_total = 8192 * 1024 * 1024 #TODO

Please see the screenshots below of my GPU usage at idle and once the error happens:
[screenshot: idleGPU]
[screenshot: outofmemoryGPU]

Also, please see the full log:
F:\AI>.\python_embeded\python.exe -s Fooocus\entry_with_update.py --directml
Already up-to-date
Update succeeded.
[System ARGV] ['Fooocus\\entry_with_update.py', '--directml']
Python 3.10.9 (tags/v3.10.9:1dd9be6, Dec 6 2022, 20:01:21) [MSC v.1934 64 bit (AMD64)]
Fooocus version: 2.1.824
Running on local URL: http://127.0.0.1:7865

To create a public link, set `share=True` in `launch()`.
Using directml with device:
Total VRAM 8192 MB, total RAM 16292 MB
Set vram state to: NORMAL_VRAM
Disabling smart memory management
Device: privateuseone
VAE dtype: torch.float32
Using sub quadratic optimization for cross attention, if you have memory or speed issues try using: --use-split-cross-attention
Refiner unloaded.
model_type EPS
adm 2816
Using split attention in VAE
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
Using split attention in VAE
extra keys {'cond_stage_model.clip_l.text_projection', 'cond_stage_model.clip_l.logit_scale', 'cond_stage_model.clip_g.transformer.text_model.embeddings.position_ids'}
Base model loaded: F:\AI\Fooocus\models\checkpoints\juggernautXL_version6Rundiffusion.safetensors
Request to load LoRAs [['sd_xl_offset_example-lora_1.0.safetensors', 0.1], ['None', 1.0], ['None', 1.0], ['None', 1.0], ['None', 1.0]] for model [F:\AI\Fooocus\models\checkpoints\juggernautXL_version6Rundiffusion.safetensors].
Loaded LoRA [F:\AI\Fooocus\models\loras\sd_xl_offset_example-lora_1.0.safetensors] for UNet [F:\AI\Fooocus\models\checkpoints\juggernautXL_version6Rundiffusion.safetensors] with 788 keys at weight 0.1.
Fooocus V2 Expansion: Vocab with 642 words.
Fooocus Expansion engine loaded for cpu, use_fp16 = False.
Requested to load SDXLClipModel
Requested to load GPT2LMHeadModel
Loading 2 new models
App started successful. Use the app with http://127.0.0.1:7865/ or 127.0.0.1:7865
[Parameters] Adaptive CFG = 7
[Parameters] Sharpness = 2
[Parameters] ADM Scale = 1.5 : 0.8 : 0.3
[Parameters] CFG = 4.0
[Parameters] Seed = 1934162106525646012
[Parameters] Sampler = dpmpp_2m_sde_gpu - karras
[Parameters] Steps = 30 - 15
[Fooocus] Initializing ...
[Fooocus] Loading models ...
Refiner unloaded.
[Fooocus] Processing prompts ...
[Fooocus] Preparing Fooocus text #1 ...
[Prompt Expansion] highway to hell, highly detailed, intricate, sharp focus, cinematic light, directed, vivid colors, theatrical, mystical, artistic, rich deep color, striking, beautiful, symmetry, stunning, gorgeous, very inspirational, epic, full creative, winning, iconic, fine, amazing, awesome, surreal, colossal, cool, incredible, extremely attractive, best, glowing, magical
[Fooocus] Preparing Fooocus text #2 ...
[Prompt Expansion] highway to hell, highly detailed, cinematic, breathtaking, progressive composition, magical atmosphere, beautiful light, sharp focus, very inspirational, ambient, intricate, elegant, innocent, fine detail, inspired, rich deep color, amazing, great colors, perfect, epic, winning full background, professional, clear, trendy, best, novel, romantic, iconic, gorgeous, dramatic
[Fooocus] Encoding positive #1 ...
[Fooocus] Encoding positive #2 ...
[Fooocus] Encoding negative #1 ...
[Fooocus] Encoding negative #2 ...
[Parameters] Denoising Strength = 1.0
[Parameters] Initial Latent shape: Image Space (896, 1152)
Preparation time: 15.93 seconds
[Sampler] refiner_swap_method = joint
[Sampler] sigma_min = 0.0291671771556139, sigma_max = 14.614643096923828
Requested to load SDXL
Loading 1 new model
ERROR diffusion_model.output_blocks.1.1.transformer_blocks.2.ff.net.0.proj.weight Could not allocate tensor with 52428800 bytes. There is not enough GPU video memory available!
Traceback (most recent call last):
  File "F:\AI\Fooocus\modules\async_worker.py", line 803, in worker
    handler(task)
  File "F:\AI\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "F:\AI\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "F:\AI\Fooocus\modules\async_worker.py", line 735, in handler
    imgs = pipeline.process_diffusion(
  File "F:\AI\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "F:\AI\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "F:\AI\Fooocus\modules\default_pipeline.py", line 361, in process_diffusion
    sampled_latent = core.ksampler(
  File "F:\AI\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "F:\AI\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "F:\AI\Fooocus\modules\core.py", line 315, in ksampler
    samples = fcbh.sample.sample(model, noise, steps, cfg, sampler_name, scheduler, positive, negative, latent_image,
  File "F:\AI\Fooocus\backend\headless\fcbh\sample.py", line 93, in sample
    real_model, positive_copy, negative_copy, noise_mask, models = prepare_sampling(model, noise.shape, positive, negative, noise_mask)
  File "F:\AI\Fooocus\backend\headless\fcbh\sample.py", line 86, in prepare_sampling
    fcbh.model_management.load_models_gpu([model] + models, model.memory_required(noise_shape) + inference_memory)
  File "F:\AI\Fooocus\modules\patch.py", line 494, in patched_load_models_gpu
    y = fcbh.model_management.load_models_gpu_origin(*args, **kwargs)
  File "F:\AI\Fooocus\backend\headless\fcbh\model_management.py", line 410, in load_models_gpu
    cur_loaded_model = loaded_model.model_load(lowvram_model_memory)
  File "F:\AI\Fooocus\backend\headless\fcbh\model_management.py", line 293, in model_load
    raise e
  File "F:\AI\Fooocus\backend\headless\fcbh\model_management.py", line 289, in model_load
    self.real_model = self.model.patch_model(device_to=patch_model_to) #TODO: do something with loras and offloading to CPU
  File "F:\AI\Fooocus\backend\headless\fcbh\model_patcher.py", line 191, in patch_model
    temp_weight = fcbh.model_management.cast_to_device(weight, device_to, torch.float32, copy=True)
  File "F:\AI\Fooocus\backend\headless\fcbh\model_management.py", line 532, in cast_to_device
    return tensor.to(device, copy=copy).to(dtype)
RuntimeError: Could not allocate tensor with 26214400 bytes. There is not enough GPU video memory available!
Total time: 111.67 seconds

@Dino-Kupinic

Dino-Kupinic commented Nov 30, 2023

same issue.
AMD 6700 XT

To create a public link, set share=True in launch().
Using directml with device:
Total VRAM 12288 MB, total RAM 32535 MB
Set vram state to: NORMAL_VRAM
Disabling smart memory management
Device: privateuseone
...
RuntimeError: Could not allocate tensor with 158597120 bytes. There is not enough GPU video memory available!
Total time: 9.34 seconds

@lowlyphe

Fixed my issue: I had followed the AMD instructions but have an Nvidia GPU.

@napstar-420
Author

I hope the developers solve this issue for AMD users; otherwise I'll have to learn Python, tensors, and all the libraries first before I can solve it myself.

@Arakis14

Arakis14 commented Dec 1, 2023

With some simple debugging I managed to pinpoint the issue (my issue, but I believe we are all struggling with the same problem) to a loop that runs over and over again until the GPU runs out of memory.
As mentioned earlier, the last log line seen is: "Loading 1 new model".

So, def load_models_gpu in model_management.py
https://github.com/lllyasviel/Fooocus/blob/main/backend/headless/fcbh/model_management.py#L410

def model_load in model_management.py
https://github.com/lllyasviel/Fooocus/blob/main/backend/headless/fcbh/model_management.py#L289

def patch_model in model_patcher.py (the loop)
https://github.com/lllyasviel/Fooocus/blob/main/backend/headless/fcbh/model_patcher.py#L178C4-L178C4

I hope that helps the developers.
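
To make the allocation pattern concrete, here is a rough reconstruction of that loop from the tracebacks in this thread (a simplified sketch, not the actual fcbh source; treat the helper names as approximations):

# simplified sketch of fcbh/model_patcher.py::patch_model, reconstructed from
# the tracebacks above -- not the actual source
def patch_model(self, device_to):
    for key in self.patches:
        weight = self.model_state_dict()[key]
        # each iteration copies the weight to the target device as a temporary
        # fp32 tensor (model_patcher.py line 191 in the traceback); on DirectML
        # these copies seem to accumulate until VRAM is exhausted
        temp_weight = cast_to_device(weight, device_to, torch.float32, copy=True)
        out_weight = self.calculate_weight(self.patches[key], temp_weight, key)
        set_attr(self.model, key, out_weight)
    return self.model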

@GroupXyz1

GroupXyz1 commented Dec 4, 2023

I think I found a solution for one of the problems, though it comes with a downside!

Found this in the console:

Using sub quadratic optimization for cross attention, if you have memory or speed issues try using: --use-split-cross-attention

And since I've been using this start argument, it no longer stops randomly at the final stage.

So: --use-split-cross-attention

I know this won't fix the issue of it allocating too much RAM, but it's at least one problem solved.

The downside is that you can't cancel or skip generation anymore.

Hope this helped some people; let's keep trying to find a complete solution!

EDIT:
OK, it was probably just a bug, because now I can cancel and skip again, so maybe there is no downside.

@strobya

strobya commented Dec 5, 2023

@GroupXyz1 What is the name of the file that I need to edit to add --use-split-cross-attention?

@GroupXyz1

@strobya The file you start Fooocus with, so run.bat or run_anime.bat or run_realistic.bat.
As an example, my run.bat looks like this:

.\python_embeded\python.exe -s Fooocus\entry_with_update.py --directml --use-split-cross-attention
pause

@strobya

strobya commented Dec 5, 2023

@GroupXyz1 Alright, did that and got this error:

Successfully installed torch-2.0.0 torchvision-0.15.1

C:\Users\mypc3\Downloads\Compressed\AI photo>.\python_embeded\python.exe -s Fooocus\entry_with_update.py --directml--use-split-cross-attention
Already up-to-date
Update succeeded.
[System ARGV] ['Fooocus\\entry_with_update.py', '--directml--use-split-cross-attention']
usage: entry_with_update.py [-h] [--listen [IP]] [--port PORT] [--enable-cors-header [ORIGIN]]
                            [--max-upload-size MAX_UPLOAD_SIZE] [--extra-model-paths-config PATH [PATH ...]]
                            [--output-directory OUTPUT_DIRECTORY] [--temp-directory TEMP_DIRECTORY]
                            [--input-directory INPUT_DIRECTORY] [--auto-launch] [--disable-auto-launch]
                            [--cuda-device DEVICE_ID] [--cuda-malloc | --disable-cuda-malloc]
                            [--dont-upcast-attention] [--force-fp32 | --force-fp16] [--bf16-unet]
                            [--fp16-vae | --fp32-vae | --bf16-vae]
                            [--fp8_e4m3fn-text-enc | --fp8_e5m2-text-enc | --fp16-text-enc | --fp32-text-enc]
                            [--directml [DIRECTML_DEVICE]] [--disable-ipex-optimize]
                            [--preview-method [none,auto,latent2rgb,taesd]]
                            [--use-split-cross-attention | --use-quad-cross-attention | --use-pytorch-cross-attention]
                            [--disable-xformers]
                            [--gpu-only | --highvram | --normalvram | --lowvram | --novram | --cpu]
                            [--disable-smart-memory] [--dont-print-server] [--quick-test-for-ci]
                            [--windows-standalone-build] [--disable-metadata] [--share] [--preset PRESET]
                            [--language LANGUAGE] [--enable-smart-memory] [--theme THEME] [--disable-image-log]
entry_with_update.py: error: unrecognized arguments: --directml--use-split-cross-attention

C:\Users\mypc3\Downloads\Compressed\AI photo>pause
Press any key to continue . . .

@Meganton

Meganton commented Dec 5, 2023

You have to put a space between the arguments, i.e. --directml --use-split-cross-attention rather than --directml--use-split-cross-attention. But if RAM is your issue, this doesn't help.

@GroupXyz1

@Meganton It does indeed help with RAM issues, but only during the attention phase (I have no name for this, so I'm calling it after the argument), which comes just before the image is ready. So it helps if the memory error appears when your image is almost done, not if it already happens during generation (between steps 1 and 29, I guess).
I know this is a bit confusing, but I tried my best to explain it understandably. :/
So just put this in your arguments; it doesn't hurt, and if it fixes things, be happy. If not... yeah... that's what we're all waiting to get fixed, lol.

@RAICircles

@strobya The file you start Fooocus with, so the run.bat or run_anime.bat or run_realistic.bat. As an example the run.bat looks like this for me:

.\python_embeded\python.exe -s Fooocus\entry_with_update.py --directml --use-split-cross-attention
pause

I have tried that, yet it still says there isn't enough memory for a ~10 MB tensor.
I have allocated 4 GB of my 8, so that shouldn't be the problem.

@GroupXyz1

GroupXyz1 commented Dec 5, 2023

@RAICircles Yeah, it only fixes the error when it happens at the end of generation. For me it worked to restart after every single generated image, and after some time it just worked fine. But if it doesn't work at all, you will have to wait for a fix of the script from the creator, or from someone else with the necessary knowledge. I also suspect that the behavior differs between the normal, anime, and realistic presets, so you might try another one of them and see if it works!

@Krader12

Krader12 commented Dec 6, 2023

I've tried everything here, but all I'm getting is: RuntimeError: Could not allocate tensor with 192397120 bytes. There is not enough GPU video memory available!

@GroupXyz1

I've tried everything here, but all im getting is the RuntimeError: Could not allocate tensor with 192397120 bytes. There is not enough GPU video memory available!

Yeah, I have no idea either how to fix that broken code.
Can someone please just look over this? I mean, man, it's GitHub; it can't be that no one has time.

@Dino-Kupinic

(quoting @GroupXyz1's --use-split-cross-attention suggestion from above)

Did not work for me. VRAM issue remains for me. 6700 XT

@GroupXyz1

@jamesbychance you are installing torch rather than running the program; you need to use the argument on the line that starts Fooocus, like this:

.\python_embeded\python.exe -s Fooocus\entry_with_update.py --directml --use-split-cross-attention
pause

What you are trying is installing torch with that argument, and that won't work because it's something different from starting Fooocus; you may have accidentally copied the wrong line.

@Crunch91

AMD 6700XT

My run.bat looks like this:

.\python_embeded\python.exe -m pip uninstall torch torchvision torchaudio torchtext functorch xformers -y
.\python_embeded\python.exe -m pip install torch-directml
.\python_embeded\python.exe -s Fooocus\entry_with_update.py --directml --use-split-cross-attention
pause

I still get this error:

.\python_embeded\python.exe -s Fooocus\entry_with_update.py --directml --use-split-cross-attention

entry_with_update.py: error: unrecognized arguments: --use-split-cross-attention

@Alexand3r

@Crunch91 --use-split-cross-attention has been changed(?) to --attention-split

@Crunch91

@Crunch91 --use-split-cross-attention has been changed(?) to --attention-split

At least now I have the same error the others already mentioned 😄

RuntimeError: Could not allocate tensor with 299847680 bytes. There is not enough GPU video memory available!
Total time: 85.63 seconds

@mashb1t
Collaborator

mashb1t commented Dec 30, 2023

Good news everybody: As of 8e62a72 (latest Fooocus version 2.1.857) AMD with >= 8GB VRAM is now supported.
Please try with min. 8GB VRAM allocated and post if it now works for you. Thanks!

mashb1t added the bug (Something isn't working) and question (Further information is requested) labels Dec 30, 2023
@strobya

strobya commented Dec 30, 2023

@mashb1t Deleted old files and redownloaded

.\python_embeded\Lib\site-packages\torchsde\_brownian\brownian_interval.py

Line 32

generator = torch.Generator(device).manual_seed(int(seed))

to

generator = torch.Generator().manual_seed(int(seed))

then tried to allocate more VRAM:

Go to \Fooocus\backend\headless\fcbh\model_management.py

In line 95 change mem_total = 1024 * 1024 * 1024 to mem_total = 8192 * 1024 * 1024.

Now I'm getting this error:

C:\Users\mypc3\Downloads\Compressed\Ai>.\python_embeded\python.exe -m pip uninstall torch torchvision torchaudio torchtext functorch xformers -y
Found existing installation: torch 2.0.0
Uninstalling torch-2.0.0:
  Successfully uninstalled torch-2.0.0
Found existing installation: torchvision 0.15.1
Uninstalling torchvision-0.15.1:
  Successfully uninstalled torchvision-0.15.1
WARNING: Skipping torchaudio as it is not installed.
WARNING: Skipping torchtext as it is not installed.
WARNING: Skipping functorch as it is not installed.
WARNING: Skipping xformers as it is not installed.

C:\Users\mypc3\Downloads\Compressed\Ai>.\python_embeded\python.exe -m pip install torch-directml
Requirement already satisfied: torch-directml in c:\users\mypc3\downloads\compressed\ai\python_embeded\lib\site-packages (0.2.0.dev230426)
Collecting torch==2.0.0 (from torch-directml)
  Using cached torch-2.0.0-cp310-cp310-win_amd64.whl (172.3 MB)
Collecting torchvision==0.15.1 (from torch-directml)
  Using cached torchvision-0.15.1-cp310-cp310-win_amd64.whl (1.2 MB)
Requirement already satisfied: filelock in c:\users\mypc3\downloads\compressed\ai\python_embeded\lib\site-packages (from torch==2.0.0->torch-directml) (3.12.2)
Requirement already satisfied: typing-extensions in c:\users\mypc3\downloads\compressed\ai\python_embeded\lib\site-packages (from torch==2.0.0->torch-directml) (4.7.1)
Requirement already satisfied: sympy in c:\users\mypc3\downloads\compressed\ai\python_embeded\lib\site-packages (from torch==2.0.0->torch-directml) (1.12)
Requirement already satisfied: networkx in c:\users\mypc3\downloads\compressed\ai\python_embeded\lib\site-packages (from torch==2.0.0->torch-directml) (3.1)
Requirement already satisfied: jinja2 in c:\users\mypc3\downloads\compressed\ai\python_embeded\lib\site-packages (from torch==2.0.0->torch-directml) (3.1.2)
Requirement already satisfied: numpy in c:\users\mypc3\downloads\compressed\ai\python_embeded\lib\site-packages (from torchvision==0.15.1->torch-directml) (1.23.5)
Requirement already satisfied: requests in c:\users\mypc3\downloads\compressed\ai\python_embeded\lib\site-packages (from torchvision==0.15.1->torch-directml) (2.31.0)
Requirement already satisfied: pillow!=8.3.*,>=5.3.0 in c:\users\mypc3\downloads\compressed\ai\python_embeded\lib\site-packages (from torchvision==0.15.1->torch-directml) (9.2.0)
Requirement already satisfied: MarkupSafe>=2.0 in c:\users\mypc3\downloads\compressed\ai\python_embeded\lib\site-packages (from jinja2->torch==2.0.0->torch-directml) (2.1.3)
Requirement already satisfied: charset-normalizer<4,>=2 in c:\users\mypc3\downloads\compressed\ai\python_embeded\lib\site-packages (from requests->torchvision==0.15.1->torch-directml) (3.1.0)
Requirement already satisfied: idna<4,>=2.5 in c:\users\mypc3\downloads\compressed\ai\python_embeded\lib\site-packages (from requests->torchvision==0.15.1->torch-directml) (3.4)
Requirement already satisfied: urllib3<3,>=1.21.1 in c:\users\mypc3\downloads\compressed\ai\python_embeded\lib\site-packages (from requests->torchvision==0.15.1->torch-directml) (2.0.3)
Requirement already satisfied: certifi>=2017.4.17 in c:\users\mypc3\downloads\compressed\ai\python_embeded\lib\site-packages (from requests->torchvision==0.15.1->torch-directml) (2023.5.7)
Requirement already satisfied: mpmath>=0.19 in c:\users\mypc3\downloads\compressed\ai\python_embeded\lib\site-packages (from sympy->torch==2.0.0->torch-directml) (1.3.0)
DEPRECATION: torchsde 0.2.5 has a non-standard dependency specifier numpy>=1.19.*; python_version >= "3.7". pip 23.3 will enforce this behaviour change. A possible replacement is to upgrade to a newer version of torchsde or contact the author to suggest that they release a version with a conforming dependency specifiers. Discussion can be found at https://github.com/pypa/pip/issues/12063
Installing collected packages: torch, torchvision
  WARNING: The scripts convert-caffe2-to-onnx.exe, convert-onnx-to-caffe2.exe and torchrun.exe are installed in 'C:\Users\mypc3\Downloads\Compressed\Ai\python_embeded\Scripts' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
Successfully installed torch-2.0.0 torchvision-0.15.1

[notice] A new release of pip is available: 23.2.1 -> 23.3.2
[notice] To update, run: C:\Users\mypc3\Downloads\Compressed\Ai\python_embeded\python.exe -m pip install --upgrade pip

C:\Users\mypc3\Downloads\Compressed\Ai>.\python_embeded\python.exe -s Fooocus\entry_with_update.py --directml
Already up-to-date
Update succeeded.
[System ARGV] ['Fooocus\\entry_with_update.py', '--directml']
Python 3.10.9 (tags/v3.10.9:1dd9be6, Dec  6 2022, 20:01:21) [MSC v.1934 64 bit (AMD64)]
Fooocus version: 2.1.857
Running on local URL:  http://127.0.0.1:7865

To create a public link, set `share=True` in `launch()`.
Using directml with device:
Total VRAM 8192 MB, total RAM 32655 MB
Set vram state to: NORMAL_VRAM
Always offload VRAM
Device: privateuseone
VAE dtype: torch.float32
Using sub quadratic optimization for cross attention, if you have memory or speed issues try using: --attention-split
Refiner unloaded.
model_type EPS
UNet ADM Dimension 2816
Using split attention in VAE
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
Using split attention in VAE
extra {'cond_stage_model.clip_l.logit_scale', 'cond_stage_model.clip_g.transformer.text_model.embeddings.position_ids', 'cond_stage_model.clip_l.text_projection'}
Base model loaded: C:\Users\mypc3\Downloads\Compressed\Ai\Fooocus\models\checkpoints\juggernautXL_version6Rundiffusion.safetensors
Request to load LoRAs [['sd_xl_offset_example-lora_1.0.safetensors', 0.1], ['None', 1.0], ['None', 1.0], ['None', 1.0], ['None', 1.0]] for model [C:\Users\mypc3\Downloads\Compressed\Ai\Fooocus\models\checkpoints\juggernautXL_version6Rundiffusion.safetensors].
Loaded LoRA [C:\Users\mypc3\Downloads\Compressed\Ai\Fooocus\models\loras\sd_xl_offset_example-lora_1.0.safetensors] for UNet [C:\Users\mypc3\Downloads\Compressed\Ai\Fooocus\models\checkpoints\juggernautXL_version6Rundiffusion.safetensors] with 788 keys at weight 0.1.
Fooocus V2 Expansion: Vocab with 642 words.
Fooocus Expansion engine loaded for cpu, use_fp16 = False.
Requested to load SDXLClipModel
Requested to load GPT2LMHeadModel
Loading 2 new models
App started successful. Use the app with http://127.0.0.1:7865/ or 127.0.0.1:7865
[Parameters] Adaptive CFG = 7
[Parameters] Sharpness = 2
[Parameters] ADM Scale = 1.5 : 0.8 : 0.3
[Parameters] CFG = 4.0
[Parameters] Seed = 8416032143536669622
[Parameters] Sampler = dpmpp_2m_sde_gpu - karras
[Parameters] Steps = 30 - 15
[Fooocus] Initializing ...
[Fooocus] Loading models ...
Refiner unloaded.
[Fooocus] Processing prompts ...
[Fooocus] Preparing Fooocus text #1 ...
[Prompt Expansion] man on the moon, intricate, elegant, highly detailed, extremely colorful, warm light, cinematic, dramatic ambient, professional, artistic, sharp focus, fair composition, clear, crisp, dynamic, full color, modern, sleek, amazing, symmetry, great striking, perfect, epic, best,, background, fine detail, created, sunny, futuristic, marvelous, thought
[Fooocus] Preparing Fooocus text #2 ...
[Prompt Expansion] man on the moon, futuristic, stunning, highly detailed, clear, sharp, perfect, focus, intricate, fine detail, elegant, dynamic light, agile, professional, cinematic, complex, glowing, amazing, symmetry, illuminated, color, coherent, vivid colors, ambient, beautiful, focused, pretty, attractive, epic, best, winning, dramatic, saturated, enhanced
[Fooocus] Encoding positive #1 ...
[Fooocus] Encoding positive #2 ...
[Fooocus] Encoding negative #1 ...
[Fooocus] Encoding negative #2 ...
[Parameters] Denoising Strength = 1.0
[Parameters] Initial Latent shape: Image Space (896, 1152)
Preparation time: 18.15 seconds
[Sampler] refiner_swap_method = joint
[Sampler] sigma_min = 0.0291671771556139, sigma_max = 14.614643096923828
Requested to load SDXL
Loading 1 new model
loading in lowvram mode 64.0
[Fooocus Model Management] Moving model(s) has taken 11.73 seconds
  0%|                                                                                           | 0/30 [00:00<?, ?it/s]
C:\Users\mypc3\Downloads\Compressed\Ai>pause
Press any key to continue . . .

@lllyasviel
Owner

@strobya https://github.com/lllyasviel/Fooocus/blob/main/troubleshoot.md

@mashb1t
Collaborator

mashb1t commented Dec 30, 2023

@strobya if you still have backend\headless\fcbh, you're not using the latest version of Fooocus, where fcbh has been replaced by ldm_patched.
Please make sure to pull all changes of the new version before you start testing.

@TimH1502

For me the latest version works.

8 GB VRAM, 32 GB RAM, and an RX 5700

@strobya

strobya commented Dec 31, 2023

@mashb1t my bad, that was a typo; it should've been:

\Fooocus\ldm_patched\modules

In line 95 change mem_total = 1024 * 1024 * 1024 to mem_total = 8192 * 1024 * 1024.

Now it's working, but I get an error halfway through:

On the app:
"Error
Connection errored out."

In run.bat:
"C:\Users\mypc3\Downloads\Compressed\Ai>pause
Press any key to continue . . ."

I tried the troubleshooting option provided by @lllyasviel (system swap) and I'm still getting the same error mentioned above.

@strobya

strobya commented Dec 31, 2023

@mashb1t @lllyasviel Never mind, I restarted my PC and it worked. Thanks guys 👊😎

mashb1t closed this as completed Dec 31, 2023