AMD Windows Run Error #1263

Closed
NakanoSanku opened this issue Dec 7, 2023 · 5 comments

Labels
duplicate This issue or pull request already exists

Comments

@NakanoSanku

Describe the problem
Error
Full Console Log

(base) PS C:\Users\KateT\Downloads\Compressed\Fooocus_win64_2-1-791> .\run.bat

C:\Users\KateT\Downloads\Compressed\Fooocus_win64_2-1-791>.\python_embeded\python.exe -s Fooocus\entry_with_update.py --directml
Already up-to-date
Update succeeded.
[System ARGV] ['Fooocus\\entry_with_update.py', '--directml']
Python 3.10.9 (tags/v3.10.9:1dd9be6, Dec  6 2022, 20:01:21) [MSC v.1934 64 bit (AMD64)]
Fooocus version: 2.1.824
Running on local URL:  http://127.0.0.1:7865

To create a public link, set `share=True` in `launch()`.
Using directml with device:
Total VRAM 1024 MB, total RAM 16253 MB
Set vram state to: NORMAL_VRAM
Disabling smart memory management
Device: privateuseone
VAE dtype: torch.float32
Using sub quadratic optimization for cross attention, if you have memory or speed issues try using: --use-split-cross-attention
Refiner unloaded.
model_type EPS
adm 2816
Using split attention in VAE
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
Using split attention in VAE
extra keys {'cond_stage_model.clip_g.transformer.text_model.embeddings.position_ids', 'cond_stage_model.clip_l.text_projection', 'cond_stage_model.clip_l.logit_scale'}
Base model loaded: C:\Users\KateT\Downloads\Compressed\Fooocus_win64_2-1-791\Fooocus\models\checkpoints\juggernautXL_version6Rundiffusion.safetensors
Request to load LoRAs [['sd_xl_offset_example-lora_1.0.safetensors', 0.1], ['None', 1.0], ['None', 1.0], ['None', 1.0], ['None', 1.0]] for model [C:\Users\KateT\Downloads\Compressed\Fooocus_win64_2-1-791\Fooocus\models\checkpoints\juggernautXL_version6Rundiffusion.safetensors].
Loaded LoRA [C:\Users\KateT\Downloads\Compressed\Fooocus_win64_2-1-791\Fooocus\models\loras\sd_xl_offset_example-lora_1.0.safetensors] for UNet [C:\Users\KateT\Downloads\Compressed\Fooocus_win64_2-1-791\Fooocus\models\checkpoints\juggernautXL_version6Rundiffusion.safetensors] with 788 keys at weight 0.1.
Fooocus V2 Expansion: Vocab with 642 words.
Fooocus Expansion engine loaded for cpu, use_fp16 = False.
Requested to load SDXLClipModel
Requested to load GPT2LMHeadModel
Loading 2 new models
App started successful. Use the app with http://127.0.0.1:7865/ or 127.0.0.1:7865
[Parameters] Adaptive CFG = 7
[Parameters] Sharpness = 2
[Parameters] ADM Scale = 1.5 : 0.8 : 0.3
[Parameters] CFG = 4.0
[Parameters] Seed = 1625407263499467761
[Parameters] Sampler = dpmpp_2m_sde_gpu - karras
[Parameters] Steps = 30 - 15
[Fooocus] Initializing ...
[Fooocus] Loading models ...
Refiner unloaded.
[Fooocus] Processing prompts ...
[Fooocus] Preparing Fooocus text #1 ...
[Prompt Expansion] one girl, light cute detailed perfect formal, delicate, charming, pretty, background composed beautiful intricate, amazing composition, highly saturated colors, elegant, sharp focus, professional, enhanced quality, fine detail, joyful, epic, stunning, gorgeous, creative, positive, artistic, loving, marvelous, pure, shiny, brilliant, bright, polished, complex, awesome, colorful, flowing
[Fooocus] Preparing Fooocus text #2 ...
[Prompt Expansion] one girl, full focus, bright glowing, intricate, elegant, highly detailed, beautiful, delicate, atmosphere, fancy, sharp detail, cinematic, illuminated, amazing, symmetry, deep light, translucent, very coherent, cute, ambient background, iconic, fine balanced colors, perfect composition, artistic, innocent, sublime, complex, determined, extremely inspirational, intriguing, cheerful, creative
[Fooocus] Encoding positive #1 ...
[Fooocus] Encoding positive #2 ...
[Fooocus] Encoding negative #1 ...
[Fooocus] Encoding negative #2 ...
[Parameters] Denoising Strength = 1.0
[Parameters] Initial Latent shape: Image Space (896, 1152)
Preparation time: 10.80 seconds
[Sampler] refiner_swap_method = joint
[Sampler] sigma_min = 0.0291671771556139, sigma_max = 14.614643096923828
Traceback (most recent call last):
  File "C:\Users\KateT\Downloads\Compressed\Fooocus_win64_2-1-791\Fooocus\modules\async_worker.py", line 803, in worker
    handler(task)
  File "C:\Users\KateT\Downloads\Compressed\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "C:\Users\KateT\Downloads\Compressed\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "C:\Users\KateT\Downloads\Compressed\Fooocus_win64_2-1-791\Fooocus\modules\async_worker.py", line 735, in handler
    imgs = pipeline.process_diffusion(
  File "C:\Users\KateT\Downloads\Compressed\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "C:\Users\KateT\Downloads\Compressed\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "C:\Users\KateT\Downloads\Compressed\Fooocus_win64_2-1-791\Fooocus\modules\default_pipeline.py", line 354, in process_diffusion
    modules.patch.BrownianTreeNoiseSamplerPatched.global_init(
  File "C:\Users\KateT\Downloads\Compressed\Fooocus_win64_2-1-791\Fooocus\modules\patch.py", line 173, in global_init
    BrownianTreeNoiseSamplerPatched.tree = BatchedBrownianTree(x, t0, t1, seed, cpu=cpu)
  File "C:\Users\KateT\Downloads\Compressed\Fooocus_win64_2-1-791\Fooocus\backend\headless\fcbh\k_diffusion\sampling.py", line 85, in __init__
    self.trees = [torchsde.BrownianTree(t0, w0, t1, entropy=s, **kwargs) for s in seed]
  File "C:\Users\KateT\Downloads\Compressed\Fooocus_win64_2-1-791\Fooocus\backend\headless\fcbh\k_diffusion\sampling.py", line 85, in <listcomp>
    self.trees = [torchsde.BrownianTree(t0, w0, t1, entropy=s, **kwargs) for s in seed]
  File "C:\Users\KateT\Downloads\Compressed\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\torchsde\_brownian\derived.py", line 155, in __init__
    self._interval = brownian_interval.BrownianInterval(t0=t0,
  File "C:\Users\KateT\Downloads\Compressed\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\torchsde\_brownian\brownian_interval.py", line 540, in __init__
    W = self._randn(initial_W_seed) * math.sqrt(t1 - t0)
  File "C:\Users\KateT\Downloads\Compressed\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\torchsde\_brownian\brownian_interval.py", line 234, in _randn
    return _randn(size, self._top._dtype, self._top._device, seed)
  File "C:\Users\KateT\Downloads\Compressed\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\torchsde\_brownian\brownian_interval.py", line 32, in _randn
    generator = torch.Generator(device).manual_seed(int(seed))
RuntimeError: Device type privateuseone is not supported for torch.Generator() api.
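For context, the failure is in torchsde's seeded noise setup: it calls torch.Generator(device) with the DirectML device ("privateuseone"), which PyTorch's generator API does not support. A minimal sketch of the usual workaround follows; this illustrates the general technique, not the actual Fooocus patch, and seeded_randn is a hypothetical name:

```python
# Sketch of the common DirectML workaround: torch.Generator() rejects the
# "privateuseone" device, so seed the RNG on the CPU and move the noise over.
import torch

def seeded_randn(size, dtype, device, seed):
    # Hypothetical helper mirroring torchsde's internal _randn() in
    # torchsde/_brownian/brownian_interval.py (shown in the traceback above).
    generator = torch.Generator("cpu").manual_seed(int(seed))  # always CPU
    noise = torch.randn(size, dtype=dtype, generator=generator)
    return noise.to(device)  # move to the DirectML device afterwards
```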
@ItsNoted

ItsNoted commented Dec 7, 2023

Have a look here: #763

@grendahl06

From following a few of these threads, I saw a hint to pass the --lowvram flag when starting the bat file.

When I did, my CPU (an AMD 7950) spiked to max and my 32 GB of RAM spiked as well. The computer was unresponsive for ~72 seconds and then gave the following error:

File "D:\Fooocus_win64\python_embeded\lib\site-packages\torch\cuda\__init__.py", line 239, in _lazy_init
    raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled
Total time: 72.06 seconds
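For reference, that assertion fires whenever a code path touches torch.cuda on a PyTorch build compiled without CUDA support, which the embedded build here is. A quick diagnostic sketch, assuming the standard torch-directml package that --directml relies on, shows what the interpreter actually supports:

```python
# Diagnostic sketch: check which backends this PyTorch build supports.
import torch

print(torch.__version__)
print("CUDA available:", torch.cuda.is_available())  # False on this build

try:
    import torch_directml  # the package behind the --directml flag
    print("DirectML device:", torch_directml.device())
except ImportError:
    print("torch-directml is not installed")
```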

Can anyone advise on what I need to do or try next? My GPU is a Radeon 6650 XT, and it showed no usage during the execution of the prompt.

Thank you in advance

@NakanoSanku
Author

(The author quoted @grendahl06's comment above in a Chinese machine translation; see that comment for the text.)

I haven't checked the source code yet, but I would guess the model loader uses the maximum number of threads at startup. I don't know why it needs to load so quickly; this can easily cause the computer to freeze.
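If the startup spike really is the loader saturating every core, one quick experiment (hedged; I haven't confirmed where Fooocus's loader spends its time) is to cap PyTorch's thread pools before anything loads:

```python
# Experiment, not a confirmed fix: cap PyTorch's thread pools so model loading
# cannot saturate every core at once. Must run before other torch work starts.
import torch

torch.set_num_threads(4)          # intra-op parallelism (per-op worker threads)
torch.set_num_interop_threads(4)  # inter-op parallelism (parallel op execution)
```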

@grendahl06

grendahl06 commented Dec 10, 2023

I reverted the memory limit back to 1024 * 1024 * 1024. The memory usage no longer nearly crashes my PC.

In model_patcher.py I changed line 191 to match the else branch, but I still get an exception where this code fails to allocate memory from the GPU:

File "D:\Fooocus_win64\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1143, in convert
return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
RuntimeError: Could not allocate tensor with 6553600 bytes. There is not enough GPU video memory available!
Total time: 26.84 seconds

I also noticed that as soon as the prompt clears the RAM (which still spikes to all 32 GB in use), the program allocates all remaining GPU memory just before claiming there is none left. Some part of the code (sorry, Python isn't my language) appears to allocate all available memory and then not use it, rather than either using what it has or simply requesting what it needs.
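To make the idea concrete, here is a hypothetical sketch of the kind of cap described above; LOWVRAM_BUDGET and gpu_budget are illustrative names, not the actual model_patcher.py code:

```python
# Hypothetical sketch of a VRAM budget cap: bound how much is moved to the GPU
# at load time instead of letting the loader claim all remaining memory at once.
LOWVRAM_BUDGET = 1024 * 1024 * 1024  # 1 GiB, the value reverted to above

def gpu_budget(total_vram_bytes, reserve=0.2):
    # Keep a safety margin so later allocations (VAE, attention buffers)
    # don't immediately hit "not enough GPU video memory".
    usable = int(total_vram_bytes * (1 - reserve))
    return min(usable, LOWVRAM_BUDGET)
```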

@mashb1t
Collaborator

mashb1t commented Dec 29, 2023

Duplicate of #1278 and #1304

@mashb1t added the duplicate label Dec 29, 2023
@mashb1t closed this as not planned (duplicate) Dec 29, 2023