AMD 6700XT 12GB DML allocator out of memory. #835

Open
uzior opened this issue Nov 1, 2023 · 14 comments
Labels
bug (AMD): Something isn't working (AMD specific)

Comments

@uzior

uzior commented Nov 1, 2023

Hi!
I realize that AMD devices are still in the beta phase, but the problem I encountered seems relatively easy to solve, namely:

Everything starts correctly and the generation process begins, but once the allocated VRAM exceeds 7 GB, the system crashes.

Any suggestions would be appreciated.
Log below.

Already up-to-date
Update succeeded.
Python 3.10.9 (tags/v3.10.9:1dd9be6, Dec 6 2022, 20:01:21) [MSC v.1934 64 bit (AMD64)]
Fooocus version: 2.1.771
Running on local URL: http://127.0.0.1:7860

To create a public link, set share=True in launch().
Using directml with device:
Total VRAM 1024 MB, total RAM 32694 MB
Set vram state to: NORMAL_VRAM
Device: privateuseone
VAE dtype: torch.float32
Using sub quadratic optimization for cross attention, if you have memory or speed issues try using: --use-split-cross-attention
[Fooocus] Disabling smart memory
Refiner unloaded.
model_type EPS
adm 2816
Using split attention in VAE
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
Using split attention in VAE
missing {'cond_stage_model.clip_g.transformer.text_model.embeddings.position_ids', 'cond_stage_model.clip_l.logit_scale', 'cond_stage_model.clip_l.text_projection'}
Base model loaded: E:\VM\AI\Fooocus\models\checkpoints\juggernautXL_version6Rundiffusion.safetensors
LoRAs loaded: [('sd_xl_offset_example-lora_1.0.safetensors', 0.1), ('None', 0.1), ('None', 0.1), ('None', 0.1), ('None', 0.1)]
Fooocus Expansion engine loaded for cpu, use_fp16 = False.
Requested to load SDXLClipModel
Requested to load GPT2LMHeadModel
Loading 2 new models
App started successful. Use the app with http://127.0.0.1:7860/ or 127.0.0.1:7860
[Parameters] Adaptive CFG = 7
[Parameters] Sharpness = 1
[Parameters] ADM Scale = 1.5 : 0.8 : 0.3
[Parameters] CFG = 4.0
[Parameters] Seed = 139978393905596030
[Parameters] Sampler = dpmpp_2m_sde_gpu - karras
[Parameters] Steps = 30 - 24
[Fooocus] Initializing ...
[Fooocus] Loading models ...
Refiner unloaded.
[Fooocus] Processing prompts ...
[Fooocus] Encoding positive #1 ...
[Fooocus] Encoding positive #2 ...
[Fooocus] Encoding negative #1 ...
[Fooocus] Encoding negative #2 ...
Preparation time: 3.72 seconds
[Sampler] refiner_swap_method = joint
[Sampler] sigma_min = 0.02916753850877285, sigma_max = 14.614643096923828
Requested to load SDXL
Loading 1 new model
[W D:\a_work\1\s\pytorch-directml-plugin\torch_directml\csrc\engine\dml_heap_allocator.cc:120] DML allocator out of memory!
[W D:\a_work\1\s\pytorch-directml-plugin\torch_directml\csrc\engine\dml_heap_allocator.cc:120] DML allocator out of memory!

@uzior changed the title from "AMD 6700XT 12GB out of memory." to "AMD 6700XT 12GB DML allocator out of memory." on Nov 1, 2023
@lllyasviel added the bug (Something isn't working) label on Nov 5, 2023
@f-klement

I have a similar problem on my RX 6800S: the laptop runs out of memory and crashes on every run.

@lukechar

lukechar commented Dec 4, 2023

Same issue on my RX 6650 XT - "DML allocator out of memory!"

@PowerZones

Same on RX580 8gb

@jhoyocartes

Same on RX580 8gb too

@quoije

quoije commented Jan 5, 2024

Same issue with AMD 6700XT.

EDIT: Not anymore. I fixed my issue by setting the page file to automatic and making sure I had space available on my disk. Having the same GPU was coincidental and probably not related to the OP's issue.

Windows 11
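For reference, one common way to switch the page file back to "automatically managed" from an elevated Command Prompt is a wmic one-liner (a sketch; wmic is deprecated on newer Windows builds, and the same setting is available under System Properties > Advanced > Performance > Virtual memory):

:: Let Windows manage the page file size automatically (run as Administrator).
wmic computersystem where name="%computername%" set AutomaticManagedPagefile=True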

@Crunch91

Crunch91 commented Jan 7, 2024

Same here.
My specs:

6700 XT (12 GB)
16 GB RAM
24576 MB swap file
SSD
Windows

DML allocator out of memory!

@ptrkrnstnr

Same here...
7900 XTX (24 GB), 7800X3D, 32 GB RAM, 26624 MB swap file, SSD, Windows 10

@TobiWan-Kenobi

TobiWan-Kenobi commented Jan 11, 2024

Same issue here with Intel(R) UHD Graphics GPU 8GB ... Will I have any chance of running this with this kind of GPU at all? (Windows 11)

@OdinM13

OdinM13 commented Jan 17, 2024

I have the same issue with 32 GB RAM and a Radeon RX 6800 XT. The strange thing is that it worked properly before. For a few days I could generate as many pictures as I desired with all kinds of different settings, but now this is no longer possible and I don't know why.

@ptrkrnstnr

I have a solution: go back to version 2.1.851 and modify "run.bat" to disable all updates, like so:

:: .\python_embeded\python.exe -m pip uninstall torch torchvision torchaudio torchtext functorch xformers -y
:: .\python_embeded\python.exe -m pip install torch-directml
.\python_embeded\python.exe -s Fooocus\launch.py --directml
pause

You can download the files of v2.1.851 by selecting the right branch and copying the files over a fresh installation.

@lgwjames

I have a solution: go back to version 2.1.851 and modify "run.bat" to disable all updates, like so:

:: .\python_embeded\python.exe -m pip uninstall torch torchvision torchaudio torchtext functorch xformers -y
:: .\python_embeded\python.exe -m pip install torch-directml
.\python_embeded\python.exe -s Fooocus\launch.py --directml
pause

You can download the files of v2.1.851 by selecting the right branch and copying the files over a fresh installation.

Tried this today on a Boot Camp install with an RX 580 8 GB, and I get this:

Traceback (most recent call last):
  File "C:\Program Files\Fooocus\Fooocus\modules\async_worker.py", line 803, in worker
    handler(task)
  File "C:\Program Files\Fooocus\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "C:\Program Files\Fooocus\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "C:\Program Files\Fooocus\Fooocus\modules\async_worker.py", line 735, in handler
    imgs = pipeline.process_diffusion(
  File "C:\Program Files\Fooocus\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "C:\Program Files\Fooocus\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "C:\Program Files\Fooocus\Fooocus\modules\default_pipeline.py", line 361, in process_diffusion
    sampled_latent = core.ksampler(
  File "C:\Program Files\Fooocus\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "C:\Program Files\Fooocus\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "C:\Program Files\Fooocus\Fooocus\modules\core.py", line 313, in ksampler
    samples = ldm_patched.modules.sample.sample(model,
  File "C:\Program Files\Fooocus\Fooocus\ldm_patched\modules\sample.py", line 93, in sample
    real_model, positive_copy, negative_copy, noise_mask, models = prepare_sampling(model, noise.shape, positive, negative, noise_mask)
  File "C:\Program Files\Fooocus\Fooocus\ldm_patched\modules\sample.py", line 86, in prepare_sampling
    ldm_patched.modules.model_management.load_models_gpu([model] + models, model.memory_required([noise_shape[0] * 2] + list(noise_shape[1:])) + inference_memory)
  File "C:\Program Files\Fooocus\Fooocus\modules\patch.py", line 441, in patched_load_models_gpu
    y = ldm_patched.modules.model_management.load_models_gpu_origin(*args, **kwargs)
  File "C:\Program Files\Fooocus\Fooocus\ldm_patched\modules\model_management.py", line 414, in load_models_gpu
    cur_loaded_model = loaded_model.model_load(lowvram_model_memory)
  File "C:\Program Files\Fooocus\Fooocus\ldm_patched\modules\model_management.py", line 297, in model_load
    raise e
  File "C:\Program Files\Fooocus\Fooocus\ldm_patched\modules\model_management.py", line 293, in model_load
    self.real_model = self.model.patch_model(device_to=patch_model_to) #TODO: do something with loras and offloading to CPU
  File "C:\Program Files\Fooocus\Fooocus\ldm_patched\modules\model_patcher.py", line 198, in patch_model
    temp_weight = ldm_patched.modules.model_management.cast_to_device(weight, device_to, torch.float32, copy=True)
  File "C:\Program Files\Fooocus\Fooocus\ldm_patched\modules\model_management.py", line 587, in cast_to_device
    return tensor.to(device, copy=copy, non_blocking=non_blocking).to(dtype, non_blocking=non_blocking)
RuntimeError: Could not allocate tensor with 52428800 bytes. There is not enough GPU video memory available!
Total time: 13.93 seconds

@patientx

Strangely, while I am not getting any memory errors (8 GB 6600), my friend does (16 GB 6800 XT). Both of us have fast NVMe drives as swap drives and 16 GB of system memory.

@mashb1t added the bug (AMD) (Something isn't working, AMD specific) label and removed the bug (Something isn't working) label on Feb 22, 2024
@Lysuwel

Lysuwel commented Jul 6, 2024

It may be running out of system memory, not GPU memory. I added some virtual memory (a larger swap file), and then it worked!
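To check how much page file Windows has actually allocated, a quick sketch from a Command Prompt (property names follow the Win32_PageFileUsage WMI class; sizes are reported in MB):

:: Show each page file with its allocated size and current usage (MB).
wmic pagefile get Name,AllocatedBaseSize,CurrentUsage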

@gasperpb

python main.py --directml --use-split-cross-attention --lowvram
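For the standalone Windows package, the same idea would go into run.bat roughly like this (a sketch; the entry script name and exact flag spellings vary between Fooocus versions, so treat these flags only as the ones mentioned in this thread and in the startup log):

:: Hypothetical low-VRAM run.bat for the standalone package; flags taken from this thread.
.\python_embeded\python.exe -s Fooocus\launch.py --directml --use-split-cross-attention --lowvram
pause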
