
Forcing Torch Version to 1.13.1 for RX 5000 series GPUs #11048

Merged
6 commits merged into AUTOMATIC1111:dev on Jun 9, 2023

Conversation

@DGdev91 (Contributor) commented Jun 5, 2023

Description

As described in the discussion under #10465, this is an attempt to make the WebUI work again on older AMD cards like the RX 5000 series.

The code forces TORCH_COMMAND for Navi and Renoir GPUs.

Also, the older PyTorch version requires Python <= 3.10. I added a check for it, so the user gets a cleaner error message when trying to run it on Python 3.11 or later.
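A minimal sketch of what that forcing looks like, assuming the gpu_info/lspci detection webui.sh already performs; the pins match the last torch 1.13.1 ROCm wheels, but this is illustrative, not the exact merged diff:

```bash
# Sketch: pin torch for Navi 1 cards (gpu_info is assumed to be
# filled from lspci earlier in the script)
if echo "$gpu_info" | grep -q "Navi 1"
then
    export TORCH_COMMAND="pip install torch==1.13.1+rocm5.2 torchvision==0.14.1+rocm5.2 --index-url https://download.pytorch.org/whl/rocm5.2"
fi
```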

This is still intended as a workaround, to be removed as soon as someone finds a way to run the webui properly on these cards.

Should fix #10873


@GhostNaN commented Jun 6, 2023

Why do you want to force the TORCH_COMMAND for Navi 2 cards if your issue is with Navi 1?
Also, I've had zero issues running torch >= 2.0 with ROCm 5.4.2 and Python >= 3.11 on an RX 6700 XT, as long as HSA_OVERRIDE_GFX_VERSION=10.3.0 is specified.
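(For anyone trying this: the override is just an environment variable read by the ROCm runtime, so a typical launch looks like the line below; 10.3.0 maps the card onto the officially supported gfx1030 target.)

```bash
# RX 6700 XT reports gfx1031; spoofing gfx1030 lets ROCm use its prebuilt kernels
HSA_OVERRIDE_GFX_VERSION=10.3.0 ./webui.sh
```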

@AUTOMATIC1111 (Owner)

Use the python_cmd env var for python. Also, I would prefer that the same code not be copied twice, and that running python to get the version be deferred until it's actually needed.
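A rough sketch of that suggestion; python_cmd is the variable the launcher already exposes, while the helper name and the rest are made up for illustration:

```bash
# Hypothetical helper: defined once, so the query isn't copied twice,
# and invoked only by the branch that actually needs the version
get_python_version() {
    "${python_cmd}" -c 'import sys; print("%d.%d" % sys.version_info[:2])'
}

if echo "$gpu_info" | grep -q "Navi 1"
then
    pyv=$(get_python_version)   # python runs only on this code path
    echo "Python ${pyv} detected on a Navi 1 card"
fi
```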

@DGdev91 DGdev91 changed the title Forcing Torch Version to 1.13.1 for Navi and Renoir GPUs Forcing Torch Version to 1.13.1 for Navi 1 (RX 5000 series) GPUs Jun 6, 2023
@DGdev91 (Contributor, Author) commented Jun 6, 2023

Ok, I was thinking HSA_OVERRIDE_GFX_VERSION was causing the issue, but there must be something else going on.
Now I'm applying the code only for Navi 1 and using python_cmd as suggested.

@DGdev91 DGdev91 changed the title Forcing Torch Version to 1.13.1 for Navi 1 (RX 5000 series) GPUs Forcing Torch Version to 1.13.1 for RX 5000 series GPUs Jun 6, 2023
@AUTOMATIC1111 (Owner) commented Jun 7, 2023

Any more feedback from AMD users?

@cyatarow commented Jun 7, 2023

I'm the author of #10855.
Out of curiosity, why is the RX 5000 series so incompatible with torch 2.0?

@DGdev91 (Contributor, Author) commented Jun 7, 2023

I'm the author of #10855. Out of curiosity, why is the RX 5000 series so incompatible with torch 2.0?

That's a good question. My first guess was that HSA_OVERRIDE_GFX_VERSION was causing problems, but that's also true for the RX 6000 series, which is working just fine.

Sooo.... Who knows.

We can't even be really sure it's just the RX 5000 series; maybe other series have problems too, but no one has reported them yet.

@GabrielDTB

RX 5700 user here, on Arch Linux with ROCm 5.4.3. CPU generation worked fine, but when using the GPU, generation would make zero progress: the CLI output gave nothing, and the webui rightly reported that nothing was happening. Radeontop showed the Shader Interpolator pegged at 100%, the Memory Clock fluctuating between 0 and 100%, and the Shader Clock pegged at 100%. VRAM sat at ~4500M with Stable Diffusion 1.5, iirc. Attempting to switch the Stable Diffusion checkpoint through the webui would tick up indefinitely with an expected time of 1.5s at the bottom.

Changing to Python 3.10.12 and applying this patch makes everything functional. Small note: I don't have bc installed, so I had to either cut out the check for Python 3.10 or install bc. There's probably a more portable solution; one idea is sketched below.
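One bc-free possibility (a sketch, not the merged code): encode major.minor as a plain integer and compare with POSIX shell arithmetic.

```bash
# 3.10 -> "310", 3.11 -> "311"; needs only the shell and python itself
pyv=$("${python_cmd}" -c 'import sys; print("%d%02d" % sys.version_info[:2])')
if [ "$pyv" -gt 310 ]; then
    printf 'Incompatible python version: torch 1.13.1 needs python <= 3.10\n' >&2
    exit 1
fi
```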

@AUTOMATIC1111 AUTOMATIC1111 merged commit 741bd71 into AUTOMATIC1111:dev Jun 9, 2023
@cyatarow

But... is there really no way to work around the issue other than pinning torch to 1.13.1?
Could it be that, since RDNA1 is not officially supported by ROCm, torch 2.0 was developed without any consideration for RDNA1?
