Allow explicitly selecting xFormers at install time #7065
Conversation
I feel like whether we use xformers should be controlled via the config rather than by whether xformers is installed. Is there a reason for doing it this way?
I think we can handle choosing the correct attention type at runtime:
diff --git a/invokeai/backend/stable_diffusion/diffusers_pipeline.py b/invokeai/backend/stable_diffusion/diffusers_pipeline.py
index 646e1a92d..06a66f64b 100644
--- a/invokeai/backend/stable_diffusion/diffusers_pipeline.py
+++ b/invokeai/backend/stable_diffusion/diffusers_pipeline.py
@@ -195,7 +195,14 @@ class StableDiffusionGeneratorPipeline(StableDiffusionPipeline):
# the remainder if this code is called when attention_type=='auto'
if self.unet.device.type == "cuda":
- if is_xformers_available():
+ # On 30xx and 40xx series GPUs, `torch-sdp` is faster than `xformers`. This corresponds to a CUDA major
+ # version of 8 or higher. So, for major version 7 or below, we prefer `xformers`.
+ # See:
+ # - https://developer.nvidia.com/cuda-gpus
+ # - https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#compute-capabilities
+ prefer_xformers = torch.cuda.get_device_properties("cuda").major <= 7
+
+ if is_xformers_available() and prefer_xformers:
self.enable_xformers_memory_efficient_attention()
return
         elif hasattr(torch.nn.functional, "scaled_dot_product_attention"):
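The capability cutoff from the diff above can be exercised on its own. This is a sketch, not code from the PR: the helper name `prefer_xformers_for` and the pure-function form are mine; in a real CUDA environment the major version would come from `torch.cuda.get_device_properties("cuda").major` as the diff shows.

```python
def prefer_xformers_for(compute_capability_major: int) -> bool:
    """Prefer xformers only on GPUs with CUDA compute capability 7.x or
    below (20xx series and older). On 8.x+ (30xx/40xx series), torch-sdp
    is faster, so xformers should be skipped there.
    """
    return compute_capability_major <= 7

# In a CUDA environment the input would be obtained at runtime, e.g.:
#   torch.cuda.get_device_properties("cuda").major

print(prefer_xformers_for(7))  # 20xx-class GPU: True
print(prefer_xformers_for(8))  # 30xx-class GPU: False
```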
This was a decision made in a Slack discussion, but I'm happy to do it whichever way is preferable. I think not having xformers installed at all is the safest choice, because we can't control which 3rd-party code might use it if detected. I like the runtime decision approach suggested by @psychedelicious. I can easily add that to the PR, but I also think we shouldn't be installing it unless necessary.
I added the suggested runtime change in the latest commit.
I guess if, for some reason, a person with a new card wanted to use xformers, they'd have to download it manually 🤔 That said, I am a fan of not installing something that won't be used the vast majority of the time.
There's no good reason for someone with a modern GPU to use xformers though, is there? As much as I'd like to accommodate that individual who swaps their newer (30xx+) GPU for an older (20xx-) GPU and wants to continue seamlessly using Invoke with the same performance profile... I'm not convinced that's an edge case we really need to account for ;)
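The "if detected" concern above boils down to an import-availability probe: third-party code typically checks whether the package can be imported at all. A minimal stdlib-only sketch of that kind of check (the helper name is hypothetical; diffusers' `is_xformers_available()` performs a similar probe):

```python
import importlib.util


def package_available(name: str) -> bool:
    # A package counts as "available" if an import spec can be found for it;
    # code probing for xformers this way will detect, and may use, any
    # installed copy, regardless of what the application config says.
    return importlib.util.find_spec(name) is not None


print(package_available("json"))      # stdlib module, always present
print(package_available("xformers"))  # True only if xformers is installed
```

This is why not installing xformers at all is the only way to guarantee no code path picks it up.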
Summary
A recent Torch upgrade introduced a performance regression on modern Nvidia GPUs when using xformers.
This PR adds an install option to select xFormers for the older GPUs that need it, and to avoid it on newer cards that can use torch-sdp or other attention types. This PR also points torch to the CUDA v12.4 extra-index-url for CPU-only Docker builds (not related to the xformers issue).
QA Instructions
The updated installer, built from this PR, is attached here. Please test it on Windows and macOS for completeness. On the Mac it should simply proceed with the installation without displaying the GPU selection menu, and should result in a working install.
InvokeAI-installer-v5.1.0.zip
Run install.sh and select the xFormers option, then verify that xformers is installed (pip freeze | grep xformers).
Run install.sh with the default option and verify that xformers is NOT installed.
Merge Plan
Merge anytime
Checklist