
Allow explicitly selecting xFormers at install time #7065

Merged
4 commits merged into main on Oct 10, 2024

Conversation

ebr
Member

@ebr ebr commented Oct 8, 2024

Summary

A recent Torch upgrade introduced a performance regression on modern Nvidia GPUs when using xformers.
This PR adds an install option that selects xFormers for the older GPUs where it is needed, while avoiding it on newer cards that can use torch-sdp or other attention types.

This PR also points torch to the CUDA v12.4 extra-index-url for cpu-only Docker builds (not related to xformers issues).

QA Instructions

The updated installer, built from this PR, is attached here. Please test it on Windows and macOS for completeness. On macOS it should simply proceed with the installation without displaying the GPU selection menu, and should result in a working install.

InvokeAI-installer-v5.1.0.zip

  1. Unzip the installer and run it using install.sh
  2. Install using option 2 (older nvidia GPU with xformers).
  3. Activate the virtual environment and validate that xformers is installed (pip freeze | grep xformers)
  4. Repeat this with option 1, and validate that xformers is NOT installed.
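The validation in steps 3 and 4 can also be done from inside the virtual environment's Python, which avoids shell differences between Windows and macOS. A minimal sketch; the `xformers_installed` helper is illustrative and not part of the installer:

```python
import importlib.util


def xformers_installed() -> bool:
    """Return True if the xformers package is importable in this environment.

    find_spec() only locates the package's metadata; it does not import
    xformers, so this check is safe to run even on machines without CUDA.
    """
    return importlib.util.find_spec("xformers") is not None


# After installing with option 2 this should report True;
# after installing with option 1 it should report False.
print(xformers_installed())
```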

Merge Plan

Merge anytime

Checklist

  • The PR has a short but descriptive title, suitable for a changelog

@github-actions github-actions bot added docker installer PRs that change the installer labels Oct 8, 2024
@RyanJDick
Collaborator

I feel like whether we use xformers should be controlled via the config rather than whether xformers is installed. Is there a reason for doing it this way?
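For reference, a config-driven toggle would presumably use the existing `attention_type` setting that the runtime code already checks (the diff below tests `attention_type == 'auto'`). A hypothetical `invokeai.yaml` fragment, with illustrative values:

```yaml
# invokeai.yaml (fragment) -- illustrative only.
# "auto" lets the runtime pick an attention implementation;
# "xformers" or "torch-sdp" would force a specific backend.
attention_type: auto
```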

@psychedelicious
Collaborator

I think we can handle choosing the correct attention type at runtime:

diff --git a/invokeai/backend/stable_diffusion/diffusers_pipeline.py b/invokeai/backend/stable_diffusion/diffusers_pipeline.py
index 646e1a92d..06a66f64b 100644
--- a/invokeai/backend/stable_diffusion/diffusers_pipeline.py
+++ b/invokeai/backend/stable_diffusion/diffusers_pipeline.py
@@ -195,7 +195,14 @@ class StableDiffusionGeneratorPipeline(StableDiffusionPipeline):
 
         # the remainder if this code is called when attention_type=='auto'
         if self.unet.device.type == "cuda":
-            if is_xformers_available():
+            # On 30xx and 40xx series GPUs, `torch-sdp` is faster than `xformers`. This corresponds to a CUDA major
+            # version of 8 or higher. So, for major version 7 or below, we prefer `xformers`.
+            # See:
+            # - https://developer.nvidia.com/cuda-gpus
+            # - https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#compute-capabilities
+            prefer_xformers = torch.cuda.get_device_properties("cuda").major <= 7
+
+            if is_xformers_available() and prefer_xformers:
                 self.enable_xformers_memory_efficient_attention()
                 return
             elif hasattr(torch.nn.functional, "scaled_dot_product_attention"):

@ebr
Member Author

ebr commented Oct 9, 2024

I feel like whether we use xformers should be controlled via the config rather than whether xformers is installed. Is there a reason for doing it this way?

This was a decision made in a Slack discussion, but I'm happy to do it whichever way preferable. I think not having xformers installed at all is the safest choice, because we can't control which 3rd-party code might use it if detected.

I like the runtime decision approach suggested by @psychedelicious. Can easily add that to the PR, but also think we shouldn't be installing it unless necessary.

@github-actions github-actions bot added python PRs that change python files backend PRs that change backend files labels Oct 9, 2024
@ebr
Member Author

ebr commented Oct 9, 2024

I added the suggested runtime change in the latest commit

@brandonrising
Collaborator

I guess if for some reason a person with a new card wanted to use xformers, they'd have to install it manually 🤔 That said, I am a fan of not installing something that won't be used the vast majority of the time.

@ebr
Member Author

ebr commented Oct 9, 2024

I guess if for some reason a person with a new card wanted to use xformers, they'd have to download it manually

There's no good reason for someone with a modern GPU to use xformers though, is there?

As much as I'd like to accommodate the individual who swaps their newer (30xx+) GPU for an older (20xx-) GPU and wants to continue seamlessly using Invoke with the same performance profile... I'm not convinced that's an edge case we really need to account for ;)

@hipsterusername hipsterusername merged commit eda9793 into main Oct 10, 2024
14 checks passed
@hipsterusername hipsterusername deleted the ebr/installer-xformers branch October 10, 2024 02:46