Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Model loading speed optimization #5635

Merged

Conversation

RyanJDick
Copy link
Contributor

What does this PR do?

This PR moves an unchanging operation out of a loop for a speed benefit during model loading. It does not change the functional behaviour in any way.

Explanation

The following code was used to profile model loading:

import cProfile
from diffusers import UNet2DConditionModel

def main():
    with cProfile.Profile() as pr:
        unet = UNet2DConditionModel.from_pretrained("runwayml/stable-diffusion-v1-5", subfolder="unet")
        pr.dump_stats("unet_load.prof")

if __name__ == "__main__":
    main()

Before:
The resultant .prof file can be visualized with snakeviz. This revealed that a significant amount of time was being spent on redundant calls to inspect.signature(....):

before_optimization

After:
After the improvement in this PR, the time spent on calls to inspect.signature(....) was reduced from 0.0340s to 0.0001s:

after_optimization

Aside

You may be wondering: "Why does this guy care about such a small slice of the flamegraph?".
A: I've already optimized many of the slower model loading steps (via torch monkey-patches), to the point where this is non-negligible. (And, it seemed like an easy fix.)

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline?
  • Did you read our philosophy doc (important for complex PRs)?
  • Was this discussed/approved via a Github issue or the forum? Please add a link to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests? No new tests necessary - there is no change to the expected behaviour.

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@yiyixuxu
Copy link
Collaborator

yiyixuxu commented Nov 3, 2023

looking great!
thank you for keep squeezing out more performance for us:)

cc @patrickvonplaten @sayakpaul

Copy link
Member

@sayakpaul sayakpaul left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ouf! Thank YOU!

@patrickvonplaten
Copy link
Contributor

Very nice!

@patrickvonplaten patrickvonplaten merged commit 7ad70ce into huggingface:main Nov 3, 2023
11 checks passed
kashif pushed a commit to kashif/diffusers that referenced this pull request Nov 11, 2023
Move unchanging operation out of loop for speed benefit.
yoonseokjin pushed a commit to yoonseokjin/diffusers that referenced this pull request Dec 25, 2023
Move unchanging operation out of loop for speed benefit.
AmericanPresidentJimmyCarter pushed a commit to AmericanPresidentJimmyCarter/diffusers that referenced this pull request Apr 26, 2024
Move unchanging operation out of loop for speed benefit.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants