Request: ToMe token merging optimization, or just less tokens #18

Open

torridgristle opened this issue Jul 21, 2023 · 0 comments

@torridgristle
ToMe SD (https://github.com/dbolya/tomesd) has support for Diffusers, so using it should be as simple as:

import tomesd

# ratio=0.5 merges roughly 50% of the tokens in the model's attention layers
tomesd.apply_patch(model, ratio=0.5)
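
For instance, following the Diffusers example in the tomesd README (the model name and ratio here are just illustrative):

import torch
import tomesd
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Patch the pipeline so ~50% of the UNet's tokens are merged before attention
tomesd.apply_patch(pipe, ratio=0.5)

image = pipe("a photo of an astronaut riding a horse").images[0]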

Alternatively, since prompts are typically very short, the text encoder's output could be cropped down to the first 16 or so tokens, or some other user-adjustable number, e.g. cond[:, :8] and uncond[:, :8].
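
A minimal sketch of that cropping, assuming cond and uncond are the usual [batch, 77, dim] text-encoder outputs and max_tokens is a hypothetical user setting:

max_tokens = 8  # hypothetical user-adjustable setting

# Keep only the first max_tokens embeddings from the text encoder output;
# shapes go from [batch, 77, dim] to [batch, max_tokens, dim]
cond = cond[:, :max_tokens]
uncond = uncond[:, :max_tokens]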

All the <|endoftext|> padding at the end doesn't matter much and isn't going to match anything strongly in the cross-attention layers. Having some <|endoftext|> padding is still useful, since it carries some meaning from the preceding words of the prompt, but padding out to 75 or however many tokens isn't beneficial.

If the cropping approach is used, I believe it should verify that the cropped length is long enough to contain every prompt in the batch, including its <|startoftext|> and at least one <|endoftext|>.
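
A sketch of that check, assuming the batch's token ids are available and padded with <|endoftext|> the way CLIP's tokenizer does it (the helper name and the hard-coded token id are assumptions, not anything from this repo):

import torch

ENDOFTEXT_ID = 49407  # CLIP's <|endoftext|> id; an assumption about the tokenizer in use

def safe_crop_len(token_ids: torch.Tensor, requested: int) -> int:
    # token_ids: [batch, 77] ids from the tokenizer, padded with <|endoftext|>.
    # Count <|startoftext|> plus prompt tokens per row, then add 1 so at
    # least one <|endoftext|> survives the crop.
    prompt_lens = (token_ids != ENDOFTEXT_ID).sum(dim=1) + 1
    return max(requested, int(prompt_lens.max().item()))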

@torridgristle torridgristle changed the title ToMe token merging optimization, or just less tokens Request: ToMe token merging optimization, or just less tokens Jul 24, 2023