
Fix arg in bettertransformer llama attention #1421

Merged · 3 commits merged into huggingface:main on Oct 3, 2023

Conversation

@SunMarc (Member) commented on Sep 28, 2023

What does this PR do?

This PR fixes the integration of Llama in BetterTransformer. Since the FlashAttention-2 PR for Llama models was merged in Transformers, a new padding_mask argument has been silently introduced in the forward of the attention module. This breaks llama_forward in BetterTransformer (see the sketch below).

@younesbelkada We may have to do this for every model that supports FlashAttention-2. LMK if I should just do one PR covering all supported models plus those we plan to support?
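For context, the failure is a plain Python signature mismatch: the Transformers decoder layer now passes an extra padding_mask keyword to the attention forward, so any replacement forward with a fixed signature raises a TypeError. Below is a minimal, self-contained sketch of the **kwargs fix; the function names and bodies are stand-ins, not the actual optimum code.

```python
import torch

# Hypothetical stand-in for the patched BetterTransformer attention forward.
# Before the fix, its signature mirrored the old upstream one exactly:
def llama_forward_old(self, hidden_states, attention_mask=None, position_ids=None):
    return hidden_states  # placeholder for the SDPA-based attention


# After the FlashAttention-2 PR, Transformers also passes `padding_mask`,
# so calling the old forward fails with:
#   TypeError: llama_forward_old() got an unexpected keyword argument 'padding_mask'

# The approach taken in this PR: accept and ignore unknown keyword arguments.
def llama_forward(self, hidden_states, attention_mask=None, position_ids=None, **kwargs):
    # `kwargs` silently absorbs `padding_mask` (and any future extra arguments).
    return hidden_states  # placeholder for the SDPA-based attention


if __name__ == "__main__":
    hs = torch.randn(1, 4, 8)
    # Mimic how the upstream decoder layer now calls the attention module:
    out = llama_forward(None, hs, attention_mask=None, padding_mask=None)
    print(out.shape)  # torch.Size([1, 4, 8])
```

The catch-all **kwargs keeps the patched forward compatible with both old and new Transformers versions without pinning to a specific signature.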

@SunMarc changed the title from "fix arg in llama attention" to "Fix arg in bettertransformer llama attention" on Sep 28, 2023
@younesbelkada (Contributor) left a comment


Thanks! I think you can just pass dummy **kwargs everywhere, what do you think?

@SunMarc (Member, Author) commented on Sep 28, 2023

Yeah, we can do that too, since FA2 is experimental at this stage and will be removed from Transformers in the future. LMK what you think @fxmarty.

@fxmarty (Contributor) commented on Oct 2, 2023

LGTM, thank you for fixing! Yes, feel free to add kwargs everywhere in this PR.

@fxmarty merged commit dbe70f9 into huggingface:main on Oct 3, 2023 (49 of 52 checks passed).
@fxmarty mentioned this pull request on Oct 3, 2023 and on Oct 18, 2023.
fxmarty pushed a commit referencing this pull request on Nov 3, 2023:
* fix arg in llama attention

* change to kwargs

* add kwargs everywhere

---------

Co-authored-by: younesbelkada <[email protected]>
3 participants