Gemma2: eager attention by default #32865

gante · 2024-08-17T16:38:02Z

What does this PR do?

See title :)

We know that SDPA yields inferior modeling results with Gemma 2, so eager attention should be the default. This has been the source of some model-quality GitHub issues, e.g. #32848.

Slow tests for Gemma 2 were run, with no regressions.
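For readers wondering why the two attention paths can diverge: the description above does not give the reason, but a likely one (an assumption here, in line with the follow-up PR #32877 on logit scaling) is that Gemma 2 soft-caps its attention logits with a tanh before the softmax, which a fused SDPA kernel has no hook for. The sketch below is a minimal pure-Python illustration of that effect, not the transformers implementation; the function names and the cap value of 50.0 are illustrative.

```python
import math


def softcap(score, cap):
    """Squash a raw attention logit into [-cap, cap] with tanh."""
    return cap * math.tanh(score / cap)


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(query, keys, cap=None):
    """Attention weights for one query vector over a list of key vectors.

    cap=None skips soft-capping, which is what a kernel without that
    step would compute (assumption: this models the SDPA path).
    """
    scale = 1.0 / math.sqrt(len(query))
    scores = [scale * sum(q * k for q, k in zip(query, key)) for key in keys]
    if cap is not None:
        scores = [softcap(s, cap) for s in scores]
    return softmax(scores)


# Large logits make the difference visible.
query = [8.0, 8.0, 8.0, 8.0]
keys = [[8.0, 8.0, 8.0, 8.0], [4.0, 4.0, 4.0, 4.0], [0.0, 0.0, 0.0, 0.0]]

capped = attention_weights(query, keys, cap=50.0)  # eager-style, with soft-capping
uncapped = attention_weights(query, keys)          # no soft-capping

print("capped:  ", capped)
print("uncapped:", uncapped)
```

With soft-capping, the second key keeps a small but non-negligible weight; without it, nearly all the mass collapses onto the first key. A model trained with the capped distribution sees different attention at inference time if the cap is silently skipped.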

HuggingFaceDocBuilderDev · 2024-08-17T16:56:38Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

amyeroberts · 2024-08-19T08:52:23Z

tests/models/gemma2/test_modeling_gemma2.py

+        config._attn_implementation = "sdpa"
+        model = Gemma2Model(config)


Here we should check that the attention implementation can be set in the canonical way, rather than by overriding the private attribute.

Suggested change

-        config._attn_implementation = "sdpa"
-        model = Gemma2Model(config)
+        model = Gemma2Model(config, attn_implementation="sdpa")

gante · Aug 21, 2024

@amyeroberts hehe I had the same idea, but this actually doesn't work :D

The API you're suggesting is for .from_pretrained(), not for __init__()

ArthurZucker · Aug 22, 2024

we can pass _attn_implementation I think no?

amyeroberts · Aug 22, 2024

> The API you're suggesting is for .from_pretrained(), not for __init__()

Oh, true!

> we can pass _attn_implementation I think no?

I don't think we can for the model init, as it just accepts config as an input arg.
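To make the API shape discussed in this thread concrete, here is a toy sketch. `ToyConfig` and `ToyModel` are hypothetical stand-ins, not transformers classes; they only mirror the pattern described above, where the constructor accepts a config and nothing else, while the loader exposes an `attn_implementation` keyword.

```python
class ToyConfig:
    """Hypothetical stand-in for a model config (not a transformers class)."""

    def __init__(self):
        # Default mirrors the intent of this PR: eager unless told otherwise.
        self._attn_implementation = "eager"


class ToyModel:
    """Hypothetical stand-in for a model whose constructor only takes a config."""

    def __init__(self, config):
        self.attn_implementation = config._attn_implementation

    @classmethod
    def from_pretrained(cls, name, attn_implementation=None):
        # The loader builds the config itself, so it can offer a public keyword.
        config = ToyConfig()
        if attn_implementation is not None:
            config._attn_implementation = attn_implementation
        return cls(config)


# Passing the keyword to the constructor fails: __init__ has no such parameter.
try:
    ToyModel(ToyConfig(), attn_implementation="sdpa")
    init_accepts_kwarg = True
except TypeError:
    init_accepts_kwarg = False

# What the test in this PR does instead: set the attribute on the config first.
config = ToyConfig()
config._attn_implementation = "sdpa"
from_config = ToyModel(config)

# The keyword route is only available through the loader.
loaded = ToyModel.from_pretrained("toy-checkpoint", attn_implementation="sdpa")

print(init_accepts_kwarg, from_config.attn_implementation, loaded.attn_implementation)
```

Under these assumptions, a test that builds the model directly from a config has no public keyword to use, which is why it sets the attribute on the config before construction.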

ArthurZucker · 2024-08-22T13:32:46Z

tests/models/gemma2/test_modeling_gemma2.py

+        config._attn_implementation = "sdpa"
+        model = Gemma2Model(config)


we can pass _attn_implementation I think no?

gante added 2 commits August 17, 2024 16:14

eager by default

1acec6f

nits

887d8c4

gante requested a review from ArthurZucker August 17, 2024 16:41

amyeroberts reviewed Aug 19, 2024

ArthurZucker mentioned this pull request Aug 19, 2024

Add logit scaling sdpa using FlexAttention for Gemma2 #32877

Open

3 tasks

ArthurZucker approved these changes Aug 22, 2024

gante merged commit 975b988 into huggingface:main Aug 22, 2024
21 checks passed

gante deleted the gemma_2_eager_default branch August 22, 2024 14:59

zucchini-nlp pushed a commit to zucchini-nlp/transformers that referenced this pull request Aug 30, 2024

Gemma2: eager attention by default (huggingface#32865)

8c0f4dc

zucchini-nlp pushed a commit to zucchini-nlp/transformers that referenced this pull request Aug 30, 2024

Gemma2: eager attention by default (huggingface#32865)

79ce5c0

ArthurZucker mentioned this pull request Sep 6, 2024

Bug with finetuning Gemma 2 models #33333

Closed

4 tasks


amyeroberts Aug 19, 2024

gante Aug 21, 2024

ArthurZucker Aug 22, 2024

amyeroberts Aug 22, 2024

ArthurZucker Aug 22, 2024
