TF_Deberta supporting mixed precision #32618

pinesnow72 · 2024-08-12T09:14:58Z

What does this PR do?

This PR fixes some codes of TF DeBERTa (version 1 and 2) that do not support Mixed Precision.

With TF and Keras, mixed precision is configured with the following settings:

policy = keras.mixed_precision.Policy('mixed_float16')
keras.mixed_precision.set_global_policy(policy)

However, the existing TFDebertaModel and TFDebertaV2Model throws TypeError with this mixed precision setting.
With mixed precision, TF and Keras requires to use Layer.dtype for model or layer's weights and Layer.compute_dtype for internal tensor computation. But the current TFDebertaModel and TFDebertaV2Model codes do not seem to reflect this requirement and definitely assume the dtype would be tf.float32.

I searched and corrected some error-prone code snippets in modeling_tf_deberta.py and modeling_tf_deberta_v2.py.
This correction worked correctly with mixed precision in my local machine.

This PR fixes #31989

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
[V] Did you read the contributor guideline,
Pull Request section?
[V] Was this discussed/approved via a Github issue or the forum? Please add a link
to it if that's the case.
Did you make sure to update the documentation with your changes? Here are the
documentation guidelines, and
here are tips on formatting docstrings.
Did you write any new necessary tests?

Who can review?

@Rocketknight1, @ArthurZucker

Corrected some codes which do not support mixed precision

Rocketknight1 · 2024-08-12T12:49:54Z

Hi @pinesnow72, this looks good! Feel free to ping me whenever it's ready for review. The issues with code quality can be fixed by doing pip install transformers[quality] followed by make style in the repo directory.

pinesnow72 · 2024-08-13T01:41:36Z

Hi @pinesnow72, this looks good! Feel free to ping me whenever it's ready for review. The issues with code quality can be fixed by doing pip install transformers[quality] followed by make style in the repo directory.

Hi @Rocketknight1, As your comment, the style was corrected and uploaded. Thanks.

Rocketknight1 · 2024-08-13T12:01:32Z

LGTM now! cc @amyeroberts for core maintainer review, and thanks for the PR!

amyeroberts

Thanks for opening a PR to handle this!

Just a small question

amyeroberts · 2024-08-13T12:17:08Z

src/transformers/models/deberta/modeling_tf_deberta.py

@@ -701,9 +701,9 @@ def linear(w, b, x):
            ws = tf.split(
                tf.transpose(self.in_proj.weight[0]), num_or_size_splits=self.num_attention_heads * 3, axis=0
            )
-            qkvw = tf.TensorArray(dtype=tf.float32, size=3)
+            qkvw = tf.TensorArray(dtype=self.dtype, size=3)


What's the difference between self.dtype and self.compute_dtype ?

self.dtype is the dtype the layer's weights are stored in, self.compute_dtype is the dtype used for computation. By default these are the same, but in mixed-precision it's common for self.dtype to be float32 while self.compute_dtype is (b)float16

HuggingFaceDocBuilderDev · 2024-08-13T12:22:56Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

amyeroberts

Thanks for making mixed precision possible for this model!

pinesnow72 · 2024-08-14T00:08:03Z

Thanks for accepting this PR

pinesnow72 added 4 commits August 12, 2024 17:44

Update modeling_tf_deberta.py

74dd24f

Corrected some codes which do not support mixed precision

Update modeling_tf_deberta_v2.py

c51bdc0

Corrected some codes which do not support mixed precision

Update modeling_tf_deberta_v2.py

6f7da44

Update modeling_tf_deberta.py

5acd05e

pinesnow72 added 2 commits August 13, 2024 10:36

Add files via upload

29061dc

Add files via upload

5c01171

Rocketknight1 approved these changes Aug 13, 2024

View reviewed changes

amyeroberts reviewed Aug 13, 2024

View reviewed changes

amyeroberts approved these changes Aug 13, 2024

View reviewed changes

amyeroberts merged commit 9d2ab88 into huggingface:main Aug 13, 2024
21 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

TF_Deberta supporting mixed precision #32618

TF_Deberta supporting mixed precision #32618

pinesnow72 commented Aug 12, 2024

Rocketknight1 commented Aug 12, 2024

pinesnow72 commented Aug 13, 2024

Rocketknight1 commented Aug 13, 2024

amyeroberts left a comment

amyeroberts Aug 13, 2024

Rocketknight1 Aug 13, 2024

HuggingFaceDocBuilderDev commented Aug 13, 2024

amyeroberts left a comment

pinesnow72 commented Aug 14, 2024

TF_Deberta supporting mixed precision #32618

TF_Deberta supporting mixed precision #32618

Conversation

pinesnow72 commented Aug 12, 2024

What does this PR do?

Before submitting

Who can review?

Rocketknight1 commented Aug 12, 2024

pinesnow72 commented Aug 13, 2024

Rocketknight1 commented Aug 13, 2024

amyeroberts left a comment

Choose a reason for hiding this comment

amyeroberts Aug 13, 2024

Choose a reason for hiding this comment

Rocketknight1 Aug 13, 2024

Choose a reason for hiding this comment

HuggingFaceDocBuilderDev commented Aug 13, 2024

amyeroberts left a comment

Choose a reason for hiding this comment

pinesnow72 commented Aug 14, 2024