System Info

transformers version: 4.41.2

Who can help?

@ArthurZucker, @Rocketknight1

Tasks

- An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)

Reproduction
I am trying to fine-tune TFDebertaModel and TFDebertaV2Model for an NER task with mixed precision enabled:
```python
from tensorflow import keras
from transformers import TFDebertaModel, TFDebertaV2Model

policy = keras.mixed_precision.Policy('mixed_float16')
keras.mixed_precision.set_global_policy(policy)

model = TFDebertaModel.from_pretrained('microsoft/deberta-base')
# or
# model = TFDebertaV2Model.from_pretrained('microsoft/deberta-v3-base')

# ... build the NER head, compile, and prepare train_data / valid_data ...

model.fit(x=train_data, validation_data=valid_data, epochs=10)
```
However, when training this model, a TypeError was thrown in TFDebertaEmbeddings, like the following:

```
TypeError: Exception encountered when calling layer 'embeddings' (type TFDebertaEmbeddings).

in user code:

    File "/home/swlee/miniconda3/envs/tf216/lib/python3.12/site-packages/transformers/models/deberta/modeling_tf_deberta.py", line 929, in call *
        final_embeddings = final_embeddings * mask

    TypeError: Input 'y' of 'Mul' Op has type float32 that does not match type float16 of argument 'x'.
```
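For illustration, here is a minimal sketch of the kind of cast that would make the failing multiplication type-consistent. The variable names follow the traceback above; this is a proposed change simulated standalone, not the current repository code:

```python
import tensorflow as tf

# Simulate the failing situation: float16 activations, float32 mask.
final_embeddings = tf.random.uniform((1, 8, 16), dtype=tf.float16)
mask = tf.ones((1, 8, 1), dtype=tf.float32)

# Casting the mask to the embeddings' dtype (the layer's compute dtype
# under mixed precision) avoids the Mul dtype mismatch.
mask = tf.cast(mask, dtype=final_embeddings.dtype)
final_embeddings = final_embeddings * mask
```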
The same error occurs with TFDebertaV2Model.

With mixed precision, TensorFlow and Keras require layers to use Layer.dtype for the model's or layer's weights and Layer.compute_dtype for internal tensor computation. However, the current TFDebertaModel and TFDebertaV2Model code does not reflect this requirement and simply assumes the dtype is tf.float32.
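For reference, a minimal sketch of how the global policy splits these two dtypes (this is standard Keras mixed-precision behavior, not DeBERTa-specific):

```python
from tensorflow import keras

keras.mixed_precision.set_global_policy('mixed_float16')

layer = keras.layers.Dense(4)
print(layer.dtype)          # 'float32' -- dtype of the layer's weights
print(layer.compute_dtype)  # 'float16' -- dtype for internal computation
```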
Expected behavior
I hope this bug can be fixed soon so that mixed precision is supported.

I searched modeling_tf_deberta.py and modeling_tf_deberta_v2.py for error-prone code snippets and corrected them. Here is the list (though I am not sure it is exhaustive):

[in modeling_tf_deberta.py]

- lines 105, 106
- lines 133, 135, 139
- lines 705, 707
- line 799
- line 927

[in modeling_tf_deberta_v2.py]

- lines 106, 107
- lines 135, 137, 141
- lines 391, 404
- line 770
- lines 853, 867
- line 1034
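As a quick check once such casts are applied, a minimal smoke test might look like this (a sketch: it assumes the patched code and uses the public microsoft/deberta-base checkpoint):

```python
import tensorflow as tf
from tensorflow import keras
from transformers import TFDebertaModel

keras.mixed_precision.set_global_policy('mixed_float16')

model = TFDebertaModel.from_pretrained('microsoft/deberta-base')
input_ids = tf.constant([[1, 2, 3, 4, 5, 6, 7, 8]], dtype=tf.int32)
outputs = model(input_ids=input_ids, attention_mask=tf.ones_like(input_ids))

# With the casts in place this runs without the Mul dtype error;
# activations come out in the policy's compute dtype.
print(outputs.last_hidden_state.dtype)
```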
As you're trying something with custom code, specifically training with mixed precision, this is a question best placed in our forums; we try to reserve the GitHub issues for feature requests and bug reports.

(But @pinesnow72, your intuition is correct: creating the -inf in the mask from a fixed float dtype is problematic here. Feel free to open a PR with your proposed changes; I am certain @Rocketknight1 will be able to review!)
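To make that point concrete, here is a hedged sketch of a dtype-aware mask fill; the tensor names are illustrative, not quotes from the repository:

```python
import tensorflow as tf

inputs = tf.random.uniform((2, 4), dtype=tf.float16)  # mixed-precision logits
rmask = tf.constant([[True, False, False, True],
                     [False, False, True, True]])     # positions to mask out

# Build the -inf fill in the tensor's own dtype instead of a hard-coded
# float32 constant, so tf.where does not mix float16 and float32 operands.
fill = tf.cast(float('-inf'), dtype=inputs.dtype)
probs = tf.nn.softmax(tf.where(rmask, fill, inputs), axis=-1)
```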