Fixes #559.
The Prefix Tuning implementation has issues with incompatible dtypes
when switching away from the default float32. This PR fixes the issue
caused by the prefix mask and adds test cases.
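For reference, a minimal sketch of the kind of check such a test might perform, assuming the adapter-transformers API (BertAdapterModel, PrefixTuningConfig, add_adapter, set_active_adapters); the adapter name and checkpoint below are placeholders, not taken from the PR:

```python
import torch
from transformers import AutoTokenizer, BertAdapterModel
from transformers.adapters import PrefixTuningConfig

# Reproduce the report: activate a prefix-tuning adapter, cast the model to
# bfloat16, and run a single forward pass.
model = BertAdapterModel.from_pretrained("bert-base-uncased")
model.add_adapter("prefix", config=PrefixTuningConfig())
model.set_active_adapters("prefix")
model = model.bfloat16()  # cast all parameters and buffers to bf16

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
inputs = tokenizer("a single forward pass is enough", return_tensors="pt")

with torch.no_grad():
    # Before the fix this fails with a dtype mismatch coming from the prefix
    # mask; after the fix it should run entirely in bf16.
    outputs = model(**inputs)
```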
I am using accelerate[huggingface] and the latest version of adapter-transformers. I only worked with the BertAdapterModel, so I cannot speak to other models' behavior. I think the easiest way to test this without having to set up training is to cast the model to bf16 via model.bfloat16() and do a single forward pass.

When everything is cast to bf16, prefix tuning does not cast the modified attention_mask variable to the proper dtype.

Possible fix (not tested): modify the line below to include attention_mask's dtype via .to(device=attention_mask.device, dtype=attention_mask.dtype)
https://github.com/adapter-hub/adapter-transformers/blob/accf70fa8b5fb6219c3ac0f8e2f826412425fce6/src/transformers/adapters/prefix_tuning.py#L351
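For illustration, a sketch of the suggested change, assuming the linked line builds an all-ones mask for the prefix positions and currently moves it only to attention_mask's device; the function and variable names here (extend_attention_mask, prefix_mask, prefix_length) are placeholders, not the actual code in prefix_tuning.py:

```python
import torch

def extend_attention_mask(attention_mask: torch.Tensor, prefix_length: int) -> torch.Tensor:
    # Prefix tokens are always attended to, so their mask entries are ones.
    prefix_mask = torch.ones(attention_mask.shape[0], prefix_length)
    # Suggested fix: match both the device AND the dtype of the existing mask,
    # rather than .to(attention_mask.device) alone, so no float32 tensor leaks
    # in when the rest of the model runs in bf16.
    prefix_mask = prefix_mask.to(device=attention_mask.device, dtype=attention_mask.dtype)
    return torch.cat([prefix_mask, attention_mask], dim=-1)

# Quick check: the extended mask keeps the original dtype.
mask = torch.ones(2, 5, dtype=torch.bfloat16)
assert extend_attention_mask(mask, prefix_length=8).dtype == torch.bfloat16
```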