Fixes #559.
The Prefix Tuning implementation has issues with incompatible dtypes
when switching away from the default float32. This PR fixes the issue
caused by the prefix mask and adds test cases.
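For reference, a minimal sketch of the kind of check such a test might perform, assuming the adapter-transformers API (BertAdapterModel, PrefixTuningConfig, add_adapter, set_active_adapters); the adapter name and checkpoint below are placeholders, not taken from the PR:

```python
import torch
from transformers import AutoTokenizer, BertAdapterModel
from transformers.adapters import PrefixTuningConfig

# Reproduce the report: activate a prefix-tuning adapter, cast the model to
# bfloat16, and run a single forward pass.
model = BertAdapterModel.from_pretrained("bert-base-uncased")
model.add_adapter("prefix", config=PrefixTuningConfig())
model.set_active_adapters("prefix")
model = model.bfloat16()  # cast all parameters and buffers to bf16

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
inputs = tokenizer("a single forward pass is enough", return_tensors="pt")

with torch.no_grad():
    # Before the fix this fails with a dtype mismatch coming from the prefix
    # mask; after the fix it should run entirely in bf16.
    outputs = model(**inputs)
```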
I am using accelerate[huggingface] and the latest version of adapter-transformers. I only worked with the BertAdapterModel, so I cannot speak to other models' behavior. I think the easiest way to test this without having to set up training is to cast the model to bf16 via model.bfloat16() and do a single forward pass.

When everything is cast to bf16, prefix tuning does not cast the modified attention_mask variable to the proper dtype.

Possible fix (not tested): modify the line below to include attention_mask's dtype via .to(device=attention_mask.device, dtype=attention_mask.dtype)
https://github.com/adapter-hub/adapter-transformers/blob/accf70fa8b5fb6219c3ac0f8e2f826412425fce6/src/transformers/adapters/prefix_tuning.py#L351
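For illustration, a sketch of the suggested change, assuming the linked line builds an all-ones mask for the prefix positions and currently moves it only to attention_mask's device; the function and variable names here (extend_attention_mask, prefix_mask, prefix_length) are placeholders, not the actual code in prefix_tuning.py:

```python
import torch

def extend_attention_mask(attention_mask: torch.Tensor, prefix_length: int) -> torch.Tensor:
    # Prefix tokens are always attended to, so their mask entries are ones.
    prefix_mask = torch.ones(attention_mask.shape[0], prefix_length)
    # Suggested fix: match both the device AND the dtype of the existing mask,
    # rather than .to(attention_mask.device) alone, so no float32 tensor leaks
    # in when the rest of the model runs in bf16.
    prefix_mask = prefix_mask.to(device=attention_mask.device, dtype=attention_mask.dtype)
    return torch.cat([prefix_mask, attention_mask], dim=-1)

# Quick check: the extended mask keeps the original dtype.
mask = torch.ones(2, 5, dtype=torch.bfloat16)
assert extend_attention_mask(mask, prefix_length=8).dtype == torch.bfloat16
```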