
bf16 does not work with prefix tuning #559

Closed · ozanciga opened this issue Jun 13, 2023 · 0 comments · Fixed by #659
Assignees: calpt
Labels: bug (Something isn't working)

Comments

@ozanciga

Using accelerate[huggingface] and the latest version of adapter-transformers. I only worked with BertAdapterModel, so I cannot speak to other models' behavior. I think the easiest way to test this without setting up training is to cast the model to bf16 via model.bfloat16() and do a single forward pass.
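For illustration, a minimal sketch of that test (untested; the checkpoint, adapter name, and import paths are assumptions based on the usual adapter-transformers API, not taken verbatim from this issue):

```python
# Sketch: cast a prefix-tuned BertAdapterModel to bf16 and run one forward pass.
# Import paths assume adapter-transformers, which is imported as `transformers`.
import torch
from transformers import AutoTokenizer, BertAdapterModel
from transformers.adapters import PrefixTuningConfig

model = BertAdapterModel.from_pretrained("bert-base-uncased")
model.add_adapter("prefix", config=PrefixTuningConfig())
model.set_active_adapters("prefix")

# Cast all parameters and buffers to bfloat16, as described above.
model.bfloat16()

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
inputs = tokenizer("hello world", return_tensors="pt")

# A single forward pass is enough to surface the dtype mismatch coming from
# the prefix-extended attention mask.
with torch.no_grad():
    outputs = model(**inputs)
```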

When everything is cast to bf16, prefix tuning does not cast the modified attention_mask variable to the matching dtype.

Possible fix (not tested): modify the line below to also include attention_mask's dtype, i.e. .to(device=attention_mask.device, dtype=attention_mask.dtype):

https://github.com/adapter-hub/adapter-transformers/blob/accf70fa8b5fb6219c3ac0f8e2f826412425fce6/src/transformers/adapters/prefix_tuning.py#L351
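For illustration, a sketch of that change (the surrounding lines are paraphrased, not copied from prefix_tuning.py; only the .to(...) call reflects the actual suggestion):

```python
# The prefix mask is presumably created as float32 ones and then concatenated
# with the incoming attention_mask, so it needs to match the attention mask's
# dtype as well as its device.
prefix_mask = torch.ones(batch_size, prefix_length)

# before: prefix_mask = prefix_mask.to(attention_mask.device)
# after: also cast to the attention mask's dtype so bf16 stays bf16
prefix_mask = prefix_mask.to(device=attention_mask.device, dtype=attention_mask.dtype)

attention_mask = torch.cat([prefix_mask, attention_mask], dim=-1)
```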

@ozanciga ozanciga added the bug Something isn't working label Jun 13, 2023
@calpt calpt self-assigned this Mar 16, 2024
TimoImhof pushed a commit that referenced this issue Apr 8, 2024
Fixes #559.

The Prefix Tuning implementation has issues with incompatible dtypes
when switching away from the default float32. This PR fixes the issue
caused by the prefix mask and adds test cases.