Fix Training Error with AdapterDrop and Prefix Tuning #673

Merged
merged 3 commits into from
Apr 12, 2024

TimoImhof
Contributor

Fixes #669

Changes in this PR:

  • Avoid the RuntimeError caused by a dimension mismatch when the positional encoding (position_bias) from layers dropped by AdapterDrop is passed to layers modified by prefix tuning.
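
To make the failure mode concrete, here is a minimal sketch in plain Python of why the shapes stop lining up. It assumes the attention layout `(batch, heads, query_len, key_len)` and that prefix tuning prepends `prefix_len` entries to the keys. The helper `broadcastable` and all sizes are hypothetical, chosen only for illustration; this is not code from the library.

```python
def broadcastable(a: tuple, b: tuple) -> bool:
    """Return True if shapes a and b can be added under NumPy/PyTorch-style broadcasting."""
    # Compare trailing dimensions pairwise; each pair must match or contain a 1.
    return all(x == y or x == 1 or y == 1 for x, y in zip(reversed(a), reversed(b)))


# Hypothetical sizes, for illustration only.
batch, heads, seq_len, prefix_len = 2, 8, 16, 30

# Assumed case: bias computed in a layer that AdapterDrop skipped, so no prefix on the keys.
bias_without_prefix = (1, heads, seq_len, seq_len)
# Scores in a later layer with prefix tuning active: the key axis includes the prefix.
scores_with_prefix = (batch, heads, seq_len, seq_len + prefix_len)

print(broadcastable(bias_without_prefix, scores_with_prefix))  # False
# A bias recomputed with the prefix included lines up again.
print(broadcastable((1, heads, seq_len, seq_len + prefix_len), scores_with_prefix))  # True
```

Under these assumptions, adding the stale bias to the scores fails on the key axis (16 vs. 46), which is the kind of mismatch this PR guards against.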

@TimoImhof TimoImhof self-assigned this Apr 11, 2024
@TimoImhof TimoImhof requested a review from calpt April 11, 2024 13:07
# For Prefix Tuning, when training with AdapterDrop, we must additionally check if the dimensions of the
# position_bias given from previous layers match the dimensions of the scores of the current layer to make
# sure that the position_bias is adequately recomputed if previous layers have been skipped.
if position_bias is None or position_bias.shape != scores.shape:
Member

can we specifically check the seqlen dimensions here to make this condition more precise?

- Specify the sequence length dimension
- Improve the explanatory comment
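
One way the narrowed condition might look, as a sketch rather than the merged code: compare only the sequence-length axes instead of the full shapes. This assumes the layout `(batch, heads, query_len, key_len)`, so that the last two axes carry the sequence lengths. The function name `needs_bias_recompute` is hypothetical, and shapes are passed as plain tuples to keep the example dependency-free.

```python
from typing import Optional, Tuple

Shape = Tuple[int, ...]


def needs_bias_recompute(bias_shape: Optional[Shape], scores_shape: Shape) -> bool:
    """Decide whether position_bias should be (re)computed for the current layer."""
    if bias_shape is None:
        # No bias has been computed yet.
        return True
    # Only the sequence-length axes are compared (assumed to be the last two);
    # a differing batch axis is ignored because it may simply be broadcast.
    return bias_shape[-2:] != scores_shape[-2:]


scores = (2, 8, 16, 46)  # hypothetical: 16 query tokens, 16 + 30 prefixed keys
print(needs_bias_recompute(None, scores))            # True
print(needs_bias_recompute((1, 8, 16, 16), scores))  # True
print(needs_bias_recompute((1, 8, 16, 46), scores))  # False
```

In the last call a full-shape comparison would report a mismatch purely because of the batch axis (1 vs. 2), even though the sequence lengths agree. Restricting the check to the sequence-length axes avoids that unnecessary recomputation, at least under the layout assumed here.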
@TimoImhof TimoImhof requested a review from calpt April 12, 2024 10:54
Member

@calpt calpt left a comment

thanks!

@calpt calpt merged commit 95cf6bd into adapter-hub:main Apr 12, 2024
3 checks passed
Development

Successfully merging this pull request may close these issues.

T5 AdapterDrop Prefix-Tuning Bug
2 participants