T5 AdapterDrop Prefix-Tuning Bug #669

Closed
FahadEbrahim opened this issue Apr 7, 2024 · 2 comments · Fixed by #673

FahadEbrahim commented Apr 7, 2024

Hi,

I'm trying to train with AdapterDrop on T5. It works fine for bottleneck adapters, LoRA, and IA3, but it does not work with Prefix Tuning (and therefore not with MAM or UniPELT, which include it).

Here is a notebook that reproduces the issue:
https://gist.github.com/FahadEbrahim/66686814f02978da9d4376470356647d

The error is:
RuntimeError: The size of tensor a (90) must match the size of tensor b (80) at non-singleton dimension 3.
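
For context, a stripped-down sketch of the kind of setup involved (the actual code is in the notebook above; the model name, adapter name, and the `skip_layers`-based AdapterDrop pattern borrowed from the AdapterDrop training example are only illustrative):

```python
import numpy as np
from transformers import TrainerCallback
from adapters import AutoAdapterModel, PrefixTuningConfig

# Illustrative setup only; head and Trainer configuration omitted.
model = AutoAdapterModel.from_pretrained("t5-small")
model.add_adapter("prefix", config=PrefixTuningConfig())
model.train_adapter("prefix")

# AdapterDrop: randomly skip the adapter in the first n layers on each training step.
class AdapterDropCallback(TrainerCallback):
    def on_step_begin(self, args, state, control, **kwargs):
        skip_layers = list(range(np.random.randint(0, 6)))
        kwargs["model"].set_active_adapters("prefix", skip_layers=skip_layers)

    def on_evaluate(self, args, state, control, **kwargs):
        # Disable layer skipping during evaluation so results are comparable across epochs.
        kwargs["model"].set_active_adapters("prefix", skip_layers=None)
```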

@FahadEbrahim FahadEbrahim added the bug Something isn't working label Apr 7, 2024
@FahadEbrahim FahadEbrahim changed the title T5 AdapterDrop Bugs T5 AdapterDrop Prefix-Tuning Bug Apr 8, 2024
@TimoImhof TimoImhof self-assigned this Apr 10, 2024
TimoImhof (Contributor) commented:

Hi @FahadEbrahim,

The bug happens only for Prefix Tuning and not for the other adapter methods because Prefix Tuning changes the input dimensions when it prepends the prefixes. When adapter layers are then dropped during training with AdapterDrop, the individual transformer layers of T5 can have different input dimensions. This leads to the runtime error, because T5 always forwards the positional encoding (position bias) to the next layer under the assumption that the dimensions never change.
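
To make the mismatch concrete, here is a minimal, self-contained illustration of the shape clash (the sizes are illustrative and this is not the library's actual attention code):

```python
import torch

batch, heads, query_len, prefix_len = 4, 8, 80, 10

# A layer where the prefix-tuning adapter was dropped computes its position bias
# over the original 80 key positions and forwards it to the next layer.
position_bias = torch.zeros(batch, heads, query_len, query_len)

# The next layer still applies prefix tuning, so its keys are extended by 10
# prefix vectors and its attention scores cover 90 key positions.
scores = torch.zeros(batch, heads, query_len, query_len + prefix_len)

try:
    scores = scores + position_bias
except RuntimeError as e:
    # The size of tensor a (90) must match the size of tensor b (80)
    # at non-singleton dimension 3
    print(e)
```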

Thanks for bringing this up! With the new PR, this problem should be solved; your script is running fine with the fix on my machine.

FahadEbrahim (Contributor, Author) commented:

@TimoImhof Thank you for your quick response and feedback. I tested the new PR branch and it's working perfectly.

With my appreciation,
Fahad.

calpt pushed a commit that referenced this issue Apr 12, 2024
Fixes #669 

Changes in this PR:
- Avoid throwing `RuntimeError` due to the dimension mismatch occurring when passing the positional encoding from layers dropped by AdapterDrop to layers modified by prefix tuning.
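
Conceptually, the guard amounts to not reusing a position bias forwarded from a previous layer once its key dimension no longer matches the current layer; a rough sketch of that idea (hypothetical helper, not the actual code of this PR):

```python
import torch

def maybe_reset_position_bias(position_bias, key_length):
    """Hypothetical helper: drop a position bias forwarded from a previous layer
    if its key dimension no longer matches, forcing the layer to recompute it."""
    if position_bias is not None and position_bias.size(-1) != key_length:
        return None
    return position_bias
```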