I'm trying to train AdapterDrop with T5. It works fine for bottleneck adapters, LoRA, and IA3, but it does not work with Prefix Tuning (and therefore also fails for MAM and UniPELT).
The bug occurs only with Prefix Tuning and not with the other adapter methods because Prefix Tuning changes the input dimensions when it adds the prefixes. When adapter layers are dropped during training with AdapterDrop, the individual transformer layers of T5 can therefore end up with different dimensions. This leads to the runtime error because T5 always forwards the positional encoding (position bias) to the next layer, assuming the dimensions never change.
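For context, this is roughly the kind of setup that triggers the error. It is a minimal sketch, not the reporter's notebook; the adapter name, the layer range, and the callback-based AdapterDrop pattern (randomly skipping layers per step via `skip_layers`) are illustrative assumptions.

```python
import random

import adapters
from adapters import PrefixTuningConfig
from transformers import T5ForConditionalGeneration, TrainerCallback

model = T5ForConditionalGeneration.from_pretrained("t5-small")
adapters.init(model)  # add adapter support to the vanilla transformers model

# Prefix tuning prepends trainable prefix vectors to the keys/values of
# self-attention, which changes the effective sequence length per layer.
model.add_adapter("prefix", config=PrefixTuningConfig())
model.train_adapter("prefix")


class AdapterDropCallback(TrainerCallback):
    """Randomly skip the adapter in the first n layers on every training step."""

    def on_step_begin(self, args, state, control, **kwargs):
        skip_layers = list(range(random.randint(0, 5)))
        kwargs["model"].set_active_adapters("prefix", skip_layers=skip_layers)

    def on_evaluate(self, args, state, control, **kwargs):
        # Never skip layers during evaluation.
        kwargs["model"].set_active_adapters("prefix", skip_layers=None)


# Layers that keep the prefix tuning adapter see a longer key sequence than
# the skipped layers, so the position bias forwarded between T5 layers no
# longer matches, and the RuntimeError below is raised (before the fix).
```

The callback would then be passed to the `Trainer` via `callbacks=[AdapterDropCallback()]`, as in the AdapterDrop training examples.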
Thanks for bringing this up! With the new PR, this problem should be solved; your script is running fine with the fix on my machine.
Fixes #669
Changes in this PR:
- Avoid throwing a `RuntimeError` due to a dimension mismatch occurring when the positional encoding is passed from layers dropped by AdapterDrop to layers modified by prefix tuning.
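For illustration only, a rough sketch of the general idea behind such a fix (not the actual code in this PR): before reusing a position bias forwarded from a previous layer, check that its key-length dimension still matches the current layer's key states, and pad or recompute it if prefix tuning changed the sequence length in between. The helper name `align_position_bias`, the zero-padding strategy, and the injected `compute_bias` callable (mirroring the method of the same name on `T5Attention`) are assumptions made for this sketch.

```python
import torch


def align_position_bias(position_bias, key_states, compute_bias, query_length):
    # key_states: (batch, n_heads, key_len, head_dim)
    # position_bias: (batch, n_heads, query_len, key_len) or None
    key_length = key_states.shape[2]
    if position_bias is None or position_bias.shape[3] == key_length:
        return position_bias
    if position_bias.shape[3] < key_length:
        # Prefix tokens were added in this layer: pad the bias with zeros
        # in front of the key dimension so the shapes line up again.
        pad = key_length - position_bias.shape[3]
        zeros = position_bias.new_zeros(position_bias.shape[:3] + (pad,))
        return torch.cat([zeros, position_bias], dim=3)
    # This layer sees fewer keys (adapter dropped here): recompute from scratch.
    return compute_bias(query_length, key_length)
```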
Hi,
Here is a notebook reproducing the issue:
https://gist.github.com/FahadEbrahim/66686814f02978da9d4376470356647d
The error is:
RuntimeError: The size of tensor a (90) must match the size of tensor b (80) at non-singleton dimension 3.