Environment info

adapter-transformers version: v3.0.1+ (commit 11bd9d2)

Information

Model I am using (Bert, XLNet ...): XLMR
Language I am using the model on (English, Chinese ...):
Adapter setup I am using (if any):
The problem arises when using:
The tasks I am working on is:

To reproduce
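Roughly, the setup is: load an XLM-R adapter model, add a new embedding cloned from the reference embedding, then call `train_adapter()`. A hypothetical sketch of that flow (model name, embedding name, and the `add_embeddings` arguments are my assumptions from the v3 embeddings API, not the exact code from this report):

```python
from transformers import AutoAdapterModel, AutoTokenizer

model = AutoAdapterModel.from_pretrained("xlm-roberta-base")
ref_tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
new_tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")  # stand-in for a target-language tokenizer

# add a new embedding matrix initialised from the reference one
# (argument names assumed from the adapter-transformers v3 embeddings docs)
model.add_embeddings("new_lang", new_tokenizer, reference_embedding="default", reference_tokenizer=ref_tokenizer)
model.set_active_embeddings("new_lang")

model.add_adapter("lang_adapter")
model.train_adapter("lang_adapter", train_embeddings=True)  # fails in freeze_model()
```

This gives me the following error message: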
File /expscratch/eyang/workspace/adapter/adapter-transformers/src/transformers/adapters/model_mixin.py:949, in ModelWithHeadsAdaptersMixin.train_adapter(self, adapter_setup, train_embeddings)
947 super().train_adapter(adapter_setup, train_embeddings)
948 else:
--> 949 self.base_model.train_adapter(adapter_setup, train_embeddings)
File /expscratch/eyang/workspace/adapter/adapter-transformers/src/transformers/adapters/model_mixin.py:287, in ModelAdaptersMixin.train_adapter(self, adapter_setup, train_embeddings)
285 """Sets the model into mode for training the given adapters."""
286 self.train()
--> 287 self.freeze_model(True)
288 adapter_setup = parse_composition(adapter_setup)
289 self.apply_to_adapter_layers(lambda i, layer: layer.enable_adapters(adapter_setup, True, False))
File /expscratch/eyang/workspace/adapter/adapter-transformers/src/transformers/adapters/model_mixin.py:726, in ModelAdaptersMixin.freeze_model(self, freeze)
724 # first freeze/ unfreeze all model weights
725 for param in self.base_model.parameters():
--> 726 param.requires_grad = not freeze
727 self.model_frozen = freeze
RuntimeError: you can only change requires_grad flags of leaf variables. If you want to use a computed variable in a subgraph that doesn't require differentiation use var_no_grad = var.detach().
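For context on the error itself: PyTorch only allows changing `requires_grad` on leaf tensors (tensors created directly, such as `nn.Parameter`s); anything produced by an operation on another tensor is non-leaf and has to be `detach()`ed first. A standalone illustration in plain PyTorch, independent of the adapter code:

```python
import torch
import torch.nn as nn

p = nn.Parameter(torch.randn(4, 8))  # a leaf: created directly by the user
p.requires_grad = False              # toggling the flag on a leaf is fine
p.requires_grad = True

computed = p * 1.0                   # non-leaf: result of an op, carries a grad_fn
try:
    computed.requires_grad = False   # raises the same RuntimeError as above
except RuntimeError as err:
    print(err)

safe = computed.detach()             # detach() yields a leaf again
safe.requires_grad = False           # fine
```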
I used the following code to see which specific parameter is causing the issue:

embeddings.word_embeddings.weight

and got

you can only change requires_grad flags of leaf variables. If you want to use a computed variable in a subgraph that doesn't require differentiation use var_no_grad = var.detach().

Interestingly, a couple of other snippets I tried are fine. So the referencing seems to be the part that breaks the computational graph (which I'm not sure why, as the parameters were cloned when adding the new embeddings...).
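One way to localise which entries trip `freeze_model()` is to scan the model for parameters that are not leaves; a small generic PyTorch check (offered as a suggestion, not taken from the report above):

```python
# Print every entry returned by named_parameters() that freeze_model()
# cannot toggle, i.e. tensors still attached to an autograd graph.
for name, param in model.named_parameters():
    if not param.is_leaf:
        print(name, tuple(param.shape), param.grad_fn)
```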
Expected behavior
Should be able to train stuff...
The fix:

- Introduces a new `EmbeddingAdaptersWrapperMixin` to make embedding methods available to heads model classes. This is implemented in new per-model heads mixins. Closes #382.
- Fixes size issues with embeddings. Closes #383.
- Detach embedding weights before cloning. Closes #384.
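For illustration, a minimal sketch of what "detach embedding weights before cloning" means when a new embedding matrix is initialised from a reference one (hypothetical shapes and token-id mapping, not the library's actual implementation):

```python
import torch
import torch.nn as nn

ref = nn.Embedding(30000, 768)                     # reference (pretrained) embeddings
new_weight = torch.normal(0.0, 0.02, (5000, 768))  # freshly initialised rows for the new vocab

shared = torch.arange(100)                         # hypothetical ids present in both vocabularies
# detach() first, so the copied rows are plain leaf data; without it they would
# stay attached to the reference embedding's autograd graph
new_weight[shared] = ref.weight[shared].detach().clone()

new_emb = nn.Embedding.from_pretrained(new_weight, freeze=False)
print(new_emb.weight.is_leaf)                      # True -> requires_grad can be toggled when (un)freezing
```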