An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
My own task or dataset (give details below)
Reproduction
I am not sure how to articulate this silent bug with a code snippet. I will try to explain it referring to the code and hopefully it will be clear.
The compute_loss function in the Trainer class gets the model_name from model.base_model when the model is a PeftModel as written here. The base_model of a PeftModel is defined here as one of PEFT_TYPE_TO_MODEL_MAPPING. The 'true' base model is actually stored at base_model.model as declared here.
The issue is that, in this line of the compute_loss method, the decision to shift labels is made by checking whether model_name is in the MODEL_FOR_CAUSAL_LM_MAPPING_NAMES list. For a PeftModel, model_name comes from the PEFT wrapper class rather than from the underlying model, so it isn't in that list and the labels aren't shifted.
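The mismatch described above can be sketched with stand-in classes. Nothing here imports transformers or peft: the classes, the trimmed-down `CAUSAL_LM_NAMES` mapping, and the helpers `class_name` and `should_shift_labels` are hypothetical and only mimic the attribute layout from the description (a wrapper at `base_model`, the true model at `base_model.model`).

```python
# Stand-in classes only: none of these are the real transformers or peft
# implementations. They mimic just the attribute layout described in the issue.

# Hypothetical, trimmed-down stand-in for MODEL_FOR_CAUSAL_LM_MAPPING_NAMES.
CAUSAL_LM_NAMES = {"llama": "LlamaForCausalLM", "gpt2": "GPT2LMHeadModel"}


class LlamaForCausalLM:
    """Stand-in for the 'true' base model."""


class LoraModel:
    """Stand-in for one of the classes in PEFT_TYPE_TO_MODEL_MAPPING."""

    def __init__(self, model):
        self.model = model  # the true base model lives at base_model.model


class PeftModel:
    """Stand-in for the outer PEFT wrapper."""

    def __init__(self, model):
        self.base_model = LoraModel(model)


def class_name(obj):
    # Plays the role of the name lookup done before the label-shift check.
    return type(obj).__name__


def should_shift_labels(name):
    # Mirrors the membership test described in the issue.
    return name in CAUSAL_LM_NAMES.values()


peft_model = PeftModel(LlamaForCausalLM())

name_from_wrapper = class_name(peft_model.base_model)
name_from_inner = class_name(peft_model.base_model.model)

print(name_from_wrapper, should_shift_labels(name_from_wrapper))  # LoraModel False
print(name_from_inner, should_shift_labels(name_from_inner))      # LlamaForCausalLM True
```

In this toy layout, the name taken from `base_model` never matches a causal LM entry, while the name taken from `base_model.model` does.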
From what I can tell, a simple fix without breaking anything else could be to modify this line to:
System Info
transformers version: 4.35.0.dev0
Who can help?
@pacman100, @muellerzr
Expected behavior
The labels should be shifted for causal language modelling tasks even when using peft models.
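To illustrate why a missing shift is a silent bug, here is a toy sketch in plain Python. It assumes a hypothetical perfect next-token model and uses strings instead of logits and label tensors; `shift_for_causal_lm` and `accuracy` are illustrative helpers, not functions from transformers.

```python
def shift_for_causal_lm(predictions, labels):
    """Pair the prediction made at position i with the label at position i + 1."""
    return list(zip(predictions[:-1], labels[1:]))


def accuracy(pairs):
    """Fraction of (prediction, label) pairs that match."""
    return sum(pred == label for pred, label in pairs) / len(pairs)


tokens = ["the", "cat", "sat", "down"]

# Hypothetical outputs of a perfect next-token model: at each position it
# predicts the token that follows.
predictions = ["cat", "sat", "down", "<eos>"]

# In causal language modelling the labels are the input tokens themselves.
labels = tokens

unshifted = list(zip(predictions, labels))
shifted = shift_for_causal_lm(predictions, labels)

print(accuracy(unshifted))  # 0.0
print(accuracy(shifted))    # 1.0
```

Without the shift, even a perfect model is scored as if every prediction were wrong, and nothing raises an error, so the loss is computed against the wrong targets without any visible failure.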