Trainer doesn't shift labels for CAUSAL_LM PEFT models with label smoothing enabled #27161

kkteru · 2023-10-30T22:25:44Z

System Info

transformers version: 4.35.0.dev0
Platform: Linux-5.15.90.1-microsoft-standard-WSL2-x86_64-with-glibc2.35
Python version: 3.10.12
Huggingface_hub version: 0.16.4
Safetensors version: 0.3.2
Accelerate version: 0.24.1
Accelerate config: not found
PyTorch version (GPU?): 2.0.1+cu118 (True)
Tensorflow version (GPU?): 2.14.0 (True)
Flax version (CPU?/GPU?/TPU?): 0.7.0 (cpu)
Jax version: 0.4.13
JaxLib version: 0.4.13
Using GPU in script?: RTX 3090
Using distributed or parallel set-up in script?: No

Who can help?

@pacman100, @muellerzr

Information

The official example scripts
My own modified scripts

Tasks

An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
My own task or dataset (give details below)

Reproduction

I am not sure how to articulate this silent bug with a code snippet. I will try to explain it referring to the code and hopefully it will be clear.

The compute_loss function in the Trainer class gets the model_name from model.base_model when the model is a PeftModel as written here. The base_model of a PeftModel is defined here as one of PEFT_TYPE_TO_MODEL_MAPPING. The 'true' base model is actually stored at base_model.model as declared here.

The issue is in this line of the compute_loss method: the check that decides whether to shift labels tests if model_name is among the names in MODEL_FOR_CAUSAL_LM_MAPPING_NAMES. For a PeftModel, model_name is the name of the PEFT wrapper class (for example, LoraModel), which is not in that mapping, so the labels aren't shifted.

From what I can tell, a simple fix without breaking anything else could be to modify this line to:

model_name = unwrap_model(model.base_model.model)._get_name()
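To illustrate the lookup problem without installing transformers or peft, here is a minimal sketch using toy stand-in classes. The class layout (`PeftModel.base_model` holding a tuner, which holds the real model at `.model`) follows the description above; `CAUSAL_LM_CLASS_NAMES` and `_Named` are hypothetical stand-ins, not the real library objects.

```python
# Hypothetical stand-in for the class names in MODEL_FOR_CAUSAL_LM_MAPPING_NAMES.
CAUSAL_LM_CLASS_NAMES = {"LlamaForCausalLM", "GPT2LMHeadModel"}


class _Named:
    def _get_name(self):
        # Mirrors torch.nn.Module._get_name(), which returns the class name.
        return self.__class__.__name__


class LlamaForCausalLM(_Named):
    """Toy stand-in for the real transformers model class."""


class LoraModel(_Named):
    """Toy stand-in for a PEFT tuner; the wrapped model lives at .model."""

    def __init__(self, model):
        self.model = model


class PeftModel(_Named):
    """Toy stand-in for peft.PeftModel; .base_model is the tuner."""

    def __init__(self, model):
        self.base_model = LoraModel(model)


peft_model = PeftModel(LlamaForCausalLM())

# Lookup as described in the issue: name of the tuner class.
name_before = peft_model.base_model._get_name()
# Lookup with the proposed fix: name of the wrapped model class.
name_after = peft_model.base_model.model._get_name()

should_shift_before = name_before in CAUSAL_LM_CLASS_NAMES
should_shift_after = name_after in CAUSAL_LM_CLASS_NAMES

print(name_before, should_shift_before)
print(name_after, should_shift_after)
```

In this toy setup the first lookup yields the tuner's class name, so the membership check fails; going one level deeper yields the causal LM class name and the check passes.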

Expected behavior

The labels should be shifted for causal language modelling tasks even when using peft models.
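For readers unfamiliar with why the shift matters, here is a rough sketch in plain Python, assuming the usual causal LM convention that the output at position t predicts the token at position t + 1. The helper names and the toy token ids are made up for illustration; this is not the Trainer's label smoother.

```python
def shift_for_causal_lm(predictions, labels):
    """Align the prediction at position t with the token at position t + 1."""
    return predictions[:-1], labels[1:]


def count_matches(predictions, labels):
    """Count positions where the prediction equals the label."""
    return sum(p == l for p, l in zip(predictions, labels))


# Toy sequence; in causal LM training the labels are a copy of the input ids.
labels = [10, 11, 12, 13]
# A hypothetical perfect model: at each position it predicts the NEXT token.
# The last prediction (99) has no target inside the sequence.
predictions = [11, 12, 13, 99]

unshifted = count_matches(predictions, labels)
shifted = count_matches(*shift_for_causal_lm(predictions, labels))

print(unshifted, shifted)
```

Without the shift, even a perfect next-token predictor is scored against the wrong targets, which is why a skipped shift silently corrupts the loss.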

kkteru · 2023-10-30T22:29:26Z

Created a PR for this quick fix, if that helps.

amyeroberts · 2023-10-31T09:31:18Z

cc @younesbelkada

younesbelkada · 2023-10-31T13:16:36Z

Thanks for the deep dive! I will reply to you on the PR itself.

kkteru mentioned this issue Oct 30, 2023

Fixed base model class name extraction from PeftModels #27162

Merged

amyeroberts closed this as completed in #27162 Nov 2, 2023
