
The loss of sharegpt format #5828

Open
1 task done
ZijunSong opened this issue Oct 25, 2024 · 0 comments
Labels
pending This problem is yet to be addressed

Comments


Reminder

  • I have read the README and searched the existing issues.

System Info

  • llamafactory version: 0.9.0
  • Platform: Linux-3.10.0-1127.el7.x86_64-x86_64-with-glibc2.31
  • Python version: 3.11.10
  • PyTorch version: 2.0.1+cu118 (GPU)
  • Transformers version: 4.45.0
  • Datasets version: 2.21.0
  • Accelerate version: 0.33.0
  • PEFT version: 0.12.0
  • TRL version: 0.9.6
  • GPU type: NVIDIA A100-SXM4-40G

Reproduction

I noticed that when computing the loss during training on sharegpt-format data, the odd-numbered turns are treated as input and masked (not trained), while the even-numbered turns are treated as the output the LLM should generate, with the loss computed token by token. I have a few follow-up questions.
Are the inputs and outputs formed by concatenating the odd- and even-numbered turns respectively? That is, does the model see all turns as input and get trained on all even-numbered turns as output, with no option to train only on the content of the last even-numbered turn?
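To make the question concrete, here is a minimal sketch of the masking scheme as I understand it (a typical approach, not necessarily LLaMA-Factory's exact implementation): all turns are concatenated into one token sequence, user-turn positions receive the ignore label -100 in the labels array, and every assistant turn keeps its token ids, so the loss covers all assistant turns rather than only the last one. The function name and the toy token ids are hypothetical.

```python
IGNORE_INDEX = -100  # default ignore_index of PyTorch CrossEntropyLoss

def build_inputs_and_labels(turns):
    """Concatenate alternating (role, token_ids) turns into one sequence.

    User-turn tokens are masked out of the labels with IGNORE_INDEX,
    so the loss is computed only on assistant-turn tokens.
    """
    input_ids, labels = [], []
    for role, ids in turns:
        input_ids.extend(ids)
        if role == "assistant":
            labels.extend(ids)                         # trained, per token
        else:
            labels.extend([IGNORE_INDEX] * len(ids))   # masked, no loss
    return input_ids, labels

# Hypothetical two-round conversation with toy token ids
turns = [
    ("user", [1, 2, 3]),
    ("assistant", [4, 5]),
    ("user", [6, 7]),
    ("assistant", [8, 9, 10]),
]
ids, labels = build_inputs_and_labels(turns)
print(ids)     # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(labels)  # [-100, -100, -100, 4, 5, -100, -100, 8, 9, 10]
```

Under this scheme, both assistant turns (`[4, 5]` and `[8, 9, 10]`) contribute to the loss; training only on the final assistant turn would require masking the earlier assistant turns as well.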

Expected behavior

No response

Others

No response

@github-actions github-actions bot added the pending This problem is yet to be addressed label Oct 25, 2024