Finetune with deepspeed: type mismatch #35

Open
YeZiyi1998 opened this issue Jun 7, 2024 · 3 comments
Comments

@YeZiyi1998

I encountered an issue while finetuning the officially released code with DeepSpeed. Here is the detailed error message:

File "/lib/python3.11/site-packages/deepspeed/runtime/zero/linear.py", line 57, in forward
output = input.matmul(weight.t())
RuntimeError: expected mat1 and mat2 to have the same dtype, but got: float != c10::BFloat16

It appears that the matmul operation expects both input tensors to have the same dtype; in my case, one tensor is float32 and the other is BFloat16.
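For reference, a minimal standalone sketch (shapes are made up; no DeepSpeed required) that triggers the same RuntimeError:

    import torch

    # Made-up shapes; only the dtype mismatch matters here.
    x = torch.randn(4, 8, dtype=torch.float32)    # activations left in float32
    w = torch.randn(16, 8, dtype=torch.bfloat16)  # weights cast to bfloat16

    out = x.matmul(w.t())  # RuntimeError: expected mat1 and mat2 to have the same dtype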

I am not sure if this is a bug in the DeepSpeed library or an issue with my usage. I would appreciate any assistance in resolving this issue.

@lihaoling

same question

@JensenDong

same + 1

@yiyepiaoling0715

I encountered the same problem, and here's how I solved it: modify lines 425 and 428 in the modeling_deepseek.py file and remove the torch.float32 casts, as in the following code:

        logits = F.linear(
            hidden_states, self.weight, None
        )
        if self.scoring_func == "softmax":
            scores = logits.softmax(dim=-1)
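
A hypothetical alternative sketch (not from this thread, just a common workaround): instead of deleting the casts, align the activation dtype with whatever dtype the gate weight actually has, so the two matmul operands always match:

        # Hypothetical variant: cast activations to the weight's dtype
        # so both matmul inputs agree regardless of training precision.
        logits = F.linear(
            hidden_states.to(self.weight.dtype), self.weight, None
        )
        if self.scoring_func == "softmax":
            scores = logits.softmax(dim=-1)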

