Commit

[Bug Fix] fix allreduce tensor dtype (PaddlePaddle#7876)
* [Bug Fix] fix allreduce tensor dtype

Reason: some CCL (collective communication library) backends do not support the bool dtype

* update int8 to int32
BeingGod authored and xysheng-baidu committed Feb 22, 2024
1 parent 724b524 commit ed42f16
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion paddlenlp/trainer/trainer.py
@@ -631,7 +631,7 @@ def train(
         # The resume_from_checkpoint could be None in some machine node.
         # Here we reset None to temp directory.
         if args.world_size > 1:
-            is_resume_from_checkpoint = paddle.to_tensor([resume_from_checkpoint is not None])
+            is_resume_from_checkpoint = paddle.to_tensor([resume_from_checkpoint is not None], dtype="int32")
             paddle.distributed.all_reduce(is_resume_from_checkpoint)
             is_resume_from_checkpoint = is_resume_from_checkpoint.item()
             if is_resume_from_checkpoint > 0 and is_resume_from_checkpoint < paddle.distributed.get_world_size():
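
The check in this hunk can be illustrated without a distributed runtime. The sketch below is not the trainer's code: it assumes the all-reduce is a sum (which the comparison against the world size suggests) and simulates it with a plain Python `sum`. The helper names `simulated_all_reduce_sum` and `checkpoint_state` are hypothetical and exist only for this illustration. Each rank contributes an integer flag (0 or 1) rather than a bool, mirroring the `dtype="int32"` change.

```python
from typing import List, Optional


def simulated_all_reduce_sum(per_rank_values: List[int]) -> int:
    """Hypothetical stand-in for a SUM all-reduce: every rank sees the same total."""
    return sum(per_rank_values)


def checkpoint_state(per_rank_checkpoints: List[Optional[str]]) -> str:
    """Classify whether the ranks agree on having a checkpoint to resume from."""
    world_size = len(per_rank_checkpoints)
    # Encode each rank's flag as an int (0 or 1) instead of a bool,
    # mirroring the dtype="int32" change in the diff.
    flags = [int(path is not None) for path in per_rank_checkpoints]
    total = simulated_all_reduce_sum(flags)
    if total == 0:
        return "none"      # no rank has a checkpoint
    if total < world_size:
        return "partial"   # the case the trainer's `if` condition detects
    return "all"           # every rank has a checkpoint


if __name__ == "__main__":
    print(checkpoint_state(["ckpt-100", None, "ckpt-100", None]))
```

With integer flags the reduced value is a count of ranks that have a checkpoint, so `0 < count < world_size` identifies the mixed case where only some nodes can resume.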