Add twenty redundant data in post pretrain #8777

JunnYu · 2024-07-18T04:13:12Z

PR types

BUG

PR changes

APIs

Description

Same as https://github.com/PaddlePaddle/PaddleNLP/pull/8776.
Add more samples: 20 this time, since I am concerned that with only 10 the index could still go out of range.
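
To illustrate why a small margin of redundant samples can prevent an out-of-range index, here is a minimal sketch. It is not PaddleNLP's actual code: the function names, the greedy blending rule, and the `margin` parameter are assumptions made for illustration only. It shows that when per-dataset sample counts are rounded down, a blended index can ask one dataset for more samples than were built, and that a fixed margin absorbs the difference.

```python
def build_blending_counts(weights, size):
    """Hypothetical greedy blend: at each step draw from the dataset that is
    furthest below its target share. Returns how many samples each dataset
    is asked to provide."""
    counts = [0] * len(weights)
    for i in range(size):
        step = max(i, 1)
        # Gap between the target share so far and what has been drawn.
        errors = [w * step - c for w, c in zip(weights, counts)]
        chosen = errors.index(max(errors))
        counts[chosen] += 1
    return counts


def samples_available(weights, size, margin=0):
    """Samples built per dataset: rounded-down share plus a safety margin
    (margin is an assumed stand-in for the redundant samples in this PR)."""
    return [int(w * size) + margin for w in weights]


weights = [1 / 3, 1 / 3, 1 / 3]
size = 10

requested = build_blending_counts(weights, size)
tight = samples_available(weights, size, margin=0)
padded = samples_available(weights, size, margin=20)

# Without a margin, at least one dataset is asked for more than it holds.
overflow_without_margin = any(r > a for r, a in zip(requested, tight))
# With 20 redundant samples, every request fits.
overflow_with_margin = any(r > a for r, a in zip(requested, padded))
```

With three equally weighted datasets and 10 blended samples, each dataset builds only 3 samples when rounding down, so 9 in total, and the tenth draw necessarily overruns one of them.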

paddle-bot · 2024-07-18T04:13:17Z

Thanks for your contribution!

* Add 20 more samples to the dataset to prevent errors in the blend dataset

* quick fix from pretrained. (#8487)
* quick fix os.path.split (#8508)
* Cp/fix (#8569)
* [Safetensors] Fix fast safe open slice. (#8512)
* [FIX DDP] fix ddp (#8549)
* [BUG] Fix build train valid test datasets (#8823)
* Update causal_dataset.py
* Add twenty redundant data in post pretrain (#8777)
* Add 20 more samples to the dataset to prevent errors in the blend dataset
* Round num_samples down to prevent dataset overflow (#8691)
* update release_grads (#8834)
* update release_grads (#8834)
* [Trainer] Fix release_grads (#9085)
* fix pp release_grads
* add dataloader_drop_last to evaldataloader (#8773)
* bugfix
* Fix eval hang (#9052)
* fix pipeline eval
* fix eval dataloader_num_workers

---------

Co-authored-by: Zhong Hui <[email protected]>
Co-authored-by: yujun <[email protected]>
Co-authored-by: gongel <[email protected]>

JunnYu added 2 commits July 18, 2024 11:48

Add 10 more samples to the dataset to prevent errors in the blend dataset

1c3ec97

Add 20 more samples to the dataset to prevent errors in the blend dataset

6d1c8b3

Merge branch 'release/2.8' into add_ten_redundant_data_in_post_pretrain

97b8c3e

ZHUI approved these changes Jul 18, 2024

JunnYu merged commit 157f7d3 into PaddlePaddle:release/2.8 Jul 18, 2024
4 of 5 checks passed

DesmonDay pushed a commit to DesmonDay/PaddleNLP that referenced this pull request Sep 5, 2024

Add twenty redundant data in post pretrain (PaddlePaddle#8777)

6e24524

* Add 20 more samples to the dataset to prevent errors in the blend dataset
