[taskflow]SentenceFeatureExtractionTask #1

Open · wants to merge 1,380 commits into base: develop
Conversation

qingzhong1 (Owner)

PR types

PR changes

Description

qingzhong1 pushed a commit that referenced this pull request Sep 18, 2023
Improve prompt; fix <br> bug
deepllz and others added 29 commits August 6, 2024 17:24
* Fix the issue of P-tuning official sample error

* Create codecov workflow.yml

* Delete .github/workflows/workflow.yml
* modify api for pir

* modify api for pir

* pass none for while

* modify ci test
* update

* update

* remove
* Update README.md

* Update README_en.md
* stage 1

* update

* update

* fix ci

* fix ci

* fix ut
* modify tensorboard requirements

* modify tensorboard requirements

* modify tensorboard requirement
* stage 1

* update

* update

* support qwen2 bf16/wint8

* add qwen2 ptq map

* update

* fix tune_cublaslt_gemm.cu
* add yuan model

* add yuan model settings

* fix conflict

* add readme

* update format

* update readme

* update readme

* update for lint

* update for lint

* update for lint

* update for lint

* add fp16

* update utils

* fix bug

* format

* correct pre-commit

* correct fp16

* delete rearrange

* add pre_train

* auto convert from torch

* support sft lora

* update structure

* fix yuantokenizer pad

* update scripts

* update readme

* update readme

* add pad_token_id

* format

* format

* update readme

* update for review

* update for review

* fix bug

* update to CRLF

* format

* support param convert

* update sft config

* fix modeling

* fix qk fuse&split and fix fa

* format

* format

* format

* fix fa dtype

---------

Co-authored-by: drownfish19 <[email protected]>
* add yuan2 model list

* update
* set logging_step to 5 with baichuan && qwen benchmark
* support llama3.1

* update
* fix bug

* update
DrownFish19 and others added 30 commits October 31, 2024 15:38
* add precision alignment doc
* Add ci for A100 and V100 device

* Fix device judge

* fix compare of float var

* Add output of test failed

* Fix cal fail test count

* Remove compare_float func, use `!=` to compare

* Fix output of fail tests

* Run gpt-3_dygraph when related files modified

* add track status (fail tests) for every case

* Add track status of loss verification failed test

* Move output position of every case

* modify `exit` to `return 0`
* add fp8 gen files to gitignore

* append_attn support fp8 quant

* Unified FP8 Network

* include cuda_fp8.h

* simplify qwen2 network and FusedBlockMultiTransformerFP8

* simplify llama network and code check

* check fp8 params

* code check

* check

* default config for fp8 gemm

* Add support to register tokenizer, fix some typos

* add more tests

* CustomTokenizerFast2->CustomTokenizerFastWithoutSlow

* lint
* support optimizer state offload (#8715)

* support optimizer offload

* update doc

* [FleetY] offload optimizer state after load optimizer state (#9352)

* add offload optimizer

* fix memory

* [FleetY] Add reload/offload for optimizer (#9356)

* add reload/offload for optimizer

* fix bug

---------

Co-authored-by: Guoxia Wang <[email protected]>
* optimize fuse some kernels

* optimize fuse some kernels

* fix top_p reject

* fix

* ci

* fix review

* fix
* Fix exitcode bug

* Fix `track_case_status` func match bug

* Fix return code

* Fix print_info func with exit -6

* set output format of fail tests
modify verification check failed
* fix empty state_dict

* update sharding split_param
* add submodule

* add submodule
* fix ci scripts

* [CI]add recursive for submodule
* fix ci scripts

* [CI]add recursive for submodule

* [CI]fix
* add ktotrainer

* fix oom
* update target_lists_for_llm

* add scripts/regression
* add the AutoModel for inference mode in dynamic graph

* add the AutoModel for inference mode in static graph

* create AutoInferenceModelForCausalLM class and polish the code

* fix the return result

* add the confirm_inference_model method

* roll back the AutoModel

* modify the description

* update