[INFER][LLM] Support qwen in fined grained dybatch v1 #7644

Merged
merged 23 commits into from
Jan 9, 2024

Contributor

@DanGuge DanGuge commented Dec 13, 2023

PR types

PR changes

Description

  • Dynamic-graph inference can be run with the following command:
python predictor.py \
    --model_name_or_path "qwen/qwen-7b" \
    --inference_model \
    --dtype "float16"


paddle-bot bot commented Dec 13, 2023

Thanks for your contribution!


codecov bot commented Dec 13, 2023

Codecov Report

Attention: 213 lines in your changes are missing coverage. Please review.

Comparison is base (b02c716) 57.48% compared to head (4d478e8) 57.12%.
Report is 56 commits behind head on develop.

Files                                                    Patch %   Lines
...ddlenlp/experimental/transformers/qwen/modeling.py    0.00%     211 Missing ⚠️
paddlenlp/experimental/transformers/__init__.py          0.00%     1 Missing ⚠️
...ddlenlp/experimental/transformers/qwen/__init__.py    0.00%     1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #7644      +/-   ##
===========================================
- Coverage    57.48%   57.12%   -0.37%     
===========================================
  Files          583      587       +4     
  Lines        87187    88190    +1003     
===========================================
+ Hits         50123    50376     +253     
- Misses       37064    37814     +750     


Contributor

wj-Mcat commented Dec 28, 2023

Judging from the PaddleNLP-CI logs, the precision is not aligned.

Contributor Author

DanGuge commented Dec 28, 2023

Judging from the PaddleNLP-CI logs, the precision is not aligned.

@ZHUI ZHUI changed the title [llm]support qwen in fined grained dybatch v1 [INFER][LLM] Support qwen in fined grained dybatch v1 Jan 2, 2024
@ZHUI ZHUI added the inference label Jan 2, 2024
Contributor Author

DanGuge commented Jan 6, 2024

  • The non-inference_model implementation of qwen has precision problems during multi-batch inference
  • Working directory: PaddleNLP/llm

qwen inference

  • batch_size = 1
#!/bin/bash

python3 predictor.py \
    --model_name_or_path qwen/qwen-7b \
    --decode_strategy greedy_search \
    --batch_size 1 \
    --dtype float16

[Image: fine_1_greed]

  • batch_size = 2
#!/bin/bash

python3 predictor.py \
    --model_name_or_path qwen/qwen-7b \
    --decode_strategy greedy_search \
    --batch_size 2 \
    --dtype float16

[Image: fine_2_greed]

qwen inference-model inference

  • batch_size = 1
#!/bin/bash

python3 predictor.py \
    --model_name_or_path qwen/qwen-7b \
    --decode_strategy greedy_search \
    --batch_size 1 \
    --inference_model \
    --dtype float16

[Image: fused_1_greed]

  • batch_size = 2
#!/bin/bash

python3 predictor.py \
    --model_name_or_path qwen/qwen-7b \
    --decode_strategy greedy_search \
    --batch_size 2 \
    --inference_model \
    --dtype float16

[Image: fused_2_greed]
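
The four runs above differ only in `--batch_size` and in whether `--inference_model` is passed, so they could be driven from one loop. The following is a minimal sketch, not part of the PR: `build_cmd` and the `DRY_RUN` variable are hypothetical names introduced here for illustration, and the script only prints the commands unless `DRY_RUN=0` is set. It assumes it is run from PaddleNLP/llm, as stated above.

```shell
#!/bin/bash
# Sketch only: build_cmd and DRY_RUN are hypothetical helpers, not part of PaddleNLP.

# Print the predictor.py command line for one configuration.
#   $1: batch size (1 or 2 in the runs above)
#   $2: "yes" to add --inference_model, anything else to omit it
build_cmd() {
    local batch_size="$1"
    local use_inference_model="$2"
    local cmd="python3 predictor.py --model_name_or_path qwen/qwen-7b --decode_strategy greedy_search --batch_size ${batch_size}"
    if [ "${use_inference_model}" = "yes" ]; then
        cmd="${cmd} --inference_model"
    fi
    echo "${cmd} --dtype float16"
}

# Enumerate the same four configurations as the manual runs.
for mode in no yes; do
    for bs in 1 2; do
        cmd="$(build_cmd "${bs}" "${mode}")"
        if [ "${DRY_RUN:-1}" = "1" ]; then
            # Default: only show what would be executed.
            echo "${cmd}"
        else
            # Relies on word splitting; safe here because no argument contains spaces.
            ${cmd}
        fi
    done
done
```

Keeping the flag set in one place makes it harder for the batch_size 1 and batch_size 2 runs to drift apart when comparing their outputs.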

cc @wj-Mcat

Contributor

@wj-Mcat wj-Mcat left a comment


LGTM

Collaborator

@wawltor wawltor left a comment


LGTM

@wawltor wawltor merged commit 37b4fe0 into PaddlePaddle:develop Jan 9, 2024
7 of 9 checks passed