
[Inference LLM] support static c8 #8833

Merged
merged 8 commits into from
Aug 5, 2024

Conversation

yuanlehome
Collaborator

PR types

New features

PR changes

Models

Description

Support static c8 (cache-KV int8) quantization for inference.


paddle-bot bot commented Jul 30, 2024

Thanks for your contribution!


codecov bot commented Jul 30, 2024

Codecov Report

Attention: Patch coverage is 0% with 61 lines in your changes missing coverage. Please review.

Project coverage is 55.49%. Comparing base (ee4944e) to head (7f157ba).
Report is 243 commits behind head on develop.

Files with missing lines Patch % Lines
...dlenlp/experimental/transformers/llama/modeling.py 0.00% 18 Missing ⚠️
...erimental/transformers/fused_transformer_layers.py 0.00% 10 Missing ⚠️
...ddlenlp/experimental/transformers/qwen/modeling.py 0.00% 10 Missing ⚠️
...dlenlp/experimental/transformers/bloom/modeling.py 0.00% 5 Missing ⚠️
...enlp/experimental/transformers/chatglm/modeling.py 0.00% 5 Missing ⚠️
...p/experimental/transformers/chatglm_v2/modeling.py 0.00% 5 Missing ⚠️
...addlenlp/experimental/transformers/gpt/modeling.py 0.00% 5 Missing ⚠️
...enlp/experimental/transformers/generation_utils.py 0.00% 3 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #8833      +/-   ##
===========================================
+ Coverage    55.44%   55.49%   +0.05%     
===========================================
  Files          631      631              
  Lines        98542    98554      +12     
===========================================
+ Hits         54632    54697      +65     
+ Misses       43910    43857      -53     

☔ View full report in Codecov by Sentry.

@@ -435,7 +435,7 @@ def __init__(self, config: LlamaConfig):
ffn1_bias_attrs = None
ffn2_bias_attrs = None

-        if self.quant_type == "a8w8":
+        if "a8w8" in self.quant_type:
Collaborator

If `in` is used for substring matching on self.quant_type, is the full set of possible quant_type values documented anywhere, for the benefit of later development?

Collaborator Author

This part is fine; it deliberately leaves room for future extension. The full set of possible quant_type values follows the pattern a_w_c_, where each _ is a bit-width digit.

Collaborator

Sounds good; this can also be covered in the documentation later.
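The a_w_c_ naming scheme described above can be sketched as a small parser. This is a hypothetical helper for illustration, not code from the PR; the function name and return shape are assumptions:

```python
import re

# Hypothetical helper illustrating the a_w_c_ quant_type scheme
# discussed above; not part of the PR itself.
def parse_quant_type(quant_type: str) -> dict:
    """Parse strings like "a8w8" or "a8w8c8" into per-component bit widths.

    Components: a = activation, w = weight, c = cache KV.
    Components absent from the string map to None.
    """
    names = {"a": "activation", "w": "weight", "c": "cachekv"}
    bits = {"activation": None, "weight": None, "cachekv": None}
    for letter, width in re.findall(r"([awc])(\d+)", quant_type):
        bits[names[letter]] = int(width)
    return bits
```

Under this scheme, the substring test from the diff (`"a8w8" in quant_type`) matches both plain "a8w8" and the new static-c8 variant "a8w8c8".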

self.weight_only_quant_bits = config.weight_only_quant_bits

if self.quant_type is not None and "weight_only_int" in self.quant_type:
if config.quant_type == "weight_only_int8":
Collaborator

Could this be implemented by checking whether weight_only appears in the string? Is the naming convention documented anywhere?

Collaborator Author

Done, added~
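The substring-based check the reviewer suggests can be sketched as follows. The helper name is hypothetical; the real code reads config.quant_type inline:

```python
# Hypothetical helper; illustrates the substring check rather than
# equality against a single literal like "weight_only_int8".
def is_weight_only(quant_type) -> bool:
    # The substring check covers both weight_only_int8 and
    # weight_only_int4, and guards against quant_type being None.
    return quant_type is not None and "weight_only_int" in quant_type
```

This mirrors the `self.quant_type is not None and "weight_only_int" in self.quant_type` condition shown in the diff above.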

DrownFish19 previously approved these changes Jul 30, 2024
Collaborator

DrownFish19 left a comment

LGTM

@wawltor wawltor merged commit a6a7870 into PaddlePaddle:develop Aug 5, 2024
9 of 12 checks passed