[Inference LLM] support static c8 #8833
Thanks for your contribution!
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##           develop    #8833      +/-   ##
===========================================
+ Coverage    55.44%   55.49%   +0.05%
===========================================
  Files          631      631
  Lines        98542    98554      +12
===========================================
+ Hits         54632    54697      +65
+ Misses       43910    43857      -53
@@ -435,7 +435,7 @@ def __init__(self, config: LlamaConfig):
 ffn1_bias_attrs = None
 ffn2_bias_attrs = None

-if self.quant_type == "a8w8":
+if "a8w8" in self.quant_type:
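If `self.quant_type` is checked with `in` (substring matching), is there somewhere that lists all possible `quant_type` values, to make later development easier?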
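This part is fine; it leaves room for future extension. For example, every possible `quant_type` value has the form a_w_c_, where each _ stands for a number.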
OK, adding this to the documentation later is fine too.
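To make the naming scheme from this thread concrete, here is a minimal sketch, assuming `quant_type` strings follow the a_w_c_ pattern described above and that the `c` part may be absent (as in `a8w8`). The names `QuantSpec` and `parse_quant_type` are hypothetical and not part of PaddleNLP; reading the three letters as activation, weight, and cache bit widths is also an assumption.

```python
import re
from typing import NamedTuple, Optional


class QuantSpec(NamedTuple):
    """Hypothetical container for the numbers in an a_w_c_ string."""
    a: int            # number after "a" (assumed: activation bits)
    w: int            # number after "w" (assumed: weight bits)
    c: Optional[int]  # number after "c" (assumed: cache bits), None if absent


# Assumed grammar: "a<digits>w<digits>" with an optional "c<digits>" suffix.
_QUANT_TYPE_RE = re.compile(r"a(\d+)w(\d+)(?:c(\d+))?")


def parse_quant_type(quant_type: str) -> Optional[QuantSpec]:
    """Return the parsed numbers, or None if the string does not fit the pattern."""
    match = _QUANT_TYPE_RE.fullmatch(quant_type)
    if match is None:
        return None
    a, w, c = match.groups()
    return QuantSpec(int(a), int(w), int(c) if c is not None else None)


# The substring check in the diff accepts both spellings:
for name in ("a8w8", "a8w8c8"):
    print(name, "a8w8" in name, parse_quant_type(name))
```

A substring test such as `"a8w8" in quant_type` is permissive: it is true for any string containing `a8w8`, whereas a full-match parser like this one rejects strings outside the assumed grammar.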
 self.weight_only_quant_bits = config.weight_only_quant_bits

-if self.quant_type is not None and "weight_only_int" in self.quant_type:
+if config.quant_type == "weight_only_int8":
Could this instead check whether `weight_only` is contained in the string? Is the naming convention documented anywhere?
Added.
LGTM
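The two styles of check discussed in this thread (exact equality versus a substring test) can be compared with a small sketch. It assumes weight-only values are spelled `weight_only_int` followed by a bit width, as the two `if` lines suggest; the helper `weight_only_bits` is hypothetical and not taken from the PR.

```python
from typing import Optional

WEIGHT_ONLY_PREFIX = "weight_only_int"


def weight_only_bits(quant_type: Optional[str]) -> Optional[int]:
    """Return the bit width of a weight-only quant_type, or None otherwise.

    Handles None explicitly, since a bare `"..." in quant_type` would raise
    TypeError when quant_type is None.
    """
    if quant_type is None or not quant_type.startswith(WEIGHT_ONLY_PREFIX):
        return None
    suffix = quant_type[len(WEIGHT_ONLY_PREFIX):]
    # Reject an empty or non-numeric suffix instead of guessing a width.
    return int(suffix) if suffix.isdecimal() else None


for value in ("weight_only_int8", "a8w8c8", None):
    print(value, weight_only_bits(value))
```

Equality against a fixed string keeps each branch explicit, while a prefix or substring test accepts future widths without a code change; which trade-off fits here is the maintainers' call.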
26686ad to e358ce3
PR types: New features
PR changes: Models
Description: Support static c8.