
[GPTQ Enhance] Add GPTQ int8 weight unpack function #184

Merged

Merged 6 commits into main on Mar 22, 2024

Conversation

Zhenzhong1
Contributor

Type of Change

Feature Enhancement

Description

Add an int8 weight unpack function to support int8 GPTQ models.
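
For context, a minimal sketch of what such an unpack step typically looks like, assuming the common AutoGPTQ packing where four 8-bit values are stored per int32 word in `qweight`; the function name and layout details below are illustrative, not taken from this PR's code:

```python
import numpy as np

def unpack_gptq_int8_weight(qweight: np.ndarray) -> np.ndarray:
    """Unpack int32-packed 8-bit GPTQ weights into a uint8 matrix.

    Assumes the common AutoGPTQ layout: qweight has shape
    (in_features // 4, out_features), with four 8-bit values packed
    little-endian into each int32 along the input (row) dimension.
    """
    bits = 8
    pack_factor = 32 // bits  # 4 values per int32 word
    shifts = np.arange(pack_factor, dtype=np.uint32) * bits  # 0, 8, 16, 24
    # Reinterpret as unsigned, broadcast to (rows, pack_factor, cols),
    # shift each word and keep only the low byte.
    words = qweight.astype(np.uint32)[:, None, :]
    unpacked = (words >> shifts[None, :, None]) & 0xFF
    # Collapse back to (in_features, out_features).
    return unpacked.reshape(-1, qweight.shape[1]).astype(np.uint8)
```

Downstream conversion would then typically dequantize per group with something like `scales * (w.astype(np.float32) - zeros)`; the exact zero-point convention varies across GPTQ implementations.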

Expected Behavior & Potential Risk

N/A

How has this PR been tested?

Manual tests.

Dependency Change?

N/A

@Zhenzhong1 Zhenzhong1 marked this pull request as draft March 21, 2024 07:41
@Zhenzhong1 Zhenzhong1 marked this pull request as ready for review March 21, 2024 07:56
@Zhenzhong1 Zhenzhong1 marked this pull request as draft March 21, 2024 07:56
@Zhenzhong1 Zhenzhong1 marked this pull request as ready for review March 22, 2024 01:34
@Zhenzhong1 Zhenzhong1 changed the title from "[GPTQ Enhance] Add int8 unpack weight to support int8 GPTQ" to "[GPTQ Enhance] Add unpack int8 weight for GPTQ" on Mar 22, 2024
neural_speed/convert/convert_quantized_llama.py (outdated, resolved)
neural_speed/convert/common.py (outdated, resolved)
neural_speed/convert/common.py (resolved)
Contributor

@a32543254 a32543254 left a comment


LGTM

@VincyZhang VincyZhang merged commit ed6e8ad into main Mar 22, 2024
11 checks passed
@Zhenzhong1 Zhenzhong1 changed the title from "[GPTQ Enhance] Add unpack int8 weight for GPTQ" to "[GPTQ Enhance] Add GPTQ int8 weight unpack function" on Mar 25, 2024