[feature] add gptq for inference #4754

Merged

Xu-Kai merged 14 commits into hpcaitech:main from Xu-Kai:rebase_gptq_to_main

Sep 22, 2023

Commits on Sep 19, 2023

[gptq] add gptq kernel (hpcaitech#4416)

* add gptq

* refactor code

* fix tests

* replace auto-gptq

* rename inference/quant

* refactor test

* add auto-gptq as an option

* reset requirements

* change assert and check auto-gptq

* add import warnings

* change test flash attn version

* remove example

* change requirements of flash_attn

* modify tests

* [skip ci] change requirements-test

Xu-Kai committed Sep 19, 2023

08b928b

[gptq] faster gptq cuda kernel (hpcaitech#4494)
```
* [skip ci] add cuda kernels

* add license

* [skip ci] fix max_input_len

* format files & change test size

* [skip ci]
```
Xu-Kai committed Sep 19, 2023

5bd381d

[gptq] add gptq tensor parallel (hpcaitech#4538)
```
* add gptq tensor parallel

* add gptq tp

* delete print

* add test gptq check

* add test auto gptq check
```
Xu-Kai committed Sep 19, 2023

145ff94

[gptq] combine gptq and kv cache manager (hpcaitech#4706)
```
* combine gptq and kv cache manager

* add init bits

* delete useless code

* add model path

* delete useless print and update test

* delete useless import

* move option gptq to shard config
```
Xu-Kai committed Sep 19, 2023

aefe767

change replace linear to shardformer

Xu-Kai committed Sep 19, 2023

27b48b3

update bloom policy

Xu-Kai committed Sep 19, 2023

d896733

delete useless code

Xu-Kai committed Sep 19, 2023

aa8201f

fix import bug and delete useless code

Xu-Kai committed Sep 19, 2023

8c30608

change colossalai/gptq to colossalai/quant/gptq

Xu-Kai committed Sep 19, 2023

c430416

update import linear for tests

Xu-Kai committed Sep 19, 2023

6f2159f

Commits on Sep 20, 2023

delete useless code and mv gptq_kernel to kernel directory

Xu-Kai committed Sep 20, 2023

d4db1bf

Merge branch 'main' into rebase_gptq_to_main

Xu-Kai authored Sep 20, 2023

f085c54

Commits on Sep 21, 2023

fix triton kernel

Xu-Kai committed Sep 21, 2023

ee16a32

add triton import

Xu-Kai committed Sep 21, 2023

9d4d7ff
