[feature] add gptq for inference #4754
Merged
Commits on Sep 19, 2023
- [gptq] add gptq kernel (hpcaitech#4416), commit 08b928b
  * add gptq
  * refactor code
  * fix tests
  * replace auto-gptq
  * rname inferance/quant
  * refactor test
  * add auto-gptq as an option
  * reset requirements
  * change assert and check auto-gptq
  * add import warnings
  * change test flash attn version
  * remove example
  * change requirements of flash_attn
  * modify tests
  * [skip ci] change requirements-test
- [gptq] faster gptq cuda kernel (hpcaitech#4494), commit 5bd381d
  * [skip ci] add cuda kernels
  * add license
  * [skip ci] fix max_input_len
  * format files & change test size
  * [skip ci]
- [gptq] add gptq tensor parallel (hpcaitech#4538), commit 145ff94
  * add gptq tensor parallel
  * add gptq tp
  * delete print
  * add test gptq check
  * add test auto gptq check
- [gptq] combine gptq and kv cache manager (hpcaitech#4706), commit aefe767
  * combine gptq and kv cache manager
  * add init bits
  * delete useless code
  * add model path
  * delete usless print and update test
  * delete usless import
  * move option gptq to shard config
- commit 27b48b3
- commit d896733
- commit aa8201f
- commit 8c30608
- commit c430416
- commit 6f2159f
Commits on Sep 20, 2023
- commit d4db1bf
- commit f085c54
Commits on Sep 21, 2023
- commit ee16a32
- commit 9d4d7ff