Fail to start AWQ-quantized model with lightllm on Qwen2-7B-Instruct #56
You need to use symmetric mode for AWQ if you want to do inference with lightllm. Additionally, remove the act (activation quantization) part of the config.
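As a sketch, the relevant part of the llmc YAML config might look like the following (key names are assumptions and may differ between llmc versions):

```yaml
# Hypothetical llmc quant section; exact keys may vary by version.
quant:
    method: Awq
    weight:
        bit: 4
        symmetric: True   # symmetric mode, as required by the lightllm kernel
    # no `act:` section here: activation quantization is removed entirely
```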
I changed to the following AWQ config, but still get the same error.
You must also use per-group quantization with a group size of 128 in llmc to fit the backend kernel.
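Putting both requirements together, a full weight-quantization sketch could look like this (again, field names are assumptions based on typical llmc configs):

```yaml
# Hypothetical llmc quant section; exact keys may vary by version.
quant:
    method: Awq
    weight:
        bit: 4
        symmetric: True
        granularity: per_group   # not per_channel
        group_size: 128          # must match the backend kernel
```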
After changing the config, I still get the same error.
Hi, the granularity should not be per_channel. You should change it to per_group.
I still get the same error after changing to per_group.
Did you still get the error with the lightllm quantization mode after fixing the llmc config? If you do not use the quantization kernel, the weight clipping in AWQ makes this behavior expected.
I tried starting lightllm both with and without "--mode triton_w4a16"; both give the same error.
We will try to reproduce the error later; please wait for some time. In the meantime, you can also try other algorithms.
I get the same error with QuaRot.
AWQ config:
Start with lightllm:
Test:
Error in lightllm:
PS: I get a similar error if I start lightllm with the option "--mode triton_w4a16".
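For reference, a typical lightllm launch command with the w4a16 Triton kernel might look like the following sketch (the model directory is a placeholder, and flags may differ across lightllm versions):

```shell
# Hypothetical launch; --model_dir path is a placeholder.
python -m lightllm.server.api_server \
    --model_dir /path/to/qwen2-7b-instruct-awq \
    --mode triton_w4a16
```

Without the --mode flag, lightllm would load the checkpoint without the dedicated quantization kernel, which is why the maintainers asked whether the error appears in both cases.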