Can not infer quantized model, but fp32 works well. #20

Open
znsoftm opened this issue Jul 12, 2023 · 9 comments

Comments

@znsoftm

znsoftm commented Jul 12, 2023

Running main.exe with a q4_0 quantized model fails.

With q4_0, an assertion fails in ggml.c,

in the function static void ggml_compute_forward_soft_max_f32(

at line 9342: assert(sum > 0.0); where sum is -nan(ind).

@znsoftm
Author

znsoftm commented Jul 12, 2023

Could this be an overflow?

@znsoftm
Author

znsoftm commented Jul 12, 2023

But q4_1 works well.

@skeskinen
Owner

NaN results are typically a sign of floating-point accuracy problems. Do you have a very small model? I think quantization gets less accurate the smaller the model is.

@znsoftm
Author

znsoftm commented Jul 12, 2023

I got the quantized model from this model: multi-qa-MiniLM-L6-cos-v1 on Hugging Face.

@znsoftm
Author

znsoftm commented Jul 12, 2023

I modified the code to adapt to BertCode with the latest ggml, and it works fine. Maybe this can be solved by upgrading ggml?

@znsoftm
Author

znsoftm commented Jul 12, 2023

NaN results are typically a sign of floating-point accuracy problems. Do you have a very small model? I think quantization gets less accurate the smaller the model is.
ggml-model-q4_0.zip

@appvoid

appvoid commented Jul 17, 2023

Running main.exe with a q4_0 quantized model fails.

With q4_0, an assertion fails in ggml.c,

in the function static void ggml_compute_forward_soft_max_f32(

at line 9342: assert(sum > 0.0); where sum is -nan(ind).

Can you please tell me how you managed to make it work on Windows?

@znsoftm
Author

znsoftm commented Jul 17, 2023

I submitted a pull request and the repo owner has merged it. Run git pull to get the new version; it works on Windows.

@appvoid

appvoid commented Jul 17, 2023

Thanks, it's working now!
