Introduce structs for the q4 data blocks #356

Merged: 2 commits merged into ggerganov:master from q4-struct on Mar 28, 2023

Conversation

@sw (Collaborator) commented Mar 21, 2023

This came up with the AVX512 support in #320.

The code handling the q4 data blocks involves a lot of pointer arithmetic and sizeof expressions. I understand this may have been done out of concern for how well the compiler can optimize it, but I think the code could be made more readable by introducing structs for the q4_0 and q4_1 data blocks, without changing the overall structure too much.
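For illustration, a rough sketch of what such block structs might look like (the block size QK and the field names here are assumptions, not necessarily the exact layout this PR ends up with):

```c
#include <stdint.h>

#define QK 32  // assumed number of weights per quantization block

// q4_0: one scale factor per block, 4-bit quants packed two per byte
typedef struct {
    float   d;           // scale
    uint8_t qs[QK / 2];  // packed 4-bit quantized values
} block_q4_0;

// q4_1: scale plus minimum offset per block
typedef struct {
    float   d;           // scale
    float   m;           // minimum
    uint8_t qs[QK / 2];  // packed 4-bit quantized values
} block_q4_1;
```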

I have only tested my changes on Linux, with the AVX2 and scalar implementations, and it's easy to mix up an x and a y... so beware ;-)

I have also updated utils.cpp to use the quantize_row_q4_0/1 functions, but this isn't really clean at the moment. It will also cause a difference in the generated q4_0 model file for people using one of the SIMD-optimized variants. The model should work fine, but it may be confusing for people who want to check the file's hash, so I'm open to reverting that.
q4_1 is not affected as that has no processor-specific optimizations in the quantization function.

I'm open to suggestions, also on naming and code style, which I tried to keep locally consistent. Should the block structs be made public in ggml.h?

@gjmulder added the enhancement (New feature or request) label on Mar 21, 2023
@ggerganov (Owner) commented:

With the latest changes in #370, the quantize methods from quantize.cpp have been moved into ggml.c.
I think it would be a good first step to avoid the code duplication between:

  • quantize_row_q4_0() and ggml_quantize_q4_0()
  • quantize_row_q4_1() and ggml_quantize_q4_1()

I.e. make the latter call the former.
This will reduce duplicated code and also drastically improve the performance of quantize.cpp.
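For illustration, a rough sketch of the "latter calls the former" idea; the signatures are simplified and assumed here, and may not match ggml.c exactly:

```c
#include <stddef.h>
#include <stdint.h>

#define QK 32  // assumed block size

typedef struct {
    float   d;
    uint8_t qs[QK / 2];
} block_q4_0;

// Assumed to exist already: the per-row routine (scalar or SIMD-optimized).
void quantize_row_q4_0(const float * x, block_q4_0 * y, int k);

// Instead of duplicating the quantization loop, ggml_quantize_q4_0() just
// iterates over the rows and delegates to quantize_row_q4_0().
// n = total number of values, k = values per row; both assumed multiples of QK.
size_t ggml_quantize_q4_0(const float * src, void * dst, int n, int k) {
    block_q4_0 * blocks = (block_q4_0 *) dst;
    for (int i = 0; i < n; i += k) {
        quantize_row_q4_0(src + i, blocks, k);
        blocks += k / QK;  // advance by the number of blocks in one row
    }
    return (size_t)(n / QK) * sizeof(block_q4_0);  // bytes written
}
```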

After this change has been merged, we can think about wrapping things into structs for better readability.

@sw (Collaborator, Author) commented Mar 22, 2023

Rebased and verified identical model files and prediction output

@ggerganov (Owner) commented:

Ok, thanks. I know @blackhole89 is working on quantization improvements on a branch - would this change be too big of a conflict for you? We can postpone it for later if necessary.

I think we should also test it on ARM to make sure nothing is broken. I will try to find time in the coming days.

@sw force-pushed the q4-struct branch 2 times, most recently from 9221414 to 63ffb1a on March 23, 2023, 18:29
@sw marked this pull request as ready for review on March 23, 2023, 18:32
@sw (Collaborator, Author) commented Mar 23, 2023

I consider it ready now but I'm open to revising it or waiting for other changes. I agree that it should be tested on ARM and AVX512.

@blackhole89 (Collaborator) commented Mar 23, 2023

@ggerganov Thanks for paging me in! As far as I can see, there would be no conflict with anything I'm doing. I think this is a good change in terms of code readability - just need to make sure we don't run into weird platform-specific problems due to struct alignment or something.
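One way to catch such layout surprises at compile time would be a size check on the block structs; a minimal sketch, assuming the q4_0 layout sketched earlier and C11's static_assert:

```c
#include <assert.h>   // static_assert (C11)
#include <stdint.h>

#define QK 32  // assumed block size

typedef struct {
    float   d;
    uint8_t qs[QK / 2];
} block_q4_0;

// Fail the build if the compiler inserts padding that would change the
// in-memory (and on-disk) layout on some platform.
static_assert(sizeof(block_q4_0) == sizeof(float) + QK / 2,
              "unexpected q4_0 block size/padding");
```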

@ggerganov (Owner) commented:

@sw I changed the variable names and fixed the ARM_NEON and WASM builds.
We haven't tested the WASM and POWER9 paths, but hopefully they still work.

Give it one more try after my changes to make sure that AVX is fine. I'll merge this now.

@ggerganov merged commit c1f8850 into ggerganov:master on Mar 28, 2023
@sw deleted the q4-struct branch on March 28, 2023, 19:13
@sw added a commit to sw/llama.cpp that referenced this pull request on Apr 8, 2023