k-quant causes builds to fail on ARM32 #2075

Closed
laura-a-n-n opened this issue Jul 2, 2023 · 1 comment · Fixed by #2920

Comments

@laura-a-n-n

Expected Behavior

Running make should build the project successfully.

Current Behavior

The build fails with compiler errors in k_quants.c (see Failure Logs), but passing LLAMA_NO_K_QUANTS=1 to make allows the build to succeed.
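
A minimal sketch of why the flag sidesteps the failure, assuming (as the -DGGML_USE_K_QUANTS entry in the CFLAGS of the log suggests) that LLAMA_NO_K_QUANTS=1 simply builds without that define. The helper names k_quants_compiled_in and require_k_quants are hypothetical and not part of llama.cpp:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: reports whether this translation unit was
 * compiled with -DGGML_USE_K_QUANTS. */
static int k_quants_compiled_in(void) {
#ifdef GGML_USE_K_QUANTS
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical guard: call before relying on k-quant code paths. */
static void require_k_quants(void) {
    assert(k_quants_compiled_in() && "built without GGML_USE_K_QUANTS");
}

/* Prints the build configuration in a human-readable form. */
static void report_k_quants(void) {
    printf("k-quants: %s\n", k_quants_compiled_in() ? "enabled" : "disabled");
}
```

Compiling this with and without -DGGML_USE_K_QUANTS and calling report_k_quants() shows which configuration a given binary was built in.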

Environment and Context

  • Commit d7d2e6a0f0c74f7a570dae384dfff371ac744d2a

  • CPU Architecture

$ lscpu
Architecture:                    armv7l
Byte Order:                      Little Endian
CPU(s):                          4
On-line CPU(s) list:             0-3
Thread(s) per core:              1
Core(s) per socket:              4
Socket(s):                       1
Vendor ID:                       ARM
Model:                           4
Model name:                      Cortex-A53
Stepping:                        r0p4
CPU max MHz:                     1200.0000
CPU min MHz:                     600.0000
BogoMIPS:                        38.40
Vulnerability Itlb multihit:     Not affected
Vulnerability L1tf:              Not affected
Vulnerability Mds:               Not affected
Vulnerability Meltdown:          Not affected
Vulnerability Mmio stale data:   Not affected
Vulnerability Retbleed:          Not affected
Vulnerability Spec store bypass: Not affected
Vulnerability Spectre v1:        Mitigation; __user pointer sanitization
Vulnerability Spectre v2:        Not affected
Vulnerability Srbds:             Not affected
Vulnerability Tsx async abort:   Not affected
Flags:                           half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm crc32
  • Operating System: Linux raspberrypi3 6.1.21-v7+ #1642 SMP Mon Apr 3 17:20:52 BST 2023 armv7l GNU/Linux

  • SDK version:

$ python3 --version
Python 3.9.2
$ make --version
GNU Make 4.3
Built for arm-unknown-linux-gnueabihf
$ g++ --version
g++ (Raspbian 10.2.1-6+rpi1) 10.2.1 20210110
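
Since the Cortex-A53 is 64-bit capable but lscpu reports armv7l here, it may help to confirm what the compiler itself targets. The sketch below relies only on predefined macros (__aarch64__, __arm__, __ARM_NEON, __ARM_NEON__); the function names are hypothetical:

```c
#include <assert.h>
#include <stdio.h>

/* Returns a short label for the architecture the compiler targets. */
static const char *target_arch_label(void) {
#if defined(__aarch64__)
    return "aarch64 (64-bit ARM)";
#elif defined(__arm__)
    return "arm (32-bit ARM)";
#else
    return "non-ARM";
#endif
}

/* Returns 1 if the compiler advertises NEON support, else 0. */
static int target_has_neon(void) {
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    return 1;
#else
    return 0;
#endif
}

/* Prints both facts on one line. */
static void print_target_info(void) {
    const char *label = target_arch_label();
    assert(label[0] != '\0');
    printf("target: %s, NEON: %s\n", label, target_has_neon() ? "yes" : "no");
}
```

For a meaningful answer, it should be compiled with the same flags as the failing build (for example -mfpu=neon-fp-armv8 from the CFLAGS in the log).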

Steps to Reproduce

  1. git clone https://github.com/ggerganov/llama.cpp
  2. cd llama.cpp
  3. make (produces error)
  4. make clean && LLAMA_NO_K_QUANTS=1 make (succeeds)

Failure Logs

~/llama.cpp $ make
I llama.cpp build info: 
I UNAME_S:  Linux
I UNAME_P:  unknown
I UNAME_M:  armv7l
I CFLAGS:   -I.              -O3 -std=c11   -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wdouble-promotion -Wshadow -Wstrict-prototypes -Wpointer-arith -pthread -DGGML_USE_K_QUANTS -mfpu=neon-fp-armv8 -mfp16-format=ieee -mno-unaligned-access -funsafe-math-optimizations
I CXXFLAGS: -I. -I./examples -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -DGGML_USE_K_QUANTS
I LDFLAGS:  
I CC:       cc (Raspbian 10.2.1-6+rpi1) 10.2.1 20210110
I CXX:      g++ (Raspbian 10.2.1-6+rpi1) 10.2.1 20210110

(...)

g++ -I. -I./examples -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -DGGML_USE_K_QUANTS -c examples/common.cpp -o common.o
cc -I.              -O3 -std=c11   -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wdouble-promotion -Wshadow -Wstrict-prototypes -Wpointer-arith -pthread -DGGML_USE_K_QUANTS -mfpu=neon-fp-armv8 -mfp16-format=ieee -mno-unaligned-access -funsafe-math-optimizations   -c -o k_quants.o k_quants.c
k_quants.c: In function ‘ggml_vec_dot_q2_K_q8_K’:
k_quants.c:1272:36: warning: implicit declaration of function ‘vld1q_s16_x2’; did you mean ‘vld1q_s16’? [-Wimplicit-function-declaration]
 1272 |         const int16x8x2_t q8sums = vld1q_s16_x2(y[i].bsums);
      |                                    ^~~~~~~~~~~~
      |                                    vld1q_s16
k_quants.c:1272:36: error: invalid initializer
k_quants.c:1273:36: warning: missing braces around initializer [-Wmissing-braces]
 1273 |         const int16x8x2_t mins16 = {vreinterpretq_s16_u16(vmovl_u8(vget_low_u8(mins))), vreinterpretq_s16_u16(vmovl_u8(vget_high_u8(mins)))};
      |                                    ^
      |                                     {                                                                                                      }
k_quants.c:1278:23: warning: implicit declaration of function ‘vaddvq_s32’; did you mean ‘vaddq_s32’? [-Wimplicit-function-declaration]
 1278 |         sum += dmin * vaddvq_s32(vaddq_s32(s0, s1));
      |                       ^~~~~~~~~~
      |                       vaddq_s32
k_quants.c:1309:41: warning: implicit declaration of function ‘vld1q_u8_x2’; did you mean ‘vld1q_u32’? [-Wimplicit-function-declaration]
 1309 |             const uint8x16x2_t q2bits = vld1q_u8_x2(q2); q2 += 32;
      |                                         ^~~~~~~~~~~
      |                                         vld1q_u32
k_quants.c:1309:41: error: invalid initializer
k_quants.c:1311:35: warning: implicit declaration of function ‘vld1q_s8_x2’; did you mean ‘vld1q_s32’? [-Wimplicit-function-declaration]
 1311 |             int8x16x2_t q8bytes = vld1q_s8_x2(q8); q8 += 32;
      |                                   ^~~~~~~~~~~
      |                                   vld1q_s32
k_quants.c:1311:35: error: invalid initializer
k_quants.c:1296:13: warning: implicit declaration of function ‘vaddvq_s16’; did you mean ‘vaddq_s16’? [-Wimplicit-function-declaration]
 1296 |     isum += vaddvq_s16(p1) * aux[is+(index)] + vaddvq_s16(p2) * aux[is+1+(index)];\
      |             ^~~~~~~~~~
k_quants.c:1314:13: note: in expansion of macro ‘MULTIPLY_ACCUM_WITH_SCALE’
 1314 |             MULTIPLY_ACCUM_WITH_SCALE(0);
      |             ^~~~~~~~~~~~~~~~~~~~~~~~~
k_quants.c:1301:19: error: incompatible types when assigning to type ‘int8x16x2_t’ from type ‘int’
 1301 |         q8bytes = vld1q_s8_x2(q8); q8 += 32;\
      |                   ^~~~~~~~~~~
k_quants.c:1316:13: note: in expansion of macro ‘SHIFT_MULTIPLY_ACCUM_WITH_SCALE’
 1316 |             SHIFT_MULTIPLY_ACCUM_WITH_SCALE(2, 2);
      |             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
k_quants.c:1301:19: error: incompatible types when assigning to type ‘int8x16x2_t’ from type ‘int’
 1301 |         q8bytes = vld1q_s8_x2(q8); q8 += 32;\
      |                   ^~~~~~~~~~~
k_quants.c:1318:13: note: in expansion of macro ‘SHIFT_MULTIPLY_ACCUM_WITH_SCALE’
 1318 |             SHIFT_MULTIPLY_ACCUM_WITH_SCALE(4, 4);
      |             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
k_quants.c:1301:19: error: incompatible types when assigning to type ‘int8x16x2_t’ from type ‘int’
 1301 |         q8bytes = vld1q_s8_x2(q8); q8 += 32;\
      |                   ^~~~~~~~~~~
k_quants.c:1320:13: note: in expansion of macro ‘SHIFT_MULTIPLY_ACCUM_WITH_SCALE’
 1320 |             SHIFT_MULTIPLY_ACCUM_WITH_SCALE(6, 6);
      |             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
k_quants.c:1251:22: warning: unused variable ‘vzero’ [-Wunused-variable]
 1251 |     const int32x4_t  vzero = vdupq_n_s32(0);
      |                      ^~~~~
k_quants.c: In function ‘ggml_vec_dot_q3_K_q8_K’:
k_quants.c:1746:31: error: invalid initializer
 1746 |         uint8x16x2_t qhbits = vld1q_u8_x2(qh);
      |                               ^~~~~~~~~~~
k_quants.c:1764:41: error: invalid initializer
 1764 |             const uint8x16x2_t q3bits = vld1q_u8_x2(q3); q3 += 32;
      |                                         ^~~~~~~~~~~
k_quants.c:1765:43: warning: implicit declaration of function ‘vld1q_s8_x4’; did you mean ‘vld1q_s64’? [-Wimplicit-function-declaration]
 1765 |             const int8x16x4_t q8bytes_1 = vld1q_s8_x4(q8); q8 += 64;
      |                                           ^~~~~~~~~~~
      |                                           vld1q_s64
k_quants.c:1765:43: error: invalid initializer
k_quants.c:1766:43: error: invalid initializer
 1766 |             const int8x16x4_t q8bytes_2 = vld1q_s8_x4(q8); q8 += 64;
      |                                           ^~~~~~~~~~~
k_quants.c: In function ‘ggml_vec_dot_q4_K_q8_K’:
k_quants.c:2380:34: warning: implicit declaration of function ‘vpaddq_s16’; did you mean ‘vpaddlq_s16’? [-Wimplicit-function-declaration]
 2380 |         const int16x8_t q8sums = vpaddq_s16(vld1q_s16(y[i].bsums), vld1q_s16(y[i].bsums + 8));
      |                                  ^~~~~~~~~~
      |                                  vpaddlq_s16
k_quants.c:2380:34: error: incompatible types when initializing type ‘int16x8_t’ using type ‘int’
k_quants.c:2405:41: error: invalid initializer
 2405 |             const uint8x16x2_t q4bits = vld1q_u8_x2(q4); q4 += 32;
      |                                         ^~~~~~~~~~~
k_quants.c:2423:23: error: incompatible types when assigning to type ‘int8x16x2_t’ from type ‘int’
 2423 |             q8bytes = vld1q_s8_x2(q8); q8 += 32;
      |                       ^~~~~~~~~~~
k_quants.c:2432:23: error: incompatible types when assigning to type ‘int8x16x2_t’ from type ‘int’
 2432 |             q8bytes = vld1q_s8_x2(q8); q8 += 32;
      |                       ^~~~~~~~~~~
k_quants.c: In function ‘ggml_vec_dot_q5_K_q8_K’:
k_quants.c:2857:34: error: incompatible types when initializing type ‘int16x8_t’ using type ‘int’
 2857 |         const int16x8_t q8sums = vpaddq_s16(vld1q_s16(y[i].bsums), vld1q_s16(y[i].bsums + 8));
      |                                  ^~~~~~~~~~
k_quants.c:2878:31: error: invalid initializer
 2878 |         uint8x16x2_t qhbits = vld1q_u8_x2(qh);
      |                               ^~~~~~~~~~~
k_quants.c:2886:41: error: invalid initializer
 2886 |             const uint8x16x2_t q5bits = vld1q_u8_x2(q5); q5 += 32;
      |                                         ^~~~~~~~~~~
k_quants.c:2887:41: error: invalid initializer
 2887 |             const int8x16x4_t q8bytes = vld1q_s8_x4(q8); q8 += 64;
      |                                         ^~~~~~~~~~~
k_quants.c:2844:21: warning: unused variable ‘mzero’ [-Wunused-variable]
 2844 |     const int32x4_t mzero = vdupq_n_s32(0);
      |                     ^~~~~
k_quants.c: In function ‘ggml_vec_dot_q6_K_q8_K’:
k_quants.c:3370:36: error: invalid initializer
 3370 |         const int16x8x2_t q8sums = vld1q_s16_x2(y[i].bsums);
      |                                    ^~~~~~~~~~~~
k_quants.c:3372:38: warning: missing braces around initializer [-Wmissing-braces]
 3372 |         const int16x8x2_t q6scales = {vmovl_s8(vget_low_s8(scales)), vmovl_s8(vget_high_s8(scales))};
      |                                      ^
      |                                       {                                                            }
k_quants.c:3384:35: error: invalid initializer
 3384 |             uint8x16x2_t qhbits = vld1q_u8_x2(qh); qh += 32;
      |                                   ^~~~~~~~~~~
k_quants.c:3385:35: warning: implicit declaration of function ‘vld1q_u8_x4’; did you mean ‘vld1q_u64’? [-Wimplicit-function-declaration]
 3385 |             uint8x16x4_t q6bits = vld1q_u8_x4(q6); q6 += 64;
      |                                   ^~~~~~~~~~~
      |                                   vld1q_u64
k_quants.c:3385:35: error: invalid initializer
k_quants.c:3386:35: error: invalid initializer
 3386 |             int8x16x4_t q8bytes = vld1q_s8_x4(q8); q8 += 64;
      |                                   ^~~~~~~~~~~
k_quants.c:3429:23: error: incompatible types when assigning to type ‘int8x16x4_t’ from type ‘int’
 3429 |             q8bytes = vld1q_s8_x4(q8); q8 += 64;
      |                       ^~~~~~~~~~~
k_quants.c:3352:22: warning: unused variable ‘vzero’ [-Wunused-variable]
 3352 |     const int32x4_t  vzero = vdupq_n_s32(0);
      |                      ^~~~~
make: *** [<builtin>: k_quants.o] Error 1
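
Every error above traces back to intrinsics that the compiler reports as undeclared in this 32-bit build (vld1q_s16_x2, vld1q_u8_x2, vld1q_s8_x2, vld1q_s8_x4, vld1q_u8_x4, vaddvq_s32, vaddvq_s16, vpaddq_s16). One possible approach, sketched here for two of them, is to build stand-ins from intrinsics that ARMv7 NEON does provide. This is only an illustration and not necessarily how the linked fix works; the compat_ and sum32_s8 names are hypothetical, and a scalar fallback keeps the file compilable on non-ARM hosts:

```c
#include <assert.h>
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define HAVE_NEON_SKETCH 1

/* Stand-in for vld1q_s8_x2: two ordinary 16-byte loads. */
static inline int8x16x2_t compat_vld1q_s8_x2(const int8_t *p) {
    int8x16x2_t r;
    r.val[0] = vld1q_s8(p);
    r.val[1] = vld1q_s8(p + 16);
    return r;
}

/* Stand-in for vaddvq_s32: horizontal add of four 32-bit lanes. */
static inline int32_t compat_vaddvq_s32(int32x4_t v) {
#if defined(__aarch64__)
    return vaddvq_s32(v); /* available natively on AArch64 */
#else
    /* Fold four lanes into two, then add the remaining pair. */
    int32x2_t half = vadd_s32(vget_low_s32(v), vget_high_s32(v));
    return vget_lane_s32(half, 0) + vget_lane_s32(half, 1);
#endif
}
#endif

/* Scalar reference: sum of 32 signed bytes. */
static int32_t sum32_s8_scalar(const int8_t *p) {
    int32_t s = 0;
    for (int i = 0; i < 32; ++i) s += p[i];
    return s;
}

/* Sum of 32 signed bytes, using the stand-ins when NEON is available. */
static int32_t sum32_s8(const int8_t *p) {
#ifdef HAVE_NEON_SKETCH
    const int8x16x2_t b = compat_vld1q_s8_x2(p);
    /* Widen pairwise: s8 -> s16 -> s32, so no lane can overflow. */
    const int32x4_t s0 = vpaddlq_s16(vpaddlq_s8(b.val[0]));
    const int32x4_t s1 = vpaddlq_s16(vpaddlq_s8(b.val[1]));
    return compat_vaddvq_s32(vaddq_s32(s0, s1));
#else
    return sum32_s8_scalar(p);
#endif
}

/* Checks the vector path against the scalar reference. */
static void sum32_s8_self_check(const int8_t *p) {
    assert(sum32_s8(p) == sum32_s8_scalar(p));
}
```

The stand-ins use distinct names rather than redefining vld1q_s8_x2 or vaddvq_s32, so they would not clash with a toolchain whose arm_neon.h does declare them.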
@laura-a-n-n changed the title from "k-quant causes builds to fail on ARM64" to "k-quant causes builds to fail on ARM32" on Jul 2, 2023
@mqy
Contributor

mqy commented Jul 3, 2023

I found a similar issue, FYI: #1210
BTW, did you try cmake?
