
refactor: Organize vendor-specific headers into vendors directory #8746

Merged · 1 commit merged into ggerganov:master · Jul 29, 2024

Conversation

yeahdongcn
Contributor

As @slaren suggested in #8383, it is beneficial to organize vendor-specific headers separately. This PR creates a new vendors directory and adds cuda.h, hip.h, and musa.h for the three supported vendors.

Testing done

  • make GGML_MUSA=1 -> passed

@github-actions github-actions bot added the Nvidia GPU Issues specific to Nvidia GPUs label Jul 29, 2024
@slaren slaren merged commit 439b3fc into ggerganov:master Jul 29, 2024
53 checks passed
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Aug 2, 2024
m828 commented Aug 27, 2024

Hello, I compiled with `make GGML_MUSA=1` and got an error:

I ccache found, compilation results will be cached. Disable with GGML_NO_CCACHE.
I llama.cpp build info:
I UNAME_S: Linux
I UNAME_P: x86_64
I UNAME_M: x86_64
I CFLAGS: -Iggml/include -Iggml/src -Iinclude -Isrc -Icommon -D_XOPEN_SOURCE=600 -D_GNU_SOURCE -DNDEBUG -DGGML_USE_MUSA -DGGML_USE_OPENMP -I/usr/lib/llvm-10/include/openmp -DGGML_USE_LLAMAFILE -DGGML_USE_CUDA -I/usr/local/musa/include -std=c11 -fPIC -O3 -g -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wshadow -Wstrict-prototypes -Wpointer-arith -Wmissing-prototypes -Werror=implicit-int -Werror=implicit-function-declaration -pthread -march=native -mtune=native -fopenmp -Wunreachable-code-break -Wunreachable-code-return -Wdouble-promotion
I CXXFLAGS: -std=c++11 -fPIC -O3 -g -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wmissing-declarations -Wmissing-noreturn -pthread -fopenmp -march=native -mtune=native -Wunreachable-code-break -Wunreachable-code-return -Wmissing-prototypes -Wextra-semi -Iggml/include -Iggml/src -Iinclude -Isrc -Icommon -D_XOPEN_SOURCE=600 -D_GNU_SOURCE -DNDEBUG -DGGML_USE_MUSA -DGGML_USE_OPENMP -I/usr/lib/llvm-10/include/openmp -DGGML_USE_LLAMAFILE -DGGML_USE_CUDA -I/usr/local/musa/include
I NVCCFLAGS: -std=c++11 -O3 -g -x musa -mtgpu --cuda-gpu-arch=mp_22 -arch=native -DGGML_CUDA_DMMV_X=32 -DGGML_CUDA_MMV_Y=1 -DK_QUANTS_PER_ITERATION=2 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128
I LDFLAGS: -L/usr/lib/llvm-10/lib -lmusa -lmublas -lmusart -lpthread -ldl -lrt -L/usr/local/musa/lib -L/usr/lib64
I CC: clang version 14.0.0 ([email protected]:mthreads/mtcc.git 228d4651d8fcb8511ca196a5740eef83326ce1cb)
I CXX: clang version 14.0.0 ([email protected]:mthreads/mtcc.git 228d4651d8fcb8511ca196a5740eef83326ce1cb)
I NVCC: InstalledDir: /usr/local/musa/bin

/usr/bin/ccache mcc -std=c++11 -O3 -g -x musa -mtgpu --cuda-gpu-arch=mp_22 -arch=native -DGGML_CUDA_DMMV_X=32 -DGGML_CUDA_MMV_Y=1 -DK_QUANTS_PER_ITERATION=2 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -Iggml/include -Iggml/src -Iinclude -Isrc -Icommon -D_XOPEN_SOURCE=600 -D_GNU_SOURCE -DNDEBUG -DGGML_USE_MUSA -DGGML_USE_OPENMP -I/usr/lib/llvm-10/include/openmp -DGGML_USE_LLAMAFILE -DGGML_USE_CUDA -I/usr/local/musa/include -c ggml/src/ggml-cuda/mmvq.cu -o ggml/src/ggml-cuda/mmvq.o
clang-14: warning: argument unused during compilation: '-arch=native' [-Wunused-command-line-argument]
In file included from ggml/src/ggml-cuda/mmvq.cu:1:
In file included from ggml/src/ggml-cuda/mmvq.cuh:1:
ggml/src/ggml-cuda/common.cuh:164:1: warning: function declared 'noreturn' should not return [-Winvalid-noreturn]
}
^
ggml/src/ggml-cuda/common.cuh:268:1: warning: non-void function does not return a value [-Wreturn-type]
}
^
In file included from :1:
In file included from /usr/local/musa-2.0.0/lib/clang/14.0.0/include/__clang_musa_runtime_wrapper.h:169:
/usr/local/musa-2.0.0/lib/clang/14.0.0/include/__clang_musa_device_functions.h:2293:11: error: couldn't allocate output register for constraint 'r'
asm("vsub4.s32.s32.s32.sat %0,%1,%2,%3;"
^
/usr/local/musa-2.0.0/lib/clang/14.0.0/include/__clang_musa_device_functions.h:2293:11: error: couldn't allocate output register for constraint 'r'
/usr/local/musa-2.0.0/lib/clang/14.0.0/include/__clang_musa_device_functions.h:2293:11: error: couldn't allocate output register for constraint 'r'
/usr/local/musa-2.0.0/lib/clang/14.0.0/include/__clang_musa_device_functions.h:2293:11: error: couldn't allocate output register for constraint 'r'
error: couldn't allocate output register for constraint 'r'
error: couldn't allocate output register for constraint 'r'
error: couldn't allocate output register for constraint 'r'
error: couldn't allocate output register for constraint 'r'
error: couldn't allocate output register for constraint 'r'
error: couldn't allocate output register for constraint 'r'
error: couldn't allocate output register for constraint 'r'
error: couldn't allocate output register for constraint 'r'
error: couldn't allocate output register for constraint 'r'
error: couldn't allocate output register for constraint 'r'
error: couldn't allocate output register for constraint 'r'
error: couldn't allocate output register for constraint 'r'
error: couldn't allocate output register for constraint 'r'
error: couldn't allocate output register for constraint 'r'
error: couldn't allocate output register for constraint 'r'
fatal error: too many errors emitted, stopping now [-ferror-limit=]
2 warnings and 20 errors generated when compiling for mp_22.
make: *** [Makefile:736: ggml/src/ggml-cuda/mmvq.o] Error 1

What is the reason?

yeahdongcn (Contributor, Author) commented
> What is the reason?

Which version of MUSA Toolkits are you using?
Feel free to contact me through WeChat: yeahdongcn

m828 commented Aug 27, 2024

> > What is the reason?
>
> Which version of MUSA Toolkits are you using? Feel free to contact me through WeChat: yeahdongcn

OK, I've added you on WeChat.
