
refactor: Organize vendor-specific headers into vendors directory #8746

Merged · 1 commit merged into ggerganov:master · Jul 29, 2024

Conversation

yeahdongcn
Contributor

As @slaren suggested in #8383, it is beneficial to organize vendor-specific headers separately. This PR creates a new vendors directory and adds cuda.h, hip.h, and musa.h for the three supported vendors.

Testing done

  • make GGML_MUSA=1 -> passed

@github-actions github-actions bot added the Nvidia GPU Issues specific to Nvidia GPUs label Jul 29, 2024
@slaren slaren merged commit 439b3fc into ggerganov:master Jul 29, 2024
53 checks passed
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Aug 2, 2024
m828 commented Aug 27, 2024

Hello, I compiled with `make GGML_MUSA=1` and got an error:

I ccache found, compilation results will be cached. Disable with GGML_NO_CCACHE.
I llama.cpp build info:
I UNAME_S: Linux
I UNAME_P: x86_64
I UNAME_M: x86_64
I CFLAGS: -Iggml/include -Iggml/src -Iinclude -Isrc -Icommon -D_XOPEN_SOURCE=600 -D_GNU_SOURCE -DNDEBUG -DGGML_USE_MUSA -DGGML_USE_OPENMP -I/usr/lib/llvm-10/include/openmp -DGGML_USE_LLAMAFILE -DGGML_USE_CUDA -I/usr/local/musa/include -std=c11 -fPIC -O3 -g -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wshadow -Wstrict-prototypes -Wpointer-arith -Wmissing-prototypes -Werror=implicit-int -Werror=implicit-function-declaration -pthread -march=native -mtune=native -fopenmp -Wunreachable-code-break -Wunreachable-code-return -Wdouble-promotion
I CXXFLAGS: -std=c++11 -fPIC -O3 -g -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wmissing-declarations -Wmissing-noreturn -pthread -fopenmp -march=native -mtune=native -Wunreachable-code-break -Wunreachable-code-return -Wmissing-prototypes -Wextra-semi -Iggml/include -Iggml/src -Iinclude -Isrc -Icommon -D_XOPEN_SOURCE=600 -D_GNU_SOURCE -DNDEBUG -DGGML_USE_MUSA -DGGML_USE_OPENMP -I/usr/lib/llvm-10/include/openmp -DGGML_USE_LLAMAFILE -DGGML_USE_CUDA -I/usr/local/musa/include
I NVCCFLAGS: -std=c++11 -O3 -g -x musa -mtgpu --cuda-gpu-arch=mp_22 -arch=native -DGGML_CUDA_DMMV_X=32 -DGGML_CUDA_MMV_Y=1 -DK_QUANTS_PER_ITERATION=2 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128
I LDFLAGS: -L/usr/lib/llvm-10/lib -lmusa -lmublas -lmusart -lpthread -ldl -lrt -L/usr/local/musa/lib -L/usr/lib64
I CC: clang version 14.0.0 ([email protected]:mthreads/mtcc.git 228d4651d8fcb8511ca196a5740eef83326ce1cb)
I CXX: clang version 14.0.0 ([email protected]:mthreads/mtcc.git 228d4651d8fcb8511ca196a5740eef83326ce1cb)
I NVCC: InstalledDir: /usr/local/musa/bin

/usr/bin/ccache mcc -std=c++11 -O3 -g -x musa -mtgpu --cuda-gpu-arch=mp_22 -arch=native -DGGML_CUDA_DMMV_X=32 -DGGML_CUDA_MMV_Y=1 -DK_QUANTS_PER_ITERATION=2 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -Iggml/include -Iggml/src -Iinclude -Isrc -Icommon -D_XOPEN_SOURCE=600 -D_GNU_SOURCE -DNDEBUG -DGGML_USE_MUSA -DGGML_USE_OPENMP -I/usr/lib/llvm-10/include/openmp -DGGML_USE_LLAMAFILE -DGGML_USE_CUDA -I/usr/local/musa/include -c ggml/src/ggml-cuda/mmvq.cu -o ggml/src/ggml-cuda/mmvq.o
clang-14: warning: argument unused during compilation: '-arch=native' [-Wunused-command-line-argument]
In file included from ggml/src/ggml-cuda/mmvq.cu:1:
In file included from ggml/src/ggml-cuda/mmvq.cuh:1:
ggml/src/ggml-cuda/common.cuh:164:1: warning: function declared 'noreturn' should not return [-Winvalid-noreturn]
}
^
ggml/src/ggml-cuda/common.cuh:268:1: warning: non-void function does not return a value [-Wreturn-type]
}
^
In file included from :1:
In file included from /usr/local/musa-2.0.0/lib/clang/14.0.0/include/__clang_musa_runtime_wrapper.h:169:
/usr/local/musa-2.0.0/lib/clang/14.0.0/include/__clang_musa_device_functions.h:2293:11: error: couldn't allocate output register for constraint 'r'
asm("vsub4.s32.s32.s32.sat %0,%1,%2,%3;"
^
/usr/local/musa-2.0.0/lib/clang/14.0.0/include/__clang_musa_device_functions.h:2293:11: error: couldn't allocate output register for constraint 'r'
/usr/local/musa-2.0.0/lib/clang/14.0.0/include/__clang_musa_device_functions.h:2293:11: error: couldn't allocate output register for constraint 'r'
/usr/local/musa-2.0.0/lib/clang/14.0.0/include/__clang_musa_device_functions.h:2293:11: error: couldn't allocate output register for constraint 'r'
error: couldn't allocate output register for constraint 'r'
error: couldn't allocate output register for constraint 'r'
error: couldn't allocate output register for constraint 'r'
error: couldn't allocate output register for constraint 'r'
error: couldn't allocate output register for constraint 'r'
error: couldn't allocate output register for constraint 'r'
error: couldn't allocate output register for constraint 'r'
error: couldn't allocate output register for constraint 'r'
error: couldn't allocate output register for constraint 'r'
error: couldn't allocate output register for constraint 'r'
error: couldn't allocate output register for constraint 'r'
error: couldn't allocate output register for constraint 'r'
error: couldn't allocate output register for constraint 'r'
error: couldn't allocate output register for constraint 'r'
error: couldn't allocate output register for constraint 'r'
fatal error: too many errors emitted, stopping now [-ferror-limit=]
2 warnings and 20 errors generated when compiling for mp_22.
make: *** [Makefile:736: ggml/src/ggml-cuda/mmvq.o] Error 1

What is the reason?

yeahdongcn (Contributor, Author) commented
> What is the reason?

Which version of MUSA Toolkits are you using?
Feel free to contact me through WeChat: yeahdongcn

m828 commented Aug 27, 2024

> > What is the reason?
>
> Which version of MUSA Toolkits are you using? Feel free to contact me through WeChat: yeahdongcn

OK, I've added you on WeChat.
