
Release PyPi package + Create GitHub workflow #9

Merged · 39 commits into main from release_package · Sep 1, 2023

Conversation

casper-hansen (Owner)

  • expand setup.py for the PyPI package (see the sketch after this list)
  • automatically build when a new tag is pushed, to help release the PyPI package:

    git tag v0.0.1
    git push origin v0.0.1

  • create an additional example
  • convert prints to logging
  • standardize asm calls in the CUDA kernel
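
For the first item, here is a minimal sketch of the kind of PyPI-oriented metadata setup.py typically grows for a release; the package name, version, description, and source list below are illustrative placeholders, not the exact values added in this PR:

# Illustrative sketch only: name, version, and sources are placeholders.
from setuptools import setup, find_packages
from torch.utils.cpp_extension import BuildExtension, CUDAExtension

setup(
    name="autoawq",                  # placeholder package name
    version="0.0.1",                 # would match the pushed tag above
    description="AWQ quantization kernels and inference engine",
    packages=find_packages(),
    ext_modules=[
        CUDAExtension(
            "awq_inference_engine",
            ["awq_cuda/pybind.cpp", "awq_cuda/quantization/gemm_cuda_gen.cu"],
        )
    ],
    cmdclass={"build_ext": BuildExtension},
)

With a tag-triggered workflow, the pushed tag (e.g. v0.0.1) is typically what determines the version string that ends up on PyPI.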

casper-hansen (Owner, Author) commented Aug 28, 2023

It seems like the build fails silently on multiple kernels.

logs.txt

pos_encoding_kernels.cu

Command:

nvcc  \
-I/usr/share/miniconda/envs/build/lib/python3.8/site-packages/torch/include \
-I/usr/share/miniconda/envs/build/lib/python3.8/site-packages/torch/include/torch/csrc/api/include \
-I/usr/share/miniconda/envs/build/lib/python3.8/site-packages/torch/include/TH \
-I/usr/share/miniconda/envs/build/lib/python3.8/site-packages/torch/include/THC \
-I/usr/share/miniconda/envs/build/include \
-I/usr/share/miniconda/envs/build/include/python3.8 \
-c /home/runner/work/AutoAWQ/AutoAWQ/awq_cuda/position_embedding/pos_encoding_kernels.cu \
-o /home/runner/work/AutoAWQ/AutoAWQ/build/temp.linux-x86_64-cpython-38/awq_cuda/position_embedding/pos_encoding_kernels.o \
-D__CUDA_NO_HALF_OPERATORS__ \
-D__CUDA_NO_HALF_CONVERSIONS__ \
-D__CUDA_NO_BFLOAT16_CONVERSIONS__ \
-D__CUDA_NO_HALF2_OPERATORS__ \
--expt-relaxed-constexpr \
--compiler-options '-fPIC' \
-O3 \
-std=c++17 \
--threads 2 \
-gencode arch=compute_80,code=sm_80 \
-gencode arch=compute_89,code=sm_89 \
-gencode arch=compute_90,code=sm_90 \
-gencode arch=compute_86,code=sm_86 \
-DTORCH_API_INCLUDE_EXTENSION_H \
-DPYBIND11_COMPILER_TYPE=\"_gcc\" \
-DPYBIND11_STDLIB=\"_libstdcpp\" \
-DPYBIND11_BUILD_ABI=\"_cxxabi1011\" \
-DTORCH_EXTENSION_NAME=awq_inference_engine \
-D_GLIBCXX_USE_CXX11_ABI=0

gemm_cuda_gen.cu

Command:

nvcc  \
-I/usr/share/miniconda/envs/build/lib/python3.9/site-packages/torch/include \
-I/usr/share/miniconda/envs/build/lib/python3.9/site-packages/torch/include/torch/csrc/api/include \
-I/usr/share/miniconda/envs/build/lib/python3.9/site-packages/torch/include/TH \
-I/usr/share/miniconda/envs/build/lib/python3.9/site-packages/torch/include/THC \
-I/usr/share/miniconda/envs/build/include \
-I/usr/share/miniconda/envs/build/include/python3.9 \
-c /home/runner/work/AutoAWQ/AutoAWQ/awq_cuda/quantization/gemm_cuda_gen.cu \
-o /home/runner/work/AutoAWQ/AutoAWQ/build/temp.linux-x86_64-cpython-39/awq_cuda/quantization/gemm_cuda_gen.o \
-D__CUDA_NO_HALF_OPERATORS__ \
-D__CUDA_NO_HALF_CONVERSIONS__ \
-D__CUDA_NO_BFLOAT16_CONVERSIONS__ \
-D__CUDA_NO_HALF2_OPERATORS__ \
--expt-relaxed-constexpr \
--compiler-options '-fPIC' \
-O3 \
-std=c++17 \
--threads 2 \
-gencode arch=compute_80,code=sm_80 \
-gencode arch=compute_89,code=sm_89 \
-gencode arch=compute_90,code=sm_90 \
-gencode arch=compute_86,code=sm_86 \
-DTORCH_API_INCLUDE_EXTENSION_H \
-DPYBIND11_COMPILER_TYPE=\"_gcc\" \
-DPYBIND11_STDLIB=\"_libstdcpp\" \
-DPYBIND11_BUILD_ABI=\"_cxxabi1011\" \
-DTORCH_EXTENSION_NAME=awq_inference_engine \
-D_GLIBCXX_USE_CXX11_ABI=0
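
To make a silent failure visible, one option (not taken from this PR's logs) is to replay a single nvcc invocation outside the setuptools build and print its exit code and stderr directly. A minimal Python sketch, with the argument list abbreviated rather than copied in full:

# Replay one failing nvcc invocation in isolation so its exit code and stderr
# are not swallowed by the parallel extension build.
import subprocess

cmd = [
    "nvcc",
    "-c", "awq_cuda/position_embedding/pos_encoding_kernels.cu",
    "-o", "/tmp/pos_encoding_kernels.o",
    # ... paste the remaining -I/-D/-gencode flags from the command above ...
]

result = subprocess.run(cmd, capture_output=True, text=True)
print("exit code:", result.returncode)
print(result.stderr)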

RunPod NVCC

This seems to work without problems.

nvcc \
-I/usr/local/lib/python3.10/dist-packages/torch/include \
-I/usr/local/lib/python3.10/dist-packages/torch/include/torch/csrc/api/include \
-I/usr/local/lib/python3.10/dist-packages/torch/include/TH \
-I/usr/local/lib/python3.10/dist-packages/torch/include/THC \
-I/usr/local/cuda/include \
-I/usr/include/python3.10 \
-c awq_cuda/position_embedding/pos_encoding_kernels.cu \
-o build/temp.linux-x86_64-cpython-310/awq_cuda/position_embedding/pos_encoding_kernels.o \
-D__CUDA_NO_HALF_OPERATORS__ \
-D__CUDA_NO_HALF_CONVERSIONS__ \
-D__CUDA_NO_BFLOAT16_CONVERSIONS__ \
-D__CUDA_NO_HALF2_OPERATORS__ \
--expt-relaxed-constexpr \
--compiler-options '-fPIC' \
-O3 \
-std=c++17 \
--threads 2 \
-gencode arch=compute_80,code=sm_80 \
-gencode arch=compute_89,code=sm_89 \
-gencode arch=compute_90,code=sm_90 \
-gencode arch=compute_86,code=sm_86 \
-DTORCH_API_INCLUDE_EXTENSION_H \
-DPYBIND11_COMPILER_TYPE=\"_gcc\" \
-DPYBIND11_STDLIB=\"_libstdcpp\" \
-DPYBIND11_BUILD_ABI=\"_cxxabi1011\" \
-DTORCH_EXTENSION_NAME=awq_inference_engine \
-D_GLIBCXX_USE_CXX11_ABI=0 

casper-hansen (Owner, Author) commented Aug 29, 2023

Releasing the package is getting close; however, only Linux will be supported at this time because the Windows build fails. I tried the following code in setup.py, but it did not fix the build.

import os
from distutils.sysconfig import get_python_lib

from torch.utils import cpp_extension

if os.name == "nt":
    # On Windows, try to work around the link error
    # "LNK2001: unresolved external symbol cublasHgemm".
    cuda_path = os.environ.get("CUDA_PATH", None)
    if cuda_path is None:
        raise ValueError("The environment variable CUDA_PATH must be set to the path to the CUDA install when installing from source on Windows systems.")

    pytorch_dir = get_python_lib() + '/torch'
    extra_link_args = ["-L", f"{cuda_path}/lib/x64/cudart.lib",
                       "-L", f"{cuda_path}/lib/x64/cublas.lib",
                       "-L", f'{pytorch_dir}/lib/libtorch.lib']

    print('Windows: Detected CUDA path, adding extra link args')
    print(extra_link_args)
else:
    extra_link_args = []

extensions = [
    cpp_extension.CUDAExtension(
        "awq_inference_engine",
        [
            "awq_cuda/pybind.cpp",
            "awq_cuda/quantization/gemm_cuda_gen.cu",
            "awq_cuda/layernorm/layernorm.cu",
            "awq_cuda/position_embedding/pos_encoding_kernels.cu"
        ],
        extra_compile_args={
            "cxx": ["-g", "-O3", "-fopenmp", "-lgomp", "-std=c++17"],
            "nvcc": ["-O3", "-std=c++17"]
        },
        extra_link_args=extra_link_args
    )
]
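
For reference, a possible alternative to the raw -L pairs above (untested here, and not necessarily what ultimately fixed the Windows build in this PR) is to let setuptools pass the CUDA library directory and library names to the linker, since -L expects a directory rather than a .lib file and the linker resolves cublasHgemm from cublas.lib:

# Hypothetical sketch: name the CUDA libraries instead of passing .lib paths
# as -L arguments. Paths assume the standard CUDA toolkit layout on Windows.
import os
from torch.utils.cpp_extension import CUDAExtension

cuda_path = os.environ["CUDA_PATH"]

extension = CUDAExtension(
    "awq_inference_engine",
    [
        "awq_cuda/pybind.cpp",
        "awq_cuda/quantization/gemm_cuda_gen.cu",
        "awq_cuda/layernorm/layernorm.cu",
        "awq_cuda/position_embedding/pos_encoding_kernels.cu"
    ],
    library_dirs=[os.path.join(cuda_path, "lib", "x64")],
    libraries=["cublas", "cudart"],
)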

@casper-hansen casper-hansen marked this pull request as ready for review August 29, 2023 22:02
casper-hansen (Owner, Author) commented
Now that Linux and Windows support has been added, this is ready to be merged.

casper-hansen merged commit f0eba43 into main on Sep 1, 2023
casper-hansen deleted the release_package branch on September 2, 2023 22:26