Skip to content

chenfucn/gemmlib

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

8 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

GemmLib

This repo contains a blockwise 4-bit quantized gemm kernel optimized for Nvidia Ampere GPUs. Code of this gemm kernel, together with its quantization kernel is located under directory src

We also provide a pytorch extension so that the kernel can be used in pytorch.

Usage

To build the library that contains the gemm kernel and quantization code, clone this repo, and under the root directory of the local repo:

mkdir build
cd build
cmake -DCMAKE_BUILD_TYPE=Debug ..
make

Run ms_blkq4gemm_test to test the correctness of the code.

To build the pytorch extension, change to the root directory of the repo:

python python/setup.py install --user

Python file python/blkq4linear_test.py contains usage examples.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published