GemmLib

This repo contains a blockwise 4-bit quantized GEMM kernel optimized for NVIDIA Ampere GPUs. The code for this GEMM kernel, together with its quantization kernel, is located under the src directory.

We also provide a PyTorch extension so that the kernel can be used from PyTorch.
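
For readers new to the term, the idea behind blockwise 4-bit quantization can be sketched in a few lines of plain Python. This is an illustrative reference only, not the kernel's actual implementation: the function names, the block size of 32, the symmetric scale-only scheme, and the low-nibble-first packing are all assumptions made for the example, and the real layout used under src may differ.

```python
def quantize_blockwise_4bit(values, block_size=32):
    """Quantize a flat list of floats to 4-bit codes, one scale per block.

    Hypothetical helper: symmetric quantization, block_size is an assumption.
    """
    codes, scales = [], []
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        max_abs = max(abs(v) for v in block)
        # Map [-max_abs, max_abs] onto the integers [-7, 7].
        scale = max_abs / 7.0 if max_abs > 0 else 1.0
        scales.append(scale)
        for v in block:
            q = max(-8, min(7, int(round(v / scale))))
            codes.append(q + 8)  # shift to an unsigned nibble in 0..15
    return codes, scales


def dequantize_blockwise_4bit(codes, scales, block_size=32):
    """Recover approximate floats from 4-bit codes and per-block scales."""
    return [(c - 8) * scales[i // block_size] for i, c in enumerate(codes)]


def pack_nibbles(codes):
    """Pack two 4-bit codes per byte, low nibble first (assumed layout)."""
    if len(codes) % 2:
        codes = codes + [8]  # pad with the code that represents zero
    return bytes(codes[i] | (codes[i + 1] << 4) for i in range(0, len(codes), 2))


# Small demonstration: 64 weights split into two blocks of 32.
weights = [0.03 * i - 1.0 for i in range(64)]
codes, scales = quantize_blockwise_4bit(weights, block_size=32)
restored = dequantize_blockwise_4bit(codes, scales, block_size=32)
packed = pack_nibbles(codes)
```

Because each block carries its own scale, a block with small values keeps more resolution than it would under a single tensor-wide scale. A quantized GEMM kernel applies the same dequantization to the packed weights as part of the matrix multiply rather than as a separate pass.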

Usage

To build the library that contains the GEMM kernel and quantization code, clone this repo and run the following from the root directory of the local repo:

mkdir build
cd build
cmake -DCMAKE_BUILD_TYPE=Debug ..
make

Run ms_blkq4gemm_test to test the correctness of the code.

To build the PyTorch extension, change to the root directory of the repo and run:

python python/setup.py install --user

The file python/blkq4linear_test.py contains usage examples.
