lichangh20/Cuda_LSS_HQ

Hardware Implemented LSS+HQ

Code for the hardware implementation of the LSS and HQ operators.

INSTALL

Tested with PyTorch 1.12.1 + CUDA 11.3, on a Tesla A100 GPU.
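Before building, it can help to confirm that the local environment matches the tested combination. The snippet below is a minimal sketch, not part of this repository: `matches_tested` is a hypothetical helper, and it assumes the usual PyTorch convention that `torch.__version__` may carry a build suffix such as `+cu113`.

```python
# Hypothetical helper (not part of this repository): compare the local
# PyTorch / CUDA versions with the combination this README was tested on.
TESTED_TORCH = "1.12.1"
TESTED_CUDA = "11.3"


def matches_tested(torch_version, cuda_version):
    """Return True if the versions equal the tested PyTorch + CUDA pair."""
    # torch.__version__ may include a local build suffix, e.g. "1.12.1+cu113".
    base = torch_version.split("+")[0]
    return base == TESTED_TORCH and cuda_version == TESTED_CUDA


# Usage with an installed PyTorch (left as comments so the sketch runs without it):
#   import torch
#   print(matches_tested(torch.__version__, torch.version.cuda))
```

Other versions may still work; a mismatch only means the setup differs from the one that was tested.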

Note: This CUDA program is based on NVIDIA CUTLASS version 2.10, so you need to fetch the matching version of that library. In addition, in quantize_forward_HQ/setup_easy.py and quantize_grad_weight_LSS/setup.py, you must change the paths in include_dirs to the absolute paths on your own machine; the build will not work otherwise.
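One way to avoid hard-coding machine-specific paths is to derive them from the location of the CUTLASS checkout. The following is a sketch under stated assumptions rather than the contents of the repository's setup scripts: `cutlass_include_dirs` is a hypothetical helper, and it assumes the headers of interest are the `include` and `tools/util/include` directories of the CUTLASS source tree. Check the existing include_dirs entries in each setup script for the directories actually required.

```python
import os


def cutlass_include_dirs(cutlass_root):
    """Hypothetical helper: absolute header paths under a CUTLASS checkout."""
    # Expand "~" and resolve relative paths so the result is always absolute.
    root = os.path.abspath(os.path.expanduser(cutlass_root))
    return [
        os.path.join(root, "include"),
        # Assumption: the utility headers are needed as well.
        os.path.join(root, "tools", "util", "include"),
    ]


# Possible use inside a setup script (shown as comments; names other than
# include_dirs are assumptions):
#   include_dirs = cutlass_include_dirs("/path/to/cutlass")
```

The returned list can then be assigned to include_dirs in place of the paths that ship with the setup scripts.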

CUTLASS

git clone git@github.com:NVIDIA/cutlass.git
cd cutlass
# check out the 2.10 branch
git checkout feature/2.10/updates_before_tagging

LSS

cd quantize_grad_weight_LSS
python setup.py install

HQ

cd quantize_forward_HQ
python setup_easy.py install
