The repository is based on SparseGPT code
torch
: tested on v2.2.1transformers
: tested on v4.35.2datasets
: tested on v2.16.1
The simplest way to try this out is to run the following command:
python inference_demo.py --execution_mode 1 --compressed_model_path elvircrn/llama2-7b-double-sparse-sparsity0.7-wikitext2-final --pretrained_model_path <Llama-2-7b-hf_path>