Update README.md
mgoin authored Dec 21, 2023
1 parent 0eaf565 commit b6b091e
Showing 1 changed file with 1 addition and 7 deletions.
8 changes: 1 addition & 7 deletions src/sparseml/transformers/sparsification/obcq/README.md
@@ -3,13 +3,7 @@ This page describes how to perform one-shot quantization of large language model

### Note on system requirements

-Due to inefficiencies in PyTorch ONNX export, a lot of system memory is required to export the models for inference. Below are some current estimations for various model sizes using `torch==2.1`. There are [improvements coming in 2.2](https://github.com/pytorch/pytorch/commit/b4a49124c8165a374a3ef49e14807ac05b3fc030).
-
-| Model Size | Required GPU VRAM | Required System RAM |
-| ----------- | ----------- | ----------- |
-| 1B Params | 16GB | 64GB |
-| 7B Params | 32GB | ~300GB |
-| 13B Params | 64GB | ~600GB |
+Due to inefficiencies in PyTorch ONNX export, a lot of system memory is required to export the models for inference. There are [improvements coming in 2.2](https://github.com/pytorch/pytorch/commit/b4a49124c8165a374a3ef49e14807ac05b3fc030).
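
As an illustrative aside (not part of this commit or the README), the removed table's figures can be read as a rough pre-flight check: compare available host RAM against the per-model-size estimates before starting an export. The sketch below assumes `psutil` is installed; the thresholds are the table's approximations for `torch==2.1`, not hard requirements.

```python
# Illustrative only: check available system RAM against the rough estimates
# from the removed table before attempting an ONNX export of a given model size.
import psutil

# Approximate host-RAM estimates (GB) quoted in the table for torch==2.1.
ESTIMATED_RAM_GB = {"1B": 64, "7B": 300, "13B": 600}

def enough_ram_for(model_size: str) -> bool:
    """Return True if available system memory meets the rough estimate."""
    available_gb = psutil.virtual_memory().available / 1024**3
    needed_gb = ESTIMATED_RAM_GB[model_size]
    print(f"available: {available_gb:.0f} GB, estimated need for {model_size}: ~{needed_gb} GB")
    return available_gb >= needed_gb

if __name__ == "__main__":
    enough_ram_for("7B")
```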

## Table of Contents
1. [How to Clone and Install the Latest SparseML](#clone)
