
Device Abstraction IFU #27

Merged: 56 commits, May 10, 2024

Commits (56)
67e7ee3
first draft
stevhliu Mar 26, 2024
e3376ab
toctree
stevhliu Mar 26, 2024
c17fb8e
Fix 4bit quantization with blocksize=4096
matthewdouglas Mar 29, 2024
a471456
fix formatting for install_cuda.py
matthewdouglas Mar 29, 2024
494de20
Bump the minor-patch group with 1 update (#1162)
dependabot[bot] Apr 2, 2024
bed0860
Tests: improve memory usage (#1147)
matthewdouglas Apr 2, 2024
2965c76
CHANGELOG.md: mention accuracy changes when quantizing post v0.42
Titus-von-Koeller Apr 2, 2024
76885a4
Merge pull request #1160 from matthewdouglas/quant4bit-blocksize4096
Titus-von-Koeller Apr 2, 2024
bfe2118
README: include download badges
Titus-von-Koeller Apr 4, 2024
0c64a0d
Merge pull request #1148 from stevhliu/fsdp-qlora
Titus-von-Koeller Apr 5, 2024
b2a85a4
Update matplotlib requirement from ~=3.8.3 to ~=3.8.4 in the major group
dependabot[bot] Apr 8, 2024
c0ad874
Build workflow: Add CUDA 12.4 to build matrix
matthewdouglas Apr 8, 2024
ebac862
Exclude Windows from CUDA 12.4.0 build for now
matthewdouglas Apr 8, 2024
af9a073
Merge pull request #1171 from matthewdouglas/build-cu124
Titus-von-Koeller Apr 9, 2024
0c887b7
Merge pull request #1169 from TimDettmers/dependabot/pip/major-45b123…
Titus-von-Koeller Apr 9, 2024
6be3d0f
[docs] Install from source (#1149)
stevhliu Apr 9, 2024
c54053d
Bump scipy from 1.12.0 to 1.13.0 in the minor-patch group (#1170)
dependabot[bot] Apr 9, 2024
7449d71
[`Core`] Change 8-bit serialization weight format format (#1164)
younesbelkada Apr 10, 2024
d62516f
(backends) Stub out additional backends; move more functions to backe…
matthewdouglas Apr 11, 2024
13ad630
Add int8 ops for Intel CPU & XPU
Xia-Weiwen Apr 11, 2024
4743ff0
CHANGELOG: to reverse chron order + mdformat
Titus-von-Koeller Apr 11, 2024
0c33c0d
ignore CHANGELOG reordering + formatting commit
Titus-von-Koeller Apr 11, 2024
f92c536
CHANGELOG: add v0.43.1
Titus-von-Koeller Apr 11, 2024
4a6fb35
bump version to 0.43.1
Titus-von-Koeller Apr 11, 2024
7b0c4cd
small fix in changelog
Titus-von-Koeller Apr 11, 2024
127788a
bump version to next dev
Titus-von-Koeller Apr 11, 2024
77be40b
Remove XPU code; remove cpu example; add UT
Xia-Weiwen Apr 15, 2024
8d0b695
Fix igemmlt correctness issue
Xia-Weiwen Apr 15, 2024
6cecb65
Update pandas requirement from ~=2.2.1 to ~=2.2.2 in the major group …
dependabot[bot] Apr 17, 2024
ffd7d0d
(docs) integrations: fix omission in bf16 related warning (#1183)
Titus-von-Koeller Apr 17, 2024
67d8661
Bug fix for double_quant
Xia-Weiwen Apr 18, 2024
92900f6
Remove torch.compile for double_quant
Xia-Weiwen Apr 18, 2024
717245d
refine pytest.skip message
Xia-Weiwen Apr 19, 2024
93e04b5
Fix lint issues
Xia-Weiwen Apr 25, 2024
e1b60d3
Fix backward
Xia-Weiwen Apr 26, 2024
5b9ef77
Bump the minor-patch group with 2 updates (#1192)
dependabot[bot] Apr 30, 2024
7f13c8f
merge changes from main
Titus-von-Koeller May 3, 2024
95c29a6
Fix lint issue
Xia-Weiwen May 6, 2024
749e06f
Merge pull request #1173 from matthewdouglas/backend-stubs
Titus-von-Koeller May 6, 2024
b0dec0a
Update bitsandbytes/backends/cpu_xpu_common.py
Xia-Weiwen May 7, 2024
97e41b8
Merge remote-tracking branch 'upstream/multi-backend-refactor' into m…
Xia-Weiwen May 7, 2024
295bb97
Fix lint issue
Xia-Weiwen May 7, 2024
37b0582
Fix lint issue
Xia-Weiwen May 7, 2024
8561f09
Merge pull request #1178 from Xia-Weiwen/multi-backend-refactor-cpu-x…
Titus-von-Koeller May 7, 2024
2af8568
Merge remote-tracking branch 'upstream/multi-backend-refactor' into d…
pnunna93 May 9, 2024
06f6b25
skip linear no igemmlt test
pnunna93 May 9, 2024
2359452
Remove archive functional file
pnunna93 May 9, 2024
f76d6ab
Sync README with upstream
pnunna93 May 9, 2024
576b62c
Remove bnb_accuracy file
pnunna93 May 9, 2024
dfb531b
Remove cuda_setup
pnunna93 May 9, 2024
31b1cbc
Remove test_delete_later.c
pnunna93 May 9, 2024
ed77476
Sync with upstream
pnunna93 May 9, 2024
943c57a
Sync files with upstream
pnunna93 May 9, 2024
71d1702
Fix lint errors
pnunna93 May 10, 2024
6886bc8
Exclude hip files from typo checks
pnunna93 May 8, 2024
0d445f4
update ops.hip
pnunna93 May 10, 2024
3 changes: 3 additions & 0 deletions .git-blame-ignore-revs
@@ -12,3 +12,6 @@ ea7c14f8ef64924f2d0ff80df3cdabf2c7299848

# Reformat with ruff-format
5a4263f4dc05fe8f78f4111beab9f68a81deeab1

# CHANGELOG: to reverse chron order + mdformat
4743ff0d43e04e4cc3e5d8b9e7cd016c0defa36d
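For context, the revisions listed in this file only affect `git blame` when git is told to read it, either per invocation with `git blame --ignore-revs-file .git-blame-ignore-revs <path>` or once per clone via `git config blame.ignoreRevsFile .git-blame-ignore-revs`; GitHub's blame view picks the file up automatically.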
4 changes: 3 additions & 1 deletion .github/workflows/python-package.yml
@@ -63,10 +63,12 @@ jobs:
os: [ubuntu-latest, windows-latest]
arch: [x86_64, aarch64]
cuda_version:
["11.7.1", "11.8.0", "12.0.1", "12.1.1", "12.2.2", "12.3.2"]
["11.7.1", "11.8.0", "12.0.1", "12.1.1", "12.2.2", "12.3.2", "12.4.0"]
exclude:
- os: windows-latest # This probably requires arm64 Windows agents
arch: aarch64
- os: windows-latest # The Jimver/cuda-toolkit action used for Windows builds is not updated for 12.4 yet.
cuda_version: "12.4.0"
- os: ubuntu-latest # Temporary. Takes too long, not ready yet.
arch: aarch64
runs-on: ${{ matrix.os }} # One day, we could run them on native agents. Azure supports this now but it's planned only for Q3 2023 for hosted agents
1 change: 1 addition & 0 deletions .pre-commit-config.yaml
@@ -21,3 +21,4 @@ repos:
rev: v1.18.2
hooks:
- id: typos
exclude: ^.*\.hip$
511 changes: 291 additions & 220 deletions CHANGELOG.md

Large diffs are not rendered by default.

39 changes: 3 additions & 36 deletions README.md
@@ -6,42 +6,9 @@ The `bitsandbytes` library is a lightweight Python wrapper around CUDA custom functions

The library includes quantization primitives for 8-bit & 4-bit operations through `bitsandbytes.nn.Linear8bitLt` and `bitsandbytes.nn.Linear4bit`, and 8-bit optimizers through the `bitsandbytes.optim` module.
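For orientation, here is a minimal usage sketch of these primitives (not part of the diff; it assumes a CUDA-capable install, and the layer sizes and hyperparameters are arbitrary):

```python
import torch
import torch.nn as nn
import bitsandbytes as bnb

# Build a 4-bit replacement for a dense layer; the weights are quantized
# when the module is moved to the accelerator device.
dense = nn.Linear(64, 64, bias=False)
quantized = bnb.nn.Linear4bit(64, 64, bias=False, compute_dtype=torch.float16, quant_type="nf4")
quantized.load_state_dict(dense.state_dict())
quantized = quantized.to("cuda")  # quantization happens here

x = torch.randn(1, 64, dtype=torch.float16, device="cuda")
out = quantized(x)

# 8-bit optimizer as a drop-in replacement for torch.optim.Adam
optimizer = bnb.optim.Adam8bit(dense.parameters(), lr=1e-4)
```

The same construct-then-load pattern, followed by moving the module to the device, applies to `Linear8bitLt` for int8 inference.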

**Installation for ROCm:**

To install develop version:
```bash
git clone --recurse https://github.com/ROCm/bitsandbytes
cd bitsandbytes
git checkout rocm_enabled
pip install -r requirements-dev.txt
cmake -DCOMPUTE_BACKEND=hip -S . (Use -DBNB_ROCM_ARCH="gfx90a;gfx942" to target specific gpu arch)
make
pip install .
```

For ROCm specific versions:

Install Dependencies:
```bash
# hipblaslt installation needed only for rocm<6.0
apt install hipblaslt
pip install --upgrade pip
pip install einops lion_pytorch accelerate
pip install git+https://github.com/ROCm/transformers.git
```
Install Bitsandbytes:
```bash
git clone --recurse https://github.com/ROCm/bitsandbytes
cd bitsandbytes
# Checkout branch as needed
# for rocm 5.7 - rocm5.7_internal_testing
# for rocm 6.x - rocm6.2_internal_testing
git checkout <branch>
make hip
python setup.py install
```

**For more details, please head to the official documentation page:**
There are ongoing efforts to support further hardware backends, e.g. Intel CPU + GPU, AMD GPU, and Apple Silicon. Windows support is also quite far along.

**Please head to the official documentation page:**

**[https://huggingface.co/docs/bitsandbytes/main](https://huggingface.co/docs/bitsandbytes/main)**

26 changes: 0 additions & 26 deletions benchmarking/accuracy/bnb_accuracy.py

This file was deleted.

48 changes: 42 additions & 6 deletions bitsandbytes/__init__.py
@@ -3,6 +3,8 @@
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.

import torch

from . import research, utils
from .autograd._functions import (
MatmulLtState,
@@ -12,19 +12,53 @@
matmul_cublas,
mm_cublas,
)
from .backends import register_backend
from .backends.cpu import CPUBackend
from .cextension import lib
from .nn import modules

if lib and lib.compiled_with_cuda:
from .backends import register_backend
from .backends.cuda import CUDABackend
from .optim import adam
# Always register the CPU backend.
register_backend("cpu", CPUBackend())

# Register either CUDA or ROCm backend, if available.
# Only one of these backends can be used at a time, since the torch.device semantics are
# the same for both torch+rocm and torch+cuda (e.g. device name is "cuda")
if torch.cuda.is_available():
# TODO: Consider deferring loading of cextension - should backend class implement that?

if torch.version.cuda:
from .backends.cuda import CUDABackend

register_backend("cuda", CUDABackend())
elif torch.version.hip:
from .backends.rocm import ROCmBackend

register_backend("cuda", ROCmBackend())

# Register MPS backend, if available.
if torch.backends.mps.is_available() and torch.backends.mps.is_built():
from .backends.mps import MPSBackend

register_backend("mps", MPSBackend())

# Register Intel XPU backend, if available.
if hasattr(torch, "xpu") and torch.xpu.is_available():
from .backends.xpu import XPUBackend

register_backend("xpu", XPUBackend())

# TODO: Other potential backends:
# XLA - Google TPU / PJRT runtime
# HPU - Habana / Intel Gaudi
# IPU - Graphcore
# NPU - Ascend
# Note that we may not map 1:1 with a device type, e.g. SYCL, XLA
# In this case, it will be up to each backend to dispatch as needed

register_backend("cuda", CUDABackend())
__pdoc__ = {
"libbitsandbytes": False,
"optim.optimizer.Optimizer8bit": False,
"optim.optimizer.MockArgs": False,
}

__version__ = "0.44.0.dev"
__version__ = "0.43.2.dev"