PEFT-SAM (LoRA, DoRA, QLoRA, and More)

Fine-tune SAM (Segment Anything Model) with Hugging Face's Parameter-Efficient Fine-Tuning (PEFT) library and Trainer, using techniques such as LoRA, DoRA, QLoRA, and more.

Overview

This project fine-tunes SAM on datasets in COCO format using approaches such as LoRA, QLoRA, and DoRA. It builds on Hugging Face's Transformers library, PEFT, and bitsandbytes for efficient training. It works with any SAM model that is compatible with the Transformers library (from transformers import SamProcessor, SamModel).

Key Findings from my Experiments:

I performed experiments on the TrashCan 1.0 dataset to demonstrate the effectiveness of PEFT techniques for fine-tuning SAM:

| Model | mIoU (%) | mF1 (%) | Inference (GPU, ms) |
| --- | --- | --- | --- |
| SAM-H (0-shot) | 69 | 80 | 143.64 |
| SlimSAM (0-shot) | 67 | 79 | 28.70 |
| SAM-B (Vanilla) | 75.56 | 85.28 | 30.95 |
| SAM-B (DoRA) | ~82.86 (±0.66) | ~90.26 (±0.42) | 30.95 |
| SlimSAM (LoRA) | ~81.46 (±0.76) | ~89.35 (±0.52) | 28.70 |
| SlimSAM (DoRA) | ~81.82 (±0.87) | ~89.59 (±0.58) | 28.70 |

Note: In CPU-only scenarios, SlimSAM is up to ~9.8× faster than SAM-B.

Experiments were run on an RTX 3090.
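For readers who want to reproduce numbers like the mIoU and mF1 columns above, here is a minimal sketch of how such scores are commonly computed for binary masks. It assumes that mIoU and mF1 are plain means of per-mask IoU and F1 (Dice) scores; the README does not state the exact evaluation protocol, so treat the function names and the empty-mask convention as illustrative assumptions.

```python
def iou_and_f1(pred, target):
    """Return (IoU, F1) for two binary masks given as nested lists of 0/1."""
    tp = fp = fn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                tp += 1      # pixel is foreground in both masks
            elif p:
                fp += 1      # predicted foreground, actually background
            elif t:
                fn += 1      # missed foreground pixel
    union = tp + fp + fn
    if union == 0:
        # Both masks are empty. Scoring this as a perfect match is an assumption.
        return 1.0, 1.0
    iou = tp / union
    f1 = 2 * tp / (2 * tp + fp + fn)
    return iou, f1


def mean_scores(pairs):
    """Average IoU and F1 over an iterable of (pred, target) mask pairs."""
    scores = [iou_and_f1(pred, target) for pred, target in pairs]
    n = len(scores)
    mean_iou = sum(s[0] for s in scores) / n
    mean_f1 = sum(s[1] for s in scores) / n
    return mean_iou, mean_f1


# Toy example: one half-correct prediction and one perfect prediction.
pairs = [
    ([[1, 1], [0, 0]], [[1, 0], [0, 0]]),
    ([[1, 0], [0, 1]], [[1, 0], [0, 1]]),
]
miou, mf1 = mean_scores(pairs)
print(f"mIoU={miou:.2%}  mF1={mf1:.2%}")
```

Because F1 weights true positives twice, it is never lower than IoU on the same mask, which matches the pattern in the table.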

Performance Boost with PEFT: DoRA and LoRA outperformed full (vanilla) fine-tuning by roughly 6 to 7 mIoU points (about 81.5 to 82.9% versus 75.56%) while training far fewer parameters.
Quantization Benefits Depend on Model Size: I also explored quantized variants such as QLoRA and QDoRA, but for smaller SAM models like SlimSAM the memory savings during training were not substantial. Quantization might offer more significant advantages for larger base models like SAM-H. In general, I recommend starting with LoRA or DoRA, since they train faster than the quantized variants, and only switching to a quantized variant if you experience out-of-memory (OOM) issues.
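The two findings above can be made concrete with some back-of-the-envelope arithmetic. The sketch below counts the parameters a LoRA adapter adds to a single linear layer and estimates weight storage at different bit widths. The layer shape, rank, and model sizes are illustrative assumptions, not measurements from these experiments, and the memory estimate ignores activations, optimizer state, and quantization constants.

```python
def lora_param_count(d_in, d_out, rank):
    """Trainable parameters LoRA adds to one linear layer.

    LoRA learns A (rank x d_in) and B (d_out x rank) while the
    original d_out x d_in weight stays frozen.
    """
    return rank * (d_in + d_out)


def weight_memory_mib(n_params, bits_per_param):
    """Approximate storage for n_params weights, in MiB."""
    return n_params * bits_per_param / 8 / 2**20


# Assumed example: a fused qkv projection in a 768-wide transformer block.
d_in, d_out, rank = 768, 3 * 768, 16
full = d_in * d_out
lora = lora_param_count(d_in, d_out, rank)
print(f"full layer: {full} params, LoRA adapter: {lora} params ({lora / full:.2%})")

# Hypothetical model sizes, chosen only to show how the saving scales.
for name, n_params in [("small model", 10_000_000), ("large model", 600_000_000)]:
    fp16 = weight_memory_mib(n_params, 16)
    int4 = weight_memory_mib(n_params, 4)
    print(f"{name}: 16-bit {fp16:.0f} MiB, 4-bit {int4:.0f} MiB, saved {fp16 - int4:.0f} MiB")
```

Under these assumptions the relative saving from 4-bit weights is the same for both models, but the absolute saving is only a few MiB for the small one. That is consistent with the observation that quantization pays off mainly for larger base models.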

Key Features

PEFT (Parameter-Efficient Fine-Tuning): Uses the PEFT library to fine-tune with fewer trainable parameters. You can choose between LoRA, DoRA, adapters, and other methods.
Bitsandbytes: Uses bitsandbytes for quantization, enabling efficient 4-bit training (for example, QLoRA).
COCO Dataset Format: Works with datasets in COCO format. The dataset code is adapted from the Lightning-SAM repo, with small changes to work with the Transformers library. I've also added a "no prompt" option for auto-mask training.
SDPA (Scaled Dot-Product Attention): Uses SDPA to speed up training and inference. My pull request adding SDPA for SAM has been merged into the main Transformers library.

Installation

Clone the repo:

git clone https://github.com/MagnusS0/qlora-sam.git
cd qlora-sam

Install dependencies using Poetry:
```
poetry install
```

Training

To train the model, adjust the paths to your dataset and model in train.sh, then run it:

chmod +x train.sh

./train.sh
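Since training expects a dataset in COCO format, it can help to check what the annotation file should contain before pointing train.sh at it. The sketch below shows a minimal COCO-style structure with made-up values and converts COCO boxes, stored as [x, y, width, height], into the [x_min, y_min, x_max, y_max] corner format that SAM box prompts use in Transformers. Whether the repo's dataset class builds prompts exactly this way is an assumption, and the helper names are hypothetical.

```python
def coco_bbox_to_xyxy(bbox):
    """Convert a COCO [x, y, width, height] box to [x_min, y_min, x_max, y_max]."""
    x, y, w, h = bbox
    return [x, y, x + w, y + h]


# Minimal COCO-style annotation content. All values are made up.
coco = {
    "images": [
        {"id": 1, "file_name": "example.jpg", "width": 640, "height": 480},
    ],
    "annotations": [
        {
            "id": 10,
            "image_id": 1,
            "category_id": 3,
            "bbox": [100, 120, 50, 80],
            "segmentation": [[100, 120, 150, 120, 150, 200, 100, 200]],
            "area": 4000,
            "iscrowd": 0,
        },
    ],
    "categories": [{"id": 3, "name": "example_class"}],
}


def box_prompts_by_image(coco_dict):
    """Group corner-format box prompts by image id."""
    prompts = {}
    for ann in coco_dict["annotations"]:
        box = coco_bbox_to_xyxy(ann["bbox"])
        prompts.setdefault(ann["image_id"], []).append(box)
    return prompts


print(box_prompts_by_image(coco))  # {1: [[100, 120, 150, 200]]}
```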

Testing

To test the model, adjust the paths to your dataset and model in test.sh, then run it:

chmod +x test.sh

./test.sh

