
Update Benchmarks and Documentation for GraniteCausalLM #86

Merged
8 commits merged on Oct 1, 2024

Conversation

fabianlim (Contributor) commented Sep 26, 2024

In this PR we update the benchmarks for GraniteCausalLM.

  • In addition, the README.md is updated to describe how a new model can be added in the future.
  • NOTE: we did not update the GPTQ results in the benchmarks; this will possibly be done at a later time.

Note that this PR requires the following dependency updates (a version-check sketch follows the list):

  • transformers>=4.45: required for GraniteCausalLM.
  • accelerate>=0.34.1: required by transformers>=4.45 if GraniteCausalLM is needed.
  • trl>0.11.1: when using baseline BNB, this pulls in the fix for a bug introduced in transformers==4.45 (Fix Inconsistency with IsShardedQLoRA Setting, huggingface/trl#2089).
  • bitsandbytes==0.43.3: it seems that later versions give segmentation-fault errors.
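A minimal sketch of how these pins could be sanity-checked at runtime; the bounds come from the list above (not from any setup.py), and `packaging` is an assumed extra dependency:

```python
# Hypothetical runtime check for the dependency pins listed above.
# The version bounds are taken from this PR description.
from importlib.metadata import version
from packaging.version import Version

assert Version(version("transformers")) >= Version("4.45"), "needed for GraniteCausalLM"
assert Version(version("accelerate")) >= Version("0.34.1"), "required by transformers>=4.45"
assert Version(version("trl")) > Version("0.11.1"), "pulls in the IsShardedQLoRA fix (trl#2089)"
assert Version(version("bitsandbytes")) == Version("0.43.3"), "later versions segfault"
```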

Known issues with quantized PEFT:

  • single GPU w/o FOAK
  • single GPU w/ FOAK -> fused LoRA dequant problem (an issue with the compiled binaries in bitsandbytes 0.43.3, which may be incompatible with the CUDA toolkit or torch version)
  • multi GPU w/o FOAK -> rank 1 stuck at prepare_model (resolved by disabling low_cpu_mem_mode; see the sketch after this list)
  • multi GPU w/ FOAK -> meta device problem (see item 2 in Distributed Training Problems for QLoRA models with Transformers pre-release 4.45 #83; also resolved by disabling low_cpu_mem_mode)
  • bad loss with BNB+FOAK -> resolved by updating the LoRA fused ops to support bias
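The low_cpu_mem_mode workaround mentioned above, as a minimal sketch. This assumes it corresponds to the `low_cpu_mem_usage` flag of `from_pretrained` in transformers; the checkpoint name is purely illustrative:

```python
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "ibm/granite-model",  # hypothetical checkpoint name
    torch_dtype=torch.bfloat16,
    # Assumption: disabling "low_cpu_mem_mode" maps to turning off
    # low_cpu_mem_usage, avoiding the rank-1 hang at prepare_model
    # and the meta-device problem described above.
    low_cpu_mem_usage=False,
)
```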

Performance

Overall, we see impressive improvements with the kernels.

FULL FT
[benchmark results image]

PEFT
[benchmark results image]

Quantized PEFT (BNB)
[benchmark results image]

wynterl (Contributor) commented Sep 27, 2024

awesome, great results @fabianlim

raghukiran1224 commented

Indeed, awesome results @fabianlim!

fabianlim (Contributor, Author) commented Sep 27, 2024

@wynterl @raghukiran1224 the loss for BNB + fused ops looks problematic and needs more debugging. Update: I found that it is because Granite has a bias in its Linear layers, but the FOAK kernels do not support bias. This just requires some minor (but tedious) modifications (see the sketch below).
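A plain-PyTorch sketch of the fix described above (not the actual FOAK Triton kernel): the fused LoRA forward must carry the base layer's bias through, otherwise models like Granite, whose Linear layers have a bias, train with a wrong loss.

```python
import torch

def lora_forward_with_bias(x, W, bias, A, B, scaling):
    # Base projection: x @ W^T (+ bias). Omitting the bias here is exactly
    # what produced the bad loss with BNB + FOAK on Granite.
    base = x @ W.t()
    if bias is not None:
        base = base + bias
    # LoRA update: scaling * (x @ A^T) @ B^T, with A: (r, in), B: (out, r).
    return base + (x @ A.t()) @ B.t() * scaling
```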
