【bloom】convert_checkpoint.py local variable 'int8_weights' referenced before assignment #741
Labels
Low Precision
Issue about lower bit quantization, including int8, int4, fp8
triaged
Issue has been triaged by maintainers
I followed the README:
# Build model with both INT8 weight-only and INT8 KV cache enabled
python convert_checkpoint.py --model_dir ./bloom/560m/ \
    --dtype float16 \
    --int8_kv_cache \
    --use_weight_only \
    --output_dir ./bloom/560m/trt_ckpt/int8/1-gpu/

trtllm-build --checkpoint_dir ./bloom/560m/trt_ckpt/int8/1-gpu/ \
    --use_gemm_plugin float16 \
    --use_gpt_attention_plugin float16 \
    --output_dir ./bloom/560m/trt_engines/int8/1-gpu/
and my script is:
python convert_checkpoint.py --model_dir ./Bloomz_QA+alpaca_gpt4_zh+lima_V3 \
    --dtype float16 \
    --int8_kv_cache \
    --use_weight_only \
    --output_dir ./Bloomz_QA+alpaca_gpt4_zh+lima_V3/trt_ckpt/int8/1-gpu/

trtllm-build --checkpoint_dir ./Bloomz_QA+alpaca_gpt4_zh+lima_V3/trt_ckpt/int8/1-gpu/ \
    --use_gemm_plugin float16 \
    --use_gpt_attention_plugin float16 \
    --output_dir ./Bloomz_QA+alpaca_gpt4_zh+lima_V3/trt_engines/int8/1-gpu/
and I got:
Traceback (most recent call last):
File "/workspace/TensorRT-LLM/examples/bloom/convert_checkpoint.py", line 899, in
weights = convert_hf_bloom(
File "/workspace/TensorRT-LLM/examples/bloom/convert_checkpoint.py", line 668, in convert_hf_bloom
np.array([1.0 / int8_weights['scale_y_quant_orig']],
UnboundLocalError: local variable 'int8_weights' referenced before assignment
Looking at the code in convert_checkpoint.py: when use_smooth_quant is False, int8_weights is never computed, yet the --int8_kv_cache path still reads int8_weights['scale_y_quant_orig'], which raises the UnboundLocalError.
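A minimal sketch of the control-flow bug (hypothetical function and key names, mirroring the traceback rather than the actual convert_checkpoint.py source): the variable is assigned only on the smooth-quant branch, but the INT8 KV-cache branch reads it unconditionally.

```python
import numpy as np

def convert(use_smooth_quant: bool, int8_kv_cache: bool):
    """Reduced model of the branch structure in convert_hf_bloom (hypothetical)."""
    weights = {}
    if use_smooth_quant:
        # Only this branch defines int8_weights.
        int8_weights = {'scale_y_quant_orig': np.float32(0.5)}
    if int8_kv_cache:
        # With use_smooth_quant=False this line raises
        # UnboundLocalError, matching the reported traceback.
        weights['kv_cache_scaling_factor'] = np.array(
            [1.0 / int8_weights['scale_y_quant_orig']], dtype=np.float32)
    return weights
```

Calling `convert(False, True)` reproduces the crash; a fix would need to compute the KV-cache scaling factor on a path that does not depend on the smooth-quant branch.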