
awq int4 to gguf: ModuleNotFoundError: No module named 'awq.apply_awq' #502

Open
LDLINGLINGLING opened this issue Jun 14, 2024 · 5 comments

Comments

@LDLINGLINGLING
Contributor

I want to use AWQ to quantize a model and then use llama.cpp to convert it to GGUF, but when I followed the tutorial I got this error:

Traceback (most recent call last):
  File "/root/ld/ld_project/llama.cpp/convert_minicpm.py", line 2516, in <module>
    main()
  File "/root/ld/ld_project/llama.cpp/convert_minicpm.py", line 2460, in main
    from awq.apply_awq import add_scale_weights  # type: ignore[import-not-found]
ModuleNotFoundError: No module named 'awq.apply_awq'

My AWQ package versions are:
autoawq 0.2.5+cu121
autoawq_kernels 0.0.6
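
For context, autoawq installs a top-level awq package, but it has no apply_awq submodule; that module came from the awq-py helpers that used to ship inside llama.cpp and were later removed (see the reply below pointing to ggml-org/llama.cpp#5768). A minimal diagnostic sketch, assuming it is run in the same environment that produced the traceback:

import importlib.util

# The top-level "awq" package resolves to autoawq's install location.
spec = importlib.util.find_spec("awq")
print(spec.origin if spec else "awq is not installed")

# autoawq provides no awq.apply_awq submodule, so this prints None; the module
# the convert script imports lived in llama.cpp's (since removed) awq-py folder.
print(importlib.util.find_spec("awq.apply_awq"))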

@LDLINGLINGLING
Contributor Author

LDLINGLINGLING commented Jun 18, 2024

import os
import subprocess
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_path = 'mistralai/Mistral-7B-v0.1'
quant_path = 'mistral-awq'
llama_cpp_path = '/workspace/llama.cpp'
quant_config = { "zero_point": True, "q_group_size": 128, "w_bit": 6, "version": "GEMM" }

# Load the fp16 model and tokenizer
model = AutoAWQForCausalLM.from_pretrained(
    model_path, **{"low_cpu_mem_usage": True, "use_cache": False}
)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

# Run the AWQ scale search; export_compatible=True applies the scales to the
# fp16 weights without packing them into low-bit tensors
model.quantize(
    tokenizer,
    quant_config=quant_config,
    export_compatible=True
)

model.save_quantized(quant_path)
tokenizer.save_pretrained(quant_path)
print(f'Model is quantized and saved at "{quant_path}"')

# GGUF conversion
print('Converting model to GGUF...')
llama_cpp_method = "q4_K_M"
convert_cmd_path = os.path.join(llama_cpp_path, "convert.py")
quantize_cmd_path = os.path.join(llama_cpp_path, "quantize")

# Clone and build llama.cpp if it is not already present
if not os.path.exists(llama_cpp_path):
    cmd = f"git clone https://github.com/ggerganov/llama.cpp.git {llama_cpp_path} && cd {llama_cpp_path} && make LLAMA_CUBLAS=1 LLAMA_CUDA_F16=1"
    subprocess.run([cmd], shell=True, check=True)

# Convert the saved checkpoint to GGUF, then quantize it with llama.cpp
subprocess.run([
    f"python {convert_cmd_path} {quant_path} --outfile {quant_path}/model.gguf"
], shell=True, check=True)

subprocess.run([
    f"{quantize_cmd_path} {quant_path}/model.gguf {quant_path}/model_{llama_cpp_method}.gguf {llama_cpp_method}"
], shell=True, check=True)

This is my code.

@casper-hansen
Owner

Hi @LDLINGLINGLING. The error in your first message seems to come from the llama.cpp package. Have you tried the GGUF export from the AutoAWQ documentation, and did it succeed?

https://casper-hansen.github.io/AutoAWQ/examples/#gguf-export

@LDLINGLINGLING
Contributor Author

I didn't succeed. I followed the instructions in this link, https://casper-hansen.github.io/AutoAWQ/examples/#gguf-export, but the error at the top appeared.

@LDLINGLINGLING
Contributor Author

I now think this operation is pointless. I originally thought that since AWQ has high quantization accuracy, converting to GGUF might preserve that accuracy, but it seems that is not possible.

@hanasay

hanasay commented Jul 10, 2024

I now think this operation is pointless. I originally thought that since AWQ has high quantization accuracy, converting to GGUF might preserve that accuracy, but it seems that is not possible.

Hi @LDLINGLINGLING ~
It is true that --awq-path was removed by llama.cpp! You can refer to this issue:
ggml-org/llama.cpp#5768
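
For anyone landing here later: with --awq-path gone, the part of the earlier script that still matters is export_compatible=True. As far as I understand it, that option applies the AWQ scales to the fp16 weights before saving, so the checkpoint in quant_path is an ordinary Hugging Face model and llama.cpp's stock converter handles it without any AWQ-specific flag. A minimal sketch of that route, reusing the names from the script above and assuming a recent llama.cpp checkout where the converter is convert_hf_to_gguf.py and the quantize binary is built as llama-quantize (older trees ship convert.py and quantize instead):

import os
import subprocess

llama_cpp_path = '/workspace/llama.cpp'   # same hypothetical paths as above
quant_path = 'mistral-awq'                # directory written by save_quantized(...)
llama_cpp_method = 'q4_K_M'

# The AWQ scales are already folded into the fp16 weights, so the stock
# converter is run on the directory as-is; no --awq-path flag is involved.
convert_script = os.path.join(llama_cpp_path, 'convert_hf_to_gguf.py')
subprocess.run(
    f"python {convert_script} {quant_path} --outfile {quant_path}/model-f16.gguf",
    shell=True, check=True
)

# The actual low-bit quantization happens in llama.cpp afterwards.
quantize_bin = os.path.join(llama_cpp_path, 'llama-quantize')
subprocess.run(
    f"{quantize_bin} {quant_path}/model-f16.gguf {quant_path}/model_{llama_cpp_method}.gguf {llama_cpp_method}",
    shell=True, check=True
)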

By the way, I have run into an error that might be similar to this issue, and I hope someone can help me.

I had already converted a Phi-3-mini-128K model to AWQ.
But when I tried to convert the Phi-3 AWQ model to GGUF (with llama.cpp's convert_hf_to_gguf.py), I got the error below.

INFO:hf-to-gguf:Loading model: Phi-3-mini-128k-instruct-AWQ
INFO:gguf.gguf_writer:gguf: This GGUF file is for Little Endian only
INFO:hf-to-gguf:Set model parameters
INFO:hf-to-gguf:Set model tokenizer
INFO:gguf.vocab:Setting special token type bos to 1
INFO:gguf.vocab:Setting special token type eos to 32000
INFO:gguf.vocab:Setting special token type unk to 0
INFO:gguf.vocab:Setting special token type pad to 32000
INFO:gguf.vocab:Setting add_bos_token to False
INFO:gguf.vocab:Setting add_eos_token to False
INFO:gguf.vocab:Setting chat_template to {% for message in messages %}{% if message['role'] == 'system' %}{{'<|system|>
' + message['content'] + '<|end|>
'}}{% elif message['role'] == 'user' %}{{'<|user|>
' + message['content'] + '<|end|>
'}}{% elif message['role'] == 'assistant' %}{{'<|assistant|>
' + message['content'] + '<|end|>
'}}{% endif %}{% endfor %}{% if add_generation_prompt %}{{ '<|assistant|>
' }}{% else %}{{ eos_token }}{% endif %}
INFO:hf-to-gguf:Exporting model...
INFO:hf-to-gguf:gguf: loading model part 'model.safetensors'
INFO:hf-to-gguf:output.weight,             torch.float16 --> F16, shape = {3072, 32064}
INFO:hf-to-gguf:token_embd.weight,         torch.float16 --> F16, shape = {3072, 32064}
INFO:hf-to-gguf:blk.0.attn_norm.weight,    torch.float16 --> F32, shape = {3072}
Traceback (most recent call last):
  File "/home/matt/work/llama.cpp/convert_hf_to_gguf.py", line 3547, in <module>
    main()
  File "/home/matt/work/llama.cpp/convert_hf_to_gguf.py", line 3541, in main
    model_instance.write()
  File "/home/matt/work/llama.cpp/convert_hf_to_gguf.py", line 330, in write
    self.write_tensors()
  File "/home/matt/work/llama.cpp/convert_hf_to_gguf.py", line 267, in write_tensors
    for new_name, data in ((n, d.squeeze().numpy()) for n, d in self.modify_tensors(data_torch, name, bid)):
  File "/home/matt/work/llama.cpp/convert_hf_to_gguf.py", line 234, in modify_tensors
    return [(self.map_tensor_name(name), data_torch)]
  File "/home/matt/work/llama.cpp/convert_hf_to_gguf.py", line 185, in map_tensor_name
    raise ValueError(f"Can not map tensor {name!r}")
ValueError: Can not map tensor 'model.layers.0.mlp.down_proj.qweight'

The error says that the tensor cannot be mapped to a defined layer.
I was thinking: is it possible the error occurs because the layer is stored under a new name, model.layers.0.mlp.down_proj.qweight, rather than under the original name model.layers.0.mlp.down_proj.weight?

If so, how should I modify it?
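
One way to check that theory is to list the tensor names actually stored in the checkpoint. A minimal sketch, assuming the AWQ model folder holds a single model.safetensors file (the path below is just an example):

from safetensors import safe_open

# Hypothetical path: the single-shard checkpoint shown in the log above.
path = "Phi-3-mini-128k-instruct-AWQ/model.safetensors"

with safe_open(path, framework="pt") as f:
    for name in f.keys():
        if "layers.0.mlp.down_proj" in name:
            print(name)  # e.g. ...down_proj.qweight / .qzeros / .scales

If the listed names end in .qweight, .qzeros and .scales, the checkpoint is a fully packed AWQ model, which convert_hf_to_gguf.py has no tensor mapping for; a checkpoint saved with export_compatible=True keeps plain .weight tensors, which is the form the converter expects.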

Sorry for my bad English, but I hope someone can help. ;-;

BR, Matt.
