Support exporting GGUF q4_0 and q4_1 formats #393

Merged
merged 30 commits into from
Jan 8, 2025
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
Show all changes
30 commits
Select commit Hold shift + click to select a range
8355347  export gguf (n1ck-guo, Dec 24, 2024)
dd55003  [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot], Dec 24, 2024)
f67219b  q4_0/1 port c++ to python (n1ck-guo, Dec 24, 2024)
611c4c1  Merge branch 'hengguo/gguf' of https://github.com/intel/auto-round in… (n1ck-guo, Dec 24, 2024)
ce1c48e  [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot], Dec 24, 2024)
7ab730b  change to llama.cpp stype and add uint8 store (n1ck-guo, Dec 25, 2024)
287b5af  abstract (n1ck-guo, Dec 25, 2024)
49d95a8  merge (n1ck-guo, Dec 25, 2024)
113532a  update (n1ck-guo, Dec 26, 2024)
ee66c47  [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot], Dec 26, 2024)
d395c6b  fix (n1ck-guo, Dec 26, 2024)
8b13f1f  Merge branch 'hengguo/gguf' of https://github.com/intel/auto-round in… (n1ck-guo, Dec 26, 2024)
ce2c346  update (n1ck-guo, Dec 30, 2024)
8bceb3f  default sequence eval (n1ck-guo, Dec 30, 2024)
722a1d8  modify by comments (n1ck-guo, Dec 30, 2024)
8712170  update (n1ck-guo, Dec 30, 2024)
1aa979a  pylint (n1ck-guo, Dec 30, 2024)
515160d  clean (n1ck-guo, Dec 30, 2024)
a064c44  pylint (n1ck-guo, Dec 30, 2024)
fa2328d  fix (n1ck-guo, Dec 30, 2024)
7906284  update (n1ck-guo, Dec 31, 2024)
4261191  Merge branch 'main' into hengguo/gguf (n1ck-guo, Dec 31, 2024)
e525f97  add ut (n1ck-guo, Dec 31, 2024)
b0f96a0  add cuda ut (n1ck-guo, Dec 31, 2024)
c7ec3a5  add requirements (n1ck-guo, Dec 31, 2024)
79c5c5a  format (n1ck-guo, Dec 31, 2024)
2720287  code scane (n1ck-guo, Dec 31, 2024)
db15354  update (n1ck-guo, Jan 7, 2025)
24a68a9  merge main (n1ck-guo, Jan 7, 2025)
cb67c1a  update (n1ck-guo, Jan 7, 2025)
Changes from 1 commit:

12 changes: 6 additions & 6 deletions auto_round/__main__.py

@@ -14,15 +14,15 @@
 import sys
 
 def run_eval():
-    if "--sequence" in sys.argv:
-        sys.argv.remove("--sequence")
-        from auto_round.script.llm import setup_eval_parser, eval_sequence
-        args = setup_eval_parser()
-        eval_sequence(args)
-    else:
+    if "--non_sequence" in sys.argv:
+        sys.argv.remove("--non_sequence")
         from auto_round.script.llm import setup_eval_parser, eval
         args = setup_eval_parser()
         eval(args)
+    else:
+        from auto_round.script.llm import setup_eval_parser, eval_sequence
+        args = setup_eval_parser()
+        eval_sequence(args)
 
 def run():
     if "--eval" in sys.argv:
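For readers skimming the diff above (see also commit 8bceb3f, "default sequence eval"): the change makes sequence evaluation the default and replaces the old `--sequence` switch with an opt-out `--non_sequence` flag. A minimal, self-contained sketch of that `sys.argv` dispatch pattern; the demo function name and the print statements are illustrative, not part of the PR:

```python
import sys

def run_eval_demo():
    """Stand-in for run_eval(): sequence eval is now the default,
    and --non_sequence opts back into the plain eval() path."""
    if "--non_sequence" in sys.argv:
        # Strip the dispatch flag so a downstream argparse parser never sees it.
        sys.argv.remove("--non_sequence")
        print("would call eval(args)")
    else:
        print("would call eval_sequence(args)")

if __name__ == "__main__":
    run_eval_demo()
```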
6 changes: 5 additions & 1 deletion auto_round/script/llm.py

@@ -313,8 +313,12 @@ def tune(args):
     from auto_round.utils import logger
 
     if args.format in ["gguf:q4_0", "gguf:q4_1"]:
+        args.bits = 4
+        if args.act_bits <= 8:
+            logger.warning(f"{args.format} not support for activation quantization. Reset act_bits to 16.")
+            args.act_bits = 16
         if args.group_size != 32:
-            logger.warning(f"{args.format} not support for group_size: {args.group_size}."
+            logger.warning(f"{args.format} not support for group_size: {args.group_size}. "
                            "Reset group_size to 32.")
             args.group_size = 32
         if args.format.endswith("_0"):
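As background for the tune() guards above: GGUF q4_0 and q4_1 are llama.cpp block formats that always quantize weights in fixed blocks of 32 values at 4 bits, with no activation quantization, which is why the script forces bits to 4, group_size to 32, and act_bits to 16. Below is a minimal NumPy sketch of the two block schemes, mirroring the llama.cpp reference quantizers with rounding details simplified; the function names and the NumPy port are illustrative, not the PR's actual implementation:

```python
import numpy as np

QK = 32  # fixed block size for q4_0 / q4_1, hence group_size is forced to 32

def quantize_q4_0_block(x: np.ndarray):
    """One q4_0 block: a single fp16 scale plus 32 4-bit codes with an implicit offset of 8."""
    assert x.shape == (QK,)
    # The signed value with the largest magnitude defines the scale, as in llama.cpp.
    d = float(x[np.argmax(np.abs(x))]) / -8.0
    inv_d = 0.0 if d == 0.0 else 1.0 / d
    q = np.clip(np.round(x * inv_d) + 8, 0, 15).astype(np.uint8)
    return np.float16(d), q  # callers pack two 4-bit codes per uint8

def quantize_q4_1_block(x: np.ndarray):
    """One q4_1 block: fp16 scale and fp16 minimum; codes are round((x - min) / d)."""
    assert x.shape == (QK,)
    lo, hi = float(x.min()), float(x.max())
    d = (hi - lo) / 15.0
    inv_d = 0.0 if d == 0.0 else 1.0 / d
    q = np.clip(np.round((x - lo) * inv_d), 0, 15).astype(np.uint8)
    return np.float16(d), np.float16(lo), q

# Example: quantize one block of random weights.
if __name__ == "__main__":
    block = np.random.randn(QK).astype(np.float32)
    print(quantize_q4_0_block(block))
    print(quantize_q4_1_block(block))
```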