
Support Mixtral #259

Closed
pseudotensor opened this issue Dec 14, 2023 · 5 comments



pseudotensor commented Dec 14, 2023

Related?
https://huggingface.co/casperhansen/mixtral-instruct-awq-it1/tree/main
9c3dfa0

Also see:
https://huggingface.co/ybelkada/Mixtral-8x7B-Instruct-v0.1-AWQ
huggingface/transformers#27950

Right now I get:

  File "/home/jon/miniconda3/envs/h2ogpt/lib/python3.10/site-packages/awq/models/auto.py", line 50, in from_quantized
    model_type = check_and_get_model_type(quant_path, trust_remote_code)
  File "/home/jon/miniconda3/envs/h2ogpt/lib/python3.10/site-packages/awq/models/auto.py", line 25, in check_and_get_model_type
    raise TypeError(f"{config.model_type} isn't supported yet.")
TypeError: mixtral isn't supported yet.

Or is this something that has to be done only in transformers? Sorry, I get confused about what lives in transformers vs. here.
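For context, the error in the traceback above comes from a registry lookup: AutoAWQ keeps a map of supported `config.model_type` values and raises for anything else. The following is a minimal, hypothetical sketch of that check; the set contents and function name are illustrative, not AutoAWQ's actual code.

```python
# Illustrative stand-in for the model-type check in awq/models/auto.py.
# The real code reads config.model_type via transformers' AutoConfig and
# looks it up in a model-class registry; "mixtral" was absent at the time.
SUPPORTED_MODEL_TYPES = {"llama", "mistral", "opt", "falcon"}  # illustrative subset

def check_model_type(model_type: str) -> str:
    """Return the model type if supported, else raise like AutoAWQ does."""
    if model_type not in SUPPORTED_MODEL_TYPES:
        raise TypeError(f"{model_type} isn't supported yet.")
    return model_type
```

With `"mixtral"` missing from the registry, the lookup raises `TypeError: mixtral isn't supported yet.`, which matches the traceback.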

@casper-hansen
Owner

Hi @pseudotensor, support is coming! #251 Before the model can be supported, we are figuring out an effective scaling of its layers.

@pseudotensor
Author

Great, looking forward to it. I love the AWQ stuff; it always works better than llama.cpp for me, especially for heavy use in vLLM.

@casper-hansen
Owner

That’s why I keep working on it! Admittedly, this is a very time-consuming task:

  1. Make tweak to code
  2. Quantize model (wait 40 minutes)
  3. Measure perplexity
  4. Repeat
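One iteration of the loop above can be sketched with AutoAWQ's public API. The model paths are placeholders and the `quant_config` mirrors AutoAWQ's documented 4-bit GEMM settings; the heavy call is left commented out since quantizing Mixtral takes roughly 40 minutes and a large GPU.

```python
# Typical AutoAWQ 4-bit GEMM settings, per the project's README examples.
quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

def quantize_once(model_path: str, quant_path: str) -> None:
    """One 'quantize model' step of the loop: load fp16 weights, run AWQ
    calibration, and write the 4-bit checkpoint to quant_path."""
    from awq import AutoAWQForCausalLM
    from transformers import AutoTokenizer

    model = AutoAWQForCausalLM.from_pretrained(model_path)
    tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
    model.quantize(tokenizer, quant_config=quant_config)
    model.save_quantized(quant_path)
    tokenizer.save_pretrained(quant_path)

# Usage (heavy step; paths are placeholders):
# quantize_once("mistralai/Mixtral-8x7B-Instruct-v0.1", "mixtral-instruct-awq")
# ...then measure perplexity on the saved checkpoint, tweak, and repeat.
```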

@casper-hansen
Owner

Mixtral support is on main now.

@pseudotensor
Author

pseudotensor commented Feb 14, 2024

Howdy @casper-hansen. Curious what you did in your AWQ process for Mixtral. Do you have your steps documented/coded? I'm asking about https://huggingface.co/casperhansen/mixtral-instruct-awq.

I ask because we checked out many AWQ Mixtrals, e.g.

  • TheBloke/dolphin-2.7-mixtral-8x7b-AWQ : Bad repetition
  • TheBloke/Mixtral-8x7B-Instruct-v0.1-AWQ : Doesn't even start to generate, as others have complained
  • ybelkada/Mixtral-8x7B-Instruct-v0.1-AWQ : Bad repetition
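A quick way to reproduce a repetition check like the ones above is to load a checkpoint in vLLM with AWQ quantization enabled. This is a sketch under assumptions: the prompt is illustrative, and the generation call is commented out because it downloads the model and needs a large GPU.

```python
# Smoke-test sketch for an AWQ Mixtral checkpoint in vLLM.
MODEL = "casperhansen/mixtral-instruct-awq"

def smoke_test(prompt: str) -> str:
    """Generate once from the AWQ checkpoint and return the text,
    so repetition or empty output is easy to eyeball."""
    from vllm import LLM, SamplingParams

    llm = LLM(model=MODEL, quantization="awq", dtype="float16")
    params = SamplingParams(temperature=0.7, max_tokens=128)
    outputs = llm.generate([prompt], params)
    return outputs[0].outputs[0].text

# Usage (heavy step; downloads the checkpoint):
# print(smoke_test("[INST] Write one sentence about quantization. [/INST]"))
```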
