Support Mixtral #259
Hi @pseudotensor, support is coming! See #251. Before the model is supported, the remaining work is in figuring out an effective scaling of its layers.
Great, looking forward to it. Love the AWQ stuff; it always works better than llama.cpp, especially for heavy use in vLLM.
That’s why I keep working on it! Admittedly, this is a very time-consuming task.
Mixtral support is on main now.
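For anyone asking about the steps: a minimal sketch of what a quantization run looks like with AutoAWQ's standard quantize API. The paths and `quant_config` values below are illustrative defaults, not necessarily the exact calibration/scaling settings behind the published checkpoint.

```python
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

# Assumed paths; the output directory name is illustrative.
model_path = "mistralai/Mixtral-8x7B-Instruct-v0.1"
quant_path = "mixtral-instruct-awq"

# Common AutoAWQ defaults; the released checkpoint may use different settings.
quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

# Load the fp16 model and its tokenizer.
model = AutoAWQForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path)

# Run AWQ calibration, quantize the weights, and save the result.
model.quantize(tokenizer, quant_config=quant_config)
model.save_quantized(quant_path)
tokenizer.save_pretrained(quant_path)
```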
Howdy @casper-hansen, curious what you did with your AWQ process for Mixtral. Do you have your steps documented/coded for https://huggingface.co/casperhansen/mixtral-instruct-awq? I ask because we checked out many AWQ Mixtrals, e.g.:
Related?
https://huggingface.co/casperhansen/mixtral-instruct-awq-it1/tree/main
9c3dfa0
Also see:
https://huggingface.co/ybelkada/Mixtral-8x7B-Instruct-v0.1-AWQ
huggingface/transformers#27950
Right now we get:
Or is this something that has to be done only in transformers? Sorry, I get confused between what is in transformers vs. here.
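For reference, loading the published checkpoint directly through transformers would look roughly like this. This is a sketch assuming a transformers version that includes the AWQ integration tracked in huggingface/transformers#27950 (with the autoawq package installed as the backend); the prompt is just an example.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "casperhansen/mixtral-instruct-awq"
tokenizer = AutoTokenizer.from_pretrained(model_id)

# transformers reads the AWQ quantization_config stored in the checkpoint
# and dispatches to the autoawq kernels automatically.
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

inputs = tokenizer("Hello, Mixtral!", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=32)[0]))
```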