[BUG] [!4bit] save_quantized TypeError: cannot pickle 'module' object #47
@FupsGamer We are going to check this. Assigned to @CSY-ModelCloud. Our unit tests only covered 4bit quants, so we need to expand the tests to 8bit.
@FupsGamer Can you check if you get a similar error with a 4bit quant?
Checking now.
Yeah, it works for 4bit; seems like it's a problem with 8bit.
Thank you for confirming the 8bit issue. We will get to the bottom of this.
Thanks! Let me know if you fix it or if there is anything else I can assist with.
Status update: the source of the bug has been found and a fix is undergoing testing. This bug affected all non-4bit quantization processes. Expect resolution in the next 12 hours.
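For context, the error in the title is the generic exception Python raises when a module object ends up inside data being pickled. The snippet below is a minimal standalone illustration of that failure class, not GPTQModel's actual code path (the `state` dict here is hypothetical):

```python
import pickle
import json  # stand-in: any imported module triggers the same failure

# Hypothetical state dict; a module object has been stored alongside plain data.
state = {"bits": 8, "helper": json}

try:
    pickle.dumps(state)
except TypeError as err:
    print(err)  # -> cannot pickle 'module' object
```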
Referenced commit (ModelCloud#49): fix cannot pickle 'module' object for 8 bit. Includes: remove unused import; remove print; check with tuple; revert to len check; add test for 8bit; set same QuantizeConfig; check if it's 4 bit; fix grammar; remove params; it's not a list; set gptqmodel_cuda back; check is tuple; format; set desc_act=True; refactor fix. Co-authored-by: Qubitium <Qubitium@modelcloud.ai>
@FrederikHandberg Fix merged to main. Please recompile from main and test again. Thanks.
@Qubitium Thanks! I have tested it, and it's true there is no longer an error, but it says "Killed" after the packaging finishes. It creates the output dir, but it's empty...
@FrederikHandberg That's not good news. Can you confirm your python, cuda, and torch versions? I will have @CSY-ModelCloud reproduce the issue mimicking your setup.
CUDA 11.8 paired with torch 2.1.0, just like the other error; this one happens when I use yesterday's commit.
@FrederikHandberg v0.9.1 has been released with all our CI unit tests passing. Please try it now and let us know. For envs with CUDA < 12.1 and with bitblas enabled in quantize_config, you will be prompted to manually compile bitblas from source.
Closing this as resolved with the 0.9.1 release. If the issue persists, feel free to re-open this issue.
I am experiencing the same error ("Killed.") when trying to run 8bit quantization on v0.9.7. I see that the 8bit test was removed in #169 - is 8bit quantization no longer supported? Thanks!
8bit should still be supported by the gptq and gptq v2 formats, using backend.TritonV2 for inference. Can you provide:
We need the full info to check for an OS-level OOM. The OS kills your process if swap and RAM cannot satisfy the RAM allocation; a VRAM OOM produces a CUDA stacktrace instead.
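One way to capture the numbers being asked for here is to print a short memory report before (and after) quantization. This is a minimal sketch using psutil and torch, not part of GPTQModel itself, and the exact thresholds that trigger the OS OOM killer depend on the machine:

```python
import psutil
import torch

def print_memory_report(tag: str) -> None:
    # System RAM and swap, as seen by the OS.
    vm = psutil.virtual_memory()
    sw = psutil.swap_memory()
    print(f"[{tag}] RAM available: {vm.available / 1e9:.1f} GB of {vm.total / 1e9:.1f} GB")
    print(f"[{tag}] swap free:     {sw.free / 1e9:.1f} GB of {sw.total / 1e9:.1f} GB")
    # GPU memory, if a CUDA device is present.
    if torch.cuda.is_available():
        free, total = torch.cuda.mem_get_info()
        print(f"[{tag}] VRAM free:     {free / 1e9:.1f} GB of {total / 1e9:.1f} GB")

print_memory_report("before quantization")
```

If the process still dies with a bare "Killed", the kernel log (for example `dmesg | grep -i oom` on Linux) usually confirms whether the OS OOM killer was responsible.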
@lukehare Please create a new issue and provide the info we asked for so we can properly track and fix your issue. This issue is closed and your error is unrelated to the original post.
@Qubitium Unfortunately I get a new error now.
I am using the same config as before, with checkout 6359c59.
@FrederikHandberg You have hit 2 issues in the tip/main code. First, you are trying to quantize 8bit, but the 4bit-only kernel/packing was selected instead; we need to test switching packing to Triton, which supports 8bit. Second, there is a vllm dependency on main (not in the official release). Vllm should be optional and should not throw this error. We will fix both of these issues. I will open 2 new issues regarding this, as the two issues are separate and no longer applicable to this old issue.
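On the second point, the usual pattern for keeping a dependency like vllm optional is a guarded import, so its absence only matters if the vLLM backend is actually requested. This is a generic sketch of that pattern, not GPTQModel's actual code; the helper name `require_vllm` is hypothetical:

```python
# Guarded optional import: a missing vllm install should not break quantization.
try:
    import vllm  # only needed when the vLLM inference backend is selected
    VLLM_AVAILABLE = True
except ImportError:
    vllm = None
    VLLM_AVAILABLE = False

def require_vllm() -> None:
    # Fail only at the point where vLLM is actually needed.
    if not VLLM_AVAILABLE:
        raise ImportError("vllm is not installed; install it to use the vLLM backend.")
```

The design point is simply that the import error is deferred to the moment the optional backend is requested, instead of being raised at package import or quantization time.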
Describe the bug
When saving the quantized model I get this error: TypeError: cannot pickle 'module' object
GPU Info
Here is the script I used
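(The script itself is not included above. As a stand-in, here is a minimal sketch of an 8bit quantize-and-save flow with GPTQModel. The model id, output path, calibration text, config values, and the exact load/quantize method names follow the AutoGPTQ-style API of the v0.9.x era and are assumptions, not the reporter's actual code; only `QuantizeConfig`, `save_quantized`, and `desc_act=True` are taken from this thread.)

```python
from gptqmodel import GPTQModel, QuantizeConfig
from transformers import AutoTokenizer

model_id = "facebook/opt-125m"        # placeholder model, not the one from the report
output_dir = "opt-125m-gptq-8bit"     # placeholder output path

quantize_config = QuantizeConfig(
    bits=8,          # the failing case in this issue; 4bit worked
    group_size=128,
    desc_act=True,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
# A tiny calibration set for illustration; real runs should use a few hundred samples.
calibration = [tokenizer("GPTQModel is a library for GPTQ quantization.")]

model = GPTQModel.from_pretrained(model_id, quantize_config)
model.quantize(calibration)

# This is the call that raised "TypeError: cannot pickle 'module' object" for 8bit.
model.save_quantized(output_dir)
tokenizer.save_pretrained(output_dir)

# For 8bit inference, the thread above points at the TritonV2 backend; exact
# enum/argument names may differ by version, so this is left as a comment:
# model = GPTQModel.from_quantized(output_dir, backend=BACKEND.TRITON_V2)
```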