
[CORE] SUPPORTED_SHARDS property to BaseQuantLinear #69

Merged 9 commits into ModelCloud:main on Jun 26, 2024

Conversation

PZS-ModelCloud (Contributor)

No description provided.

@Qubitium Qubitium changed the title add supports_sharded [CORE] SUPPORTED_SHARDS property to BaseQuantLinear Jun 26, 2024
@Qubitium Qubitium merged commit af132c9 into ModelCloud:main Jun 26, 2024
DeJoker pushed a commit to DeJoker/GPTQModel that referenced this pull request Jul 19, 2024
…ng only (ModelCloud#69)

* move ruff.toml to format folder

* only use exllamav2 to load model; v1 is only used in packing

* remove disable_exllamav2 start from from_quantized

* fix packing was using exllama(v1)

* add dynamically_import_QuantLinear_for_packing

* fix v1 was not imported

* commented max input length test

* fix v1 v2 logic

* fix import after merge

* default disable exllama v1

* remove remaining disable_exllamav2

* fix libc10.so was not loaded

* push format
DeJoker pushed a commit to DeJoker/GPTQModel that referenced this pull request Jul 19, 2024
* add supports_sharded

* rename SUPPORTED_XXX -> SUPPORTS_XXX

* mod clean up

* mod clean up

* revert SUPPORTS_MODELS change

* revert SUPPORTS_XXX -> SUPPORTED_XXX

* rename SUPPORTED_SHARDED -> SUPPORTED_SHARDS

* mod clean up
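The commit trail above ends with the flag being named `SUPPORTED_SHARDS` on `BaseQuantLinear`. A minimal sketch of what such a class-level capability flag might look like, assuming a simple boolean attribute checked before loading a sharded checkpoint (everything except the names `SUPPORTED_SHARDS` and `BaseQuantLinear` is hypothetical; this is not GPTQModel's actual implementation):

```python
# Hypothetical sketch: each quant-linear backend declares whether it can
# load sharded (multi-file) checkpoints via a class-level flag, and a
# shared check raises before a loader tries an unsupported backend.

class BaseQuantLinear:
    # Whether this kernel backend supports sharded checkpoints.
    SUPPORTED_SHARDS: bool = True

    @classmethod
    def verify_supports_shards(cls) -> None:
        """Raise if this backend cannot load sharded checkpoints."""
        if not cls.SUPPORTED_SHARDS:
            raise ValueError(
                f"{cls.__name__} does not support sharded checkpoints."
            )


class ShardAwareQuantLinear(BaseQuantLinear):
    SUPPORTED_SHARDS = True  # inherits support; explicit for clarity


class LegacyQuantLinear(BaseQuantLinear):
    SUPPORTED_SHARDS = False  # opts out of sharded loading


if __name__ == "__main__":
    ShardAwareQuantLinear.verify_supports_shards()  # passes silently
    try:
        LegacyQuantLinear.verify_supports_shards()
    except ValueError as exc:
        print(exc)
```

Keeping the flag on the class (rather than per instance) lets a model loader filter candidate backends before constructing any layers.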
@PZS-ModelCloud PZS-ModelCloud deleted the add_supports_sharded branch August 1, 2024 01:17
2 participants