Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed typo in quantization_config.py #29766

Merged
merged 1 commit into from
Mar 21, 2024
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion src/transformers/utils/quantization_config.py
Original file line number Diff line number Diff line change
Expand Up @@ -218,7 +218,7 @@ class BitsAndBytesConfig(QuantizationConfigMixin):
This flag runs LLM.int8() with 16-bit main weights. This is useful for fine-tuning as the weights do not
have to be converted back and forth for the backward pass.
bnb_4bit_compute_dtype (`torch.dtype` or str, *optional*, defaults to `torch.float32`):
This sets the computational type which might be different than the input time. For example, inputs might be
This sets the computational type which might be different than the input type. For example, inputs might be
fp32, but computation can be set to bf16 for speedups.
bnb_4bit_quant_type (`str`, *optional*, defaults to `"fp4"`):
This sets the quantization data type in the bnb.nn.Linear4Bit layers. Options are FP4 and NF4 data types
Expand Down
Loading