Refactor: GGUF metadata tokenizer (#389)
* tests: Use `cfg(test)` attribute to avoid `dead_code` warnings
  The proper way to opt out of the dead code warnings is to annotate the test module as purely for testing.
* tests: DRY codec test cases
  Extract the repetitive noise into common functions under test. This also fixes a bug in the encode test case, where the upstream encoder/decoder calls flip the meaning of the `bool` for handling special tokens.
* chore: Add `TODO` note regarding test remote data dependency
* refactor: DRY metadata extraction
  Retrieving metadata items from their hashmap `Value` enum into primitive types with error handling is very verbose and noisy. Use traits to abstract all that away. This could also benefit usage in the models' `from_gguf()` methods. Meanwhile, the unigram tokenizer has unified the special token handling at the end by keeping `unk` as a `u32` and only casting it to `usize` when actually needed.
* refactor: Extract `unigram` tokenizer out of match statement
  The special token strings are also being created in the tokenizer now. A bit awkward, but it is unclear why only `unk` was an option; presumably `bos` and `eos` may also need similar treatment to `unk`?
* chore: `rustfmt` adjustments + notes
* refactor: GGUF Unigram Tokenizer Vocab construction
* Update gguf_tokenizer.rs
* chore: Rename `MetadataContext` => `ContentMetadata`
* chore: `verify_sanity_gguf()` => `verify_arch()`
  This is a partial change; the method will be changed over in subsequent commits.
* chore: Expand GGUF `Value` enum types support
  For the quantized models to leverage. Additionally changes helper methods over to `anyhow::Error`.
* refactor: GGUF metadata - `quantized_llama.rs`
* refactor: GGUF metadata - `quantized_phi2.rs`
* refactor: GGUF metadata - `quantized_phi3.rs`
* refactor: GGUF metadata - X-LoRA llama + phi3
  - `get_gguf_max_seq_len` dropped as it is not compatible with the current approach.
  - Non-XLora models export their `PropsGGUF`, as they share the exact same code as their X-LoRA versions for this metadata.
  - Switch from `eprintln!` to `warn!` for the required props check.
  - Additional cleanup.
* tests: Skip encoder test case for special tokens
  When the encoder adds special tokens (`true`), it also runs the decoder with `false` to not skip processing special tokens. There is a mismatch in the output between the HF and GGUF tokenizers during the decode, where the GGUF output is missing an initial `<s> `. Advice is to skip this test case for now.
* Update mistralrs-core/src/pipeline/gguf_tokenizer.rs
* refactor: Use convenience enums for Decoder and Normalizer inputs
  These two enum types are similar to their equivalent upstream `*Wrapper` types, except that instead of wrapping individual structs, they take a tuple of any args and use a `TryFrom` impl to recreate the actual types (`new()` + any error handling) and then convert to the wrapped enum variant. This shifts that noise and the inconsistent API away so that the tokenizer methods are easier to grok.
* chore: Add a tokenizer builder workaround
  Similar to the enum workaround. As the upstream builder is awkward to use, an alternative one is implemented to improve the DX. The enum conversion to upstream wrapper types is handled in the builder now, simplifying usage in a tokenizer method.
* chore: `MetadataContent` path_prefix to `&str`
* tests: Skip Decoder with special tokens
  This test fails presently. It is due to the mismatch between the HF tokenizer and the GGUF tokenizer used.
* fix: Decoder tests
  This special character looks very similar to `_` but it is not. This was a mistake I introduced when converting to the local enums approach.
* tests: Replace web request with hard-coded string
* docs: Add maintenance reference comment
  Added context for this particular configuration.
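
As a rough illustration of the "DRY metadata extraction" item above, the sketch below shows one way a trait can hide the `Value`-to-primitive conversion and its error handling behind a single generic getter. It is not the code from this commit: the `Value` variants, the `FromValue` trait, the fields and `get_value` method on `ContentMetadata`, the `String` error type (the commit mentions `anyhow::Error`), and the sample values are all assumptions made for the example.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for the GGUF metadata `Value` enum (the real one has more variants).
#[derive(Debug, Clone, PartialEq)]
enum Value {
    U32(u32),
    String(String),
}

// Hypothetical trait: each primitive type knows how to extract itself from a `Value`.
trait FromValue: Sized {
    fn from_value(value: &Value) -> Result<Self, String>;
}

impl FromValue for u32 {
    fn from_value(value: &Value) -> Result<Self, String> {
        match value {
            Value::U32(v) => Ok(*v),
            other => Err(format!("expected u32, found {other:?}")),
        }
    }
}

impl FromValue for String {
    fn from_value(value: &Value) -> Result<Self, String> {
        match value {
            Value::String(v) => Ok(v.clone()),
            other => Err(format!("expected string, found {other:?}")),
        }
    }
}

// Illustrative wrapper: a key prefix plus a borrowed metadata map (field names assumed).
struct ContentMetadata<'a> {
    path_prefix: &'a str,
    metadata: &'a HashMap<String, Value>,
}

impl ContentMetadata<'_> {
    // Look up `<path_prefix>.<field>` and convert it to the requested type.
    fn get_value<T: FromValue>(&self, field: &str) -> Result<T, String> {
        let key = format!("{}.{}", self.path_prefix, field);
        let value = self
            .metadata
            .get(&key)
            .ok_or_else(|| format!("missing metadata key `{key}`"))?;
        T::from_value(value).map_err(|e| format!("`{key}`: {e}"))
    }
}

// Builds a tiny sample map (made-up values) and reads two typed properties from it.
fn example_lookup() -> Result<(u32, String), String> {
    let mut metadata = HashMap::new();
    metadata.insert("tokenizer.ggml.unknown_token_id".to_string(), Value::U32(0));
    metadata.insert("tokenizer.ggml.model".to_string(), Value::String("llama".to_string()));

    let content = ContentMetadata { path_prefix: "tokenizer.ggml", metadata: &metadata };
    let unk: u32 = content.get_value("unknown_token_id")?;
    let model: String = content.get_value("model")?;
    Ok((unk, model))
}
```

With this shape, a call site reduces to one line per property, and a missing key or a type mismatch surfaces as an error naming the full key rather than a hand-written `match` at every use.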
1 parent 44e8a22, commit 8b2d092
Showing 11 changed files with 619 additions and 280 deletions.