-
Notifications
You must be signed in to change notification settings - Fork 10.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gguf-py: Support identity operation in TensorNameMap #3095
gguf-py: Support identity operation in TensorNameMap #3095
Conversation
Make try_suffixes keyword param optional.
If I understand #3093 (comment) correctly, we don't actually need this. |
It's not specifically for that, that pull is just something that made me aware of the current less than ideal behavior and to prevent other people from having the same issue in the future. As far as I understand it, the name mapping stuff is supposed to just let translate the tensor names in a model to the GGUF convention without actually caring about the details. The fact that if it's already correct you can't is unintuitive. The pull also fixes an issue where you had to specify the suffixes to try no matter what which is weird and not what I intended. I just stupidly forgot to specify the default value for the keyword arguments. Not saying it's of earthshaking importance or anything, it just makes the interface more ergonomic and intuitive to use. I think it makes sense to do that kind of thing when it's simple/easy and doesn't break any existing API usage. |
Is there a clear use case that would take advantage of the identity mapping? I thought the conversion to GGUF was only done once when it's working correctly. |
By use case do you mean something that can't be done without that behavior? If so, no. But one could say the same thing about a lot of things. There isn't necessarily a clear use case for specifying defaults for keyword arguments either. These are just things that make working with the code easier and more ergonomic, make it less likely for people to run into issues where the behavior is unintuitive. It can also enable writing more general code instead of having to special case stuff.
This change is in the GGUF Python package, not a conversion script. It's a public API in a public package so people can use it for whatever they want, not necessarily just conversion. Presumably the GGUF Python package will also support reading GGUF files eventually. People could also find ways for the package to solve problems even without caring about GGUF files specifically. Just as example, the tensor mapping stuff could be useful to anyone that wants to deal with different flavors of models without having to figure out all the tensor types and write their own mapping stuff. |
Okay. |
Make try_suffixes keyword param optional.
edit: Just to make it a bit more clear what this is trying to do:
TensorNameMap
is used to map the assorted naming conventions for types of tensors in various models to GGUF convention. A HuggingFace LLaMA model might call the attention norm tensormodel.layers.1.input_layernorm
, the.pth
version might call it something different and so on. In GGUF it's calledblk.1.attn_norm
.However, currently
TensorNameMap
only maps the non-GGUF names to the GGUF name. If you already have the GGUF name and try to map, it'll fail. This pull just adds an entry for the GGUF-style name to the list so trying to map a name that's already correct is a no-op.Before:
After:
This also fixes an issue where you had to specify
try_suffixes
toTensorNameMap.get_name
and friends. It just sets the default value for the keyword param to an empty sequence (I meant to do this originally but apparently I messed it up).