The number of parameters and the size of the model are currently calculated from the tensors created after the model is loaded. In some cases these include duplicated tensors, which makes the reported model size inaccurate and inconsistent. To address this, `llama_model_n_params` and `llama_model_size` should be modified to return the values calculated in `llama_model_loader::n_elements` and `n_bytes`, which could be stored in `llama_model` while loading the model.
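A minimal sketch of the proposed change, not the actual llama.cpp code: every type and function here (`sketch_tensor`, `sketch_loader`, `sketch_model`, `sketch_load`, and the `dup_name` parameter used to simulate a duplicated tensor) is a hypothetical stand-in. It only illustrates how totals accumulated once per tensor by the loader, then copied into the model, stay stable even when the model's own tensor list contains duplicates.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a tensor entry; not a real llama.cpp type.
struct sketch_tensor {
    std::string name;
    uint64_t    n_elements;
    size_t      n_bytes;
};

// Hypothetical loader: accumulates totals once per tensor listed in the file.
struct sketch_loader {
    uint64_t n_elements = 0;
    size_t   n_bytes    = 0;

    void add(const sketch_tensor & t) {
        n_elements += t.n_elements;
        n_bytes    += t.n_bytes;
    }
};

// Hypothetical model: keeps the created tensors plus the loader's totals.
struct sketch_model {
    std::vector<sketch_tensor> tensors; // may contain duplicates after loading
    uint64_t n_elements = 0;            // copied from the loader
    size_t   n_bytes    = 0;            // copied from the loader
};

// Loads the model; the tensor named dup_name is created twice to simulate duplication.
void sketch_load(sketch_model & m, const std::vector<sketch_tensor> & file_tensors,
                 const std::string & dup_name) {
    sketch_loader ml;
    for (const auto & t : file_tensors) {
        ml.add(t);
        m.tensors.push_back(t);
        if (t.name == dup_name) {
            m.tensors.push_back(t); // duplicated tensor, not counted by the loader
        }
    }
    // Store the loader's totals in the model while loading.
    m.n_elements = ml.n_elements;
    m.n_bytes    = ml.n_bytes;
}

// Proposed behavior: report the stored totals.
uint64_t sketch_model_n_params(const sketch_model & m) { return m.n_elements; }
size_t   sketch_model_size    (const sketch_model & m) { return m.n_bytes; }

// Current behavior as described above: sum over the created tensors (double-counts duplicates).
uint64_t sketch_n_params_from_tensors(const sketch_model & m) {
    uint64_t n = 0;
    for (const auto & t : m.tensors) {
        n += t.n_elements;
    }
    return n;
}
```

With one duplicated tensor, `sketch_n_params_from_tensors` over-reports the count, while `sketch_model_n_params` returns the total the loader computed from the file.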
Discussed in #10274