Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bugfix: missing hparam type_vocab_size #32

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

FFengIll
Copy link

@FFengIll FFengIll commented Sep 14, 2023

  • type_vocab_size is also a hparam (can not use const as 2).
  • so does the converter.

@FFengIll
Copy link
Author

FFengIll commented Sep 14, 2023

Here is some of the model upon bert DO NOT use type_vocab_size=2 but type_vocab_size=1 (like e5).

https://huggingface.co/intfloat/multilingual-e5-base/blob/main/config.json#L25

@skeskinen
Copy link
Owner

Hi, This seems like a good change.

But surely bert.cpp:358 also needs to be changed? Where the hparams are read from the model file

@FFengIll
Copy link
Author

FFengIll commented Sep 19, 2023

Hi, This seems like a good change.

But surely bert.cpp:358 also needs to be changed? Where the hparams are read from the model file

sure, I will add it.


model.layers.resize(n_layer);

model.word_embeddings = ggml_new_tensor_2d(ctx, wtype, n_embd, n_vocab);
model.token_type_embeddings = ggml_new_tensor_2d(ctx, wtype, n_embd, 2);
model.token_type_embeddings = ggml_new_tensor_2d(ctx, wtype, n_embd, n_vocab_size);
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@skeskinen
here is a change since the tensor is related to n_vocab_size.
for many case, it is 2, but not a const.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -23,6 +23,7 @@ struct bert_hparams
int32_t n_intermediate = 1536;
int32_t n_head = 12;
int32_t n_layer = 6;
int32_t n_vocab_size = 2;
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is a break change since new field.
I believe some others will happen in future.
So we may try to shift into GGUF.

https://github.com/philpax/ggml/blob/gguf-spec/docs/gguf.md

@@ -364,6 +365,7 @@ struct bert_ctx * bert_load_from_file(const char *fname)
fin.read((char *)&hparams.n_intermediate, sizeof(hparams.n_intermediate));
fin.read((char *)&hparams.n_head, sizeof(hparams.n_head));
fin.read((char *)&hparams.n_layer, sizeof(hparams.n_layer));
fin.read((char *)&hparams.n_vocab_size, sizeof(hparams.n_vocab_size));
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so does here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants