Difference between the PyTorch-converted pre-trained BERT parameters released on Google Drive and the one obtained using the HuggingFace conversion script #1
Comments
Hi @todpole3, I found that the HuggingFace team has changed the variable names in their updated code; see lines 131 to 138 at commit b7ce9ad for the previous naming.
For compatibility, please use the old variable names. Thanks! Wonseok
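For reference, a minimal sketch of such a remap, assuming the changes are the added `bert.` prefix and a LayerNorm `gamma`/`beta` to `weight`/`bias` rename (the exact renames may differ across versions); the function and file names here are illustrative only:

```python
import torch

def remap_new_to_old(state_dict):
    # Map newer-style HuggingFace parameter names back to the older naming.
    # Assumption: the only renames are the "bert." prefix and the LayerNorm
    # gamma/beta -> weight/bias change; other versions may differ.
    remapped = {}
    for key, tensor in state_dict.items():
        if key.startswith("bert."):
            key = key[len("bert."):]  # drop the added prefix
        key = key.replace("LayerNorm.weight", "LayerNorm.gamma")
        key = key.replace("LayerNorm.bias", "LayerNorm.beta")
        remapped[key] = tensor
    return remapped

# Hypothetical file names for illustration.
new_sd = torch.load("pytorch_model.bin", map_location="cpu")
torch.save(remap_new_to_old(new_sd), "pytorch_model_old_names.bin")
```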
That makes a lot of sense. Thanks for the quick response!
I am using everything from the repo itself but am still getting this error:
What can be done?
It got solved; I used a different BERT model.
Can you share that model? I'm facing the same issue.
Can we update those changes in this repo as well?
mark
I tried to get the PyTorch pre-trained BERT checkpoint using the conversion script provided by HuggingFace. The script executed without any problems and I was able to obtain a converted binary file.
However, I noticed a few differences between this file and the PyTorch-converted pre-trained BERT parameters released on Google Drive.
First, the two files have different variable naming. The HuggingFace-converted file has the prefix `bert.` for each variable and cannot be loaded by SQLova directly. I was able to map most variables in the two files by manipulating the names and verify their equivalence, but I cannot find a mapping of the following tensors in the HuggingFace conversion to the Google Drive release; most of them are related to layer normalization.
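A rough sketch of how such a comparison can be reproduced, with placeholder checkpoint paths, by loading both files and diffing their key sets after stripping the `bert.` prefix:

```python
import torch

# Placeholder paths for the two checkpoints being compared.
drive_sd = torch.load("pytorch_model_google_drive.bin", map_location="cpu")
hf_sd = torch.load("pytorch_model_hf_converted.bin", map_location="cpu")

# Strip the "bert." prefix from the HuggingFace-converted keys before diffing.
hf_keys = {k[len("bert."):] if k.startswith("bert.") else k for k in hf_sd}
drive_keys = set(drive_sd)

print("Only in the HuggingFace conversion:", sorted(hf_keys - drive_keys))
print("Only in the Google Drive release:", sorted(drive_keys - hf_keys))
```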
May I ask what causes the above differences? Was layer normalization removed from the BERT architecture on purpose? Thanks.