Missing vectors in Spacy models #1341
Comments
That's interesting. You can always add more vectors by assigning to I'm not sure whether there was a problem with the way the vector data was pruned. This was done quite some time ago, so it's possible there was a mistake. You might check whether the In spaCy 2 (installable via
I've done some experiments with I can confirm that all of the words @znat lists are in the vocabulary of the model (there is a lexeme for each word) but none of them have vectors. In It's not catastrophic but as @znat points out the SpaCy 1.x documentation does say that the default model has vectors for a vocabulary of 1 million words and In SpaCy 2, the situation is even more confusing in that the In general, the documentation is very ambiguous about what is included in each model (global vocabulary size, how many words have vectors, etc). The release notes and documentation for both
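For anyone who wants to reproduce the check described above (a word has a lexeme in the vocabulary but no vector), here is a minimal sketch. The helper name `words_without_vectors` is made up for illustration and is not part of spaCy. It only assumes that the object passed as `vocab` supports `word in vocab` and `vocab[word]`, and that the returned lexeme exposes a `has_vector` attribute, which is how spaCy's `Vocab` and `Lexeme` behave as far as I know.

```python
def words_without_vectors(vocab, words):
    """Return the words that are in `vocab` but whose lexeme has no vector.

    `vocab` is assumed to behave like spaCy's Vocab: it supports
    `word in vocab` and `vocab[word]`, and each lexeme has `has_vector`.
    """
    missing = []
    for word in words:
        # Test membership first: indexing a spaCy Vocab with an unseen
        # string creates a new lexeme, which would hide the difference
        # between "not in the vocabulary" and "in it, but without a vector".
        if word in vocab and not vocab[word].has_vector:
            missing.append(word)
    return missing
```

Usage would look roughly like `words_without_vectors(nlp.vocab, my_words)`, where `nlp` is the loaded model and `my_words` is whatever list of words you expect to be covered (both names are placeholders). Words that are not in the vocabulary at all are skipped rather than reported, so an empty result does not by itself prove full vector coverage.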
@nsecord Thanks a lot for your detailed feedback – I definitely agree. I've opened up a v2.0 issue on this subject in #1457 and will merge these two issues so we can keep a better overview of what's still left to do before we can retrain the stable v2.0 models. #1457 also includes some suggestions – e.g. reading the vector specs off the model automatically, and including them in model's meta data. In the stable v2.0, the
But vectors are missing from words I would expect to be in that million.
Model info:
lang: en
name: core_web_md
license: CC BY-SA 3.0
author: Explosion AI
url: https://explosion.ai
source: /Users/nzylber1/anaconda/envs/rasa/lib/python2.7/site-packages/en_core_web_md/en_core_web_md-1.2.1
version: 1.2.1
spacy_version: >=1.7.0,<2.0.0
email: contact@explosion.ai