Importing fasttext models still not working #2045

Closed
quajak opened this issue May 11, 2018 · 8 comments

Comments

@quajak

quajak commented May 11, 2018

from gensim.models.wrappers import FastText
fasttext_model = FastText.load_fasttext_format('wiki-news-300d-1M.vec')
print(fasttext_model["TestTest"])

results in: NotImplementedError: Supervised fastText models are not supported

Alternative approach:
from gensim.models import KeyedVectors
fasttext_model = KeyedVectors.load_word2vec_format('wiki-news-300d-1M.vec')
print(fasttext_model["TestTest"])

results in: KeyError("word '%s' not in vocabulary" % word)

I would have expected these issues to be fixed by this update: #1916. Could you please check?

Versions

Linux-4.4.0-124-generic-x86_64-with-Ubuntu-16.04-xenial
Python 3.5.2 (default, Nov 23 2017, 16:37:01)
[GCC 5.4.0 20160609]
NumPy 1.13.3
SciPy 1.0.1
gensim 3.4.0
FAST_VERSION 1

@piskvorky
Owner

CC @manneshiva

@quajak
Author

quajak commented Jun 5, 2018

Has this been fixed?

@menshikh-iv
Contributor

@quajak
That is the expected behavior. "NotImplementedError: Supervised fastText models are not supported" is raised because we really do not support dumps of supervised fastText models.
As for KeyedVectors: it stores whole words only (no n-grams), so the word "TestTest" really is missing from its vocabulary.
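
To illustrate the difference between whole-word and n-gram storage: fastText represents a word through its character n-grams, which is what lets a native fastText model build a vector for an unseen word, while a words-only table has nothing to look up. The following is a minimal sketch, not gensim's or fastText's actual code. The helper name char_ngrams is hypothetical, and the 3 to 6 length range is assumed from fastText's default minn/maxn settings. The real implementation also hashes n-grams into buckets, which is omitted here.

```python
def char_ngrams(word, minn=3, maxn=6):
    """Return the character n-grams of `word`, fastText-style.

    Hypothetical helper for illustration only. The word is wrapped in
    '<' and '>' boundary markers before the n-grams are extracted.
    """
    wrapped = "<" + word + ">"
    ngrams = []
    for n in range(minn, maxn + 1):
        # Slide a window of width n over the wrapped word.
        for start in range(len(wrapped) - n + 1):
            ngrams.append(wrapped[start:start + n])
    return ngrams


# Trigrams of "where", the example used in the fastText paper.
print(char_ngrams("where", minn=3, maxn=3))
# → ['<wh', 'whe', 'her', 'ere', 're>']
```

An unseen word such as "TestTest" still decomposes into n-grams like "<Te" and "est", so a model that kept its n-gram vectors can combine them, whereas a .vec file loaded into KeyedVectors has no entry to fall back on.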

@piskvorky
Owner

@menshikh-iv but fastText should support OOV (out-of-vocabulary) words. Can you point @quajak to how to load the fastText model properly, including OOV support?

@menshikh-iv
Contributor

@quajak you should use FastText.load_fasttext_format, but with an unsupervised model.
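
A minimal sketch of what that could look like, assuming the gensim 3.x import path used earlier in this thread and a native fastText .bin file produced by unsupervised training. The helper name load_unsupervised_fasttext and the example path are hypothetical. Check the fastText module docstrings of your installed gensim version, since the loading API may differ there.

```python
import os


def load_unsupervised_fasttext(bin_path):
    """Load a native fastText binary model (hypothetical helper).

    Rejects .vec files up front: they contain whole-word vectors only,
    so they cannot provide n-gram based lookups for OOV words.
    """
    if bin_path.endswith(".vec"):
        raise ValueError(".vec files hold word vectors only; pass the .bin file")
    if not os.path.isfile(bin_path):
        raise FileNotFoundError(bin_path)
    # Import path as used in this issue (gensim 3.4.0); imported lazily
    # so the path checks above work even without gensim installed.
    from gensim.models.wrappers import FastText
    return FastText.load_fasttext_format(bin_path)


# Usage (assumed path, not from this thread):
# model = load_unsupervised_fasttext("/path/to/model.bin")
# vector = model.wv["TestTest"]  # OOV lookup built from character n-grams
```

The .vec file can still be loaded with KeyedVectors.load_word2vec_format, as in the original post, but only for words that are in its vocabulary.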

@menshikh-iv
Contributor

@quajak a more detailed answer to a similar question: piskvorky/gensim-data#26 (comment). I hope it helps.

@shadylpstan

@piskvorky did you find a good way to deal with OOV words?

@piskvorky
Owner

@shadylpstan see the comments above. We also clarified the fastText loading instructions recently; check out the fastText module docstrings. CC @mpenkov.
