Using pre-trained word2vec models in doc2vec #1338
You can manually patch up a model to insert word-vectors from elsewhere before training. That said, I personally don't think the case for such re-use is yet strong – indeed, in some often top-performing Doc2Vec training modes (like pure PV-DBOW), input word-vectors aren't trained or used at all, so loading them would be completely superfluous. You can see some discussion of related issues, including links to messages elsewhere, in the GitHub issue thread: #1270 (comment)
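For illustration, a minimal sketch of that manual patch-up, assuming gensim 4.x attribute names (`wv.key_to_index`, `wv.vectors`; older releases used `wv.vocab`/`wv.syn0`) and a hypothetical pretrained-vectors file (`word2vec.bin`) and toy corpus:

```python
from gensim.models import Doc2Vec, KeyedVectors
from gensim.models.doc2vec import TaggedDocument

# Hypothetical inputs -- substitute your own corpus and pretrained vectors file.
raw_docs = ["human machine interface", "graph of trees", "survey of user time"]
corpus = [TaggedDocument(words=d.split(), tags=[i]) for i, d in enumerate(raw_docs)]
pretrained = KeyedVectors.load_word2vec_format("word2vec.bin", binary=True)

# PV-DM (dm=1) actually consumes input word-vectors; in pure PV-DBOW they'd be unused.
model = Doc2Vec(vector_size=pretrained.vector_size, dm=1, min_count=1, epochs=20)
model.build_vocab(corpus)

# Overwrite the randomly initialized input word-vectors for words
# present in both vocabularies, then train as usual.
for word, idx in model.wv.key_to_index.items():
    if word in pretrained:
        model.wv.vectors[idx] = pretrained[word]

model.train(corpus, total_examples=model.corpus_count, epochs=model.epochs)
```

(gensim 3.x also shipped an `intersect_word2vec_format()` method that performed a similar merge, with a `lockf` lock-factor controlling whether the imported vectors are further trained; I'm not certain of its status in later releases.)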
This fork supports the latest gensim 3.8 and can train a doc2vec model with pretrained word2vec vectors.
As per the above, I think the evidence for the benefit of such a technique is muddled. Also, it should be possible simply by poking/prodding a standard model at the right points between instantiation and training – without any major changes or new parameters to the relevant models, or a forked version of gensim (which will drift further away from other changes/fixes over time).
Is there a practical way of using pre-trained word2vec models in doc2vec?
There is a forked version of Gensim that does it, but it is pretty old.
Referenced here: https://github.com/jhlau/doc2vec
Forked Gensim here: https://github.com/jhlau/gensim
Otherwise, I would like to add this feature as jhlau did and merge it back.