Commit
Vectorize word2vec.predict_output_word for speed (#3153)
* [Fix] gensim/models/word2vec.py: in method predict_output_word, changed a call to sum to numpy.sum to gain performance.
* [Feat] gensim.models.word2vec.Word2Vec.predict_output_word: added possibility for the user to input a list of word indices as parameter 'context' instead of a list of words.
* Word2Vec.predict_output_word: Changed handling of ints and strs, trying to make it more compact and versatile.
* Fixed docstring of predict_output_word.
* Simplified `predict_output_word` changes.
* Retained the suggested `sum`->`np.sum` replacement, which has been tested to yield significant runtime gains.
* Dropped unnecessary type/value checks that are already run when calling the `KeyedVectors.__isin__` dunder method.
* Corrected the docstring to accurately document the supported inputs (which were already compatible prior to the PR this commit is a part of).
* Added tests for gensim.Word2Vec.predict_output_word() when context contains ints.
* Update CHANGELOG.md
* update sbt install step

Co-authored-by: Mathis <[email protected]>
Co-authored-by: Paul Andrey <[email protected]>
Co-authored-by: Mathis Demay <[email protected]>
Co-authored-by: Michael Penkov <[email protected]>
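To illustrate the `sum` -> `np.sum` replacement named in the commit message, here is a minimal sketch on plain NumPy arrays. It does not call gensim; the array name `context_vectors` and its shape are assumptions standing in for the vectors of the context words.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the context word vectors: 5 words x 100 dimensions.
context_vectors = rng.standard_normal((5, 100)).astype(np.float32)

# Built-in sum iterates over the rows in Python and adds them one at a time.
l1_builtin = sum(context_vectors)

# np.sum with axis=0 performs the same column-wise reduction in one vectorized call.
l1_numpy = np.sum(context_vectors, axis=0)

# Both give a single 100-dimensional vector; values agree up to float rounding.
same = np.allclose(l1_builtin, l1_numpy, atol=1e-5)
```

Both forms reduce the stack of vectors to one vector of the same dimensionality. Any speed difference depends on the number of context words and the vector size, so it is worth timing on your own data.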
1 parent a93067d, commit b287fd8
Showing 4 changed files with 18 additions and 5 deletions.