Fix method estimate_memory from gensim.models.FastText & huge performance improvement. Fix #1824 #1916

Merged: 22 commits, Mar 1, 2018
Changes shown from 21 commits

Commits
3db9c63
Cythonize fasttext.ft_hash for 100x performance improvement
jbaiter Jan 3, 2018
9f3428a
Cythonize fasttext.compute_ngrams for 2x performance improvement
jbaiter Jan 3, 2018
51a1a6e
Reduce fasttext memory usage by computing ngrams on the fly
jbaiter Jan 3, 2018
f467ab9
Fix compute_ngrams for Python 2
jbaiter Feb 22, 2018
783114a
Merge branch 'develop' into fasttext-optimization
jbaiter Feb 22, 2018
5c576ad
Store OOV vec in variable for more informative assertion error in tes…
jbaiter Feb 22, 2018
0a5912c
Revert all changes to fasttext_wrapper
jbaiter Feb 22, 2018
9a36b08
Fix indentation for multi-line expressions
jbaiter Feb 23, 2018
764071b
Rename utils_any2vec_fast to _utils_any2vec
jbaiter Feb 23, 2018
722cdda
Merge remote-tracking branch 'upstream/develop' into fasttext-optimiz…
jbaiter Feb 23, 2018
c6f347e
fasttext: Cache ngram buckets for words during training
jbaiter Feb 26, 2018
1d86111
Remove last occurences of wv.ngrams_word and wv.ngrams
jbaiter Feb 26, 2018
6aaab0a
fasttext: use buckets_word cache also for non-Cython training
jbaiter Feb 28, 2018
85679ed
fasttext: Add buckets_ngram size to memory estimate
jbaiter Feb 28, 2018
e574e90
fasttext: Don't store buckets_word with the model
jbaiter Feb 28, 2018
2a090c6
fasttext: Use smaller model for test_estimate_memory
jbaiter Feb 28, 2018
33968dc
fasttext: Fix pure python training code
jbaiter Feb 28, 2018
76a0675
fasttext: Fix asserts for test_estimate_memory
jbaiter Feb 28, 2018
0a2ae3c
fasttext: Fix typo and style errors
jbaiter Feb 28, 2018
0fe0f80
fasttext: Simplify code as per @jayantj's review
jbaiter Feb 28, 2018
7cb46e3
Update MANIFEST.in and documentation with utils_any2vec implementations
jbaiter Feb 28, 2018
dcc0857
last fixes (add option for cython compiler, fix descriptions, etc)
menshikh-iv Mar 1, 2018
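
For readers skimming the commit list, here is a rough pure-Python sketch of what the first three commits touch: hashing an ngram, enumerating a word's character ngrams, and mapping them to bucket ids on the fly instead of storing the ngrams. It is not the Cython code from this PR. The function names `ft_hash` and `compute_ngrams` come from the commit messages; `ngram_buckets`, its parameters, and the choice to hash Unicode code points (which may differ from reference fastText on non-ASCII input) are assumptions made for illustration.

```python
# Illustrative sketch only: plain-Python stand-ins, not the PR's Cython code.

FNV_OFFSET_BASIS = 2166136261  # 32-bit FNV-1a offset basis
FNV_PRIME = 16777619           # 32-bit FNV prime


def ft_hash(string):
    """FNV-1a style 32-bit hash over the characters of `string`.

    Assumption: characters are hashed by code point, one at a time.
    """
    h = FNV_OFFSET_BASIS
    for char in string:
        h ^= ord(char)
        h = (h * FNV_PRIME) & 0xFFFFFFFF  # emulate unsigned 32-bit overflow
    return h


def compute_ngrams(word, min_n, max_n):
    """Return all character ngrams of '<' + word + '>' with min_n <= length <= max_n."""
    extended = "<" + word + ">"
    ngrams = []
    for n in range(min_n, min(len(extended), max_n) + 1):
        for start in range(len(extended) - n + 1):
            ngrams.append(extended[start:start + n])
    return ngrams


def ngram_buckets(word, min_n, max_n, num_buckets):
    """Hypothetical helper: bucket ids computed on the fly, no ngram table kept."""
    return [ft_hash(ng) % num_buckets for ng in compute_ngrams(word, min_n, max_n)]


if __name__ == "__main__":
    print(compute_ngrams("cat", 3, 4))
    print(ngram_buckets("cat", 3, 4, 2000000))
```

Recomputing bucket ids this way trades a little CPU for not keeping a per-ngram dictionary in memory, which is presumably why the hash and ngram functions were moved to Cython in the same PR.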
2 changes: 2 additions & 0 deletions MANIFEST.in
@@ -12,6 +12,8 @@ include gensim/models/doc2vec_inner.c
 include gensim/models/doc2vec_inner.pyx
 include gensim/models/fasttext_inner.c
 include gensim/models/fasttext_inner.pyx
+include gensim/models/_utils_any2vec.c
+include gensim/models/_utils_any2vec.pyx
 include gensim/corpora/_mmreader.c
 include gensim/corpora/_mmreader.pyx
 include gensim/_matutils.c
2 changes: 2 additions & 0 deletions docs/src/apiref.rst
@@ -51,6 +51,8 @@ Modules:
     models/coherencemodel
     models/basemodel
     models/callbacks
+    models/utils_any2vec
+    models/_utils_any2vec
     models/wrappers/ldamallet
     models/wrappers/dtmmodel
     models/wrappers/ldavowpalwabbit.rst
9 changes: 9 additions & 0 deletions docs/src/models/_utils_any2vec.rst
@@ -0,0 +1,9 @@
+:mod:`models._utils_any2vec` -- Cython utils for any2vec models
+===============================================================
+
+.. automodule:: gensim.models._utils_any2vec
+    :synopsis: Cython utils for any2vec models
+    :members:
+    :inherited-members:
+    :undoc-members:
+    :show-inheritance:
9 changes: 9 additions & 0 deletions docs/src/models/utils_any2vec.rst
@@ -0,0 +1,9 @@
+:mod:`models.utils_any2vec` -- Utils for any2vec models
+=======================================================
+
+.. automodule:: gensim.models.utils_any2vec
+    :synopsis: Utils for any2vec models
+    :members:
+    :inherited-members:
+    :undoc-members:
+    :show-inheritance: