clarify runtime expectations #3381

Merged
merged 4 commits into from
Dec 7, 2022
12 changes: 9 additions & 3 deletions docs/src/gallery/tutorials/run_doc2vec_lee.py
@@ -215,9 +215,15 @@ def read_corpus(fname, tokens_only=False):

###############################################################################
# Next, train the model on the corpus.
-# If optimized Gensim (with BLAS library) is being used, this should take no more than 3 seconds.
-# If the BLAS library is not being used, this should take no more than 2
-# minutes, so use optimized Gensim with BLAS if you value your time.
+# In the usual case, where Gensim installation found a BLAS library for optimized
+# bulk vector operations, this training on this tiny 300 document, ~60k word corpus
+# should take just a few seconds. (More realistic datasets of tens-of-millions
+# of words or more take proportionately longer.) If for some reason a BLAS library
+# isn't available, training uses a fallback approach that takes 60x-120x longer,
+# so even this tiny training will take minutes rather than seconds. (And, in that
+# case, you should also notice a warning in the logging letting you know there's
+# something worth fixing.) So, be sure your installation uses the BLAS-optimized
+# Gensim if you value your time.
@piskvorky (Owner), Aug 17, 2022:

A concrete set of instructions for how to actually perform such a check would be helpful here, e.g. through the FAST_VERSION attribute.

I'd also personally drop all the brackets, but I realize that's your writing style :)
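
One way the check suggested above could look is sketched below. It is a rough illustration, not the wording the PR ended up with. It assumes the attribute is reachable as `gensim.models.word2vec.FAST_VERSION` and that a negative value signals the slow fallback while zero or above signals the optimized routines; both points should be confirmed against the installed Gensim version. The helper names `read_fast_version` and `describe_fast_version` are hypothetical.

```python
import importlib


def describe_fast_version(fast_version):
    """Turn a FAST_VERSION value into a short human-readable verdict.

    Assumption: negative means the slow fallback, zero or above means
    the optimized routines are in use.
    """
    if fast_version is None:
        return "unknown: gensim or its FAST_VERSION attribute is unavailable"
    if fast_version < 0:
        return "slow: fallback training path, expect much longer runtimes"
    return "fast: optimized training routines are in use"


def read_fast_version():
    """Return gensim.models.word2vec.FAST_VERSION, or None if it cannot be read."""
    try:
        module = importlib.import_module("gensim.models.word2vec")
    except Exception:
        # Gensim is not installed, or its compiled extensions failed to load.
        return None
    return getattr(module, "FAST_VERSION", None)


if __name__ == "__main__":
    print(describe_fast_version(read_fast_version()))
```

Running this before `model.train(...)` would let a reader tell ahead of time whether to expect seconds or minutes, rather than inferring it from the elapsed time.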

#
model.train(train_corpus, total_examples=model.corpus_count, epochs=model.epochs)
