-
-
Notifications
You must be signed in to change notification settings - Fork 4.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix Issue #743, Updated word2vec.n_similarities and test_word2vec.testSimilarities methods #883
Changes from 16 commits
809f2ba
3084f6b
14ce73f
552cd59
a826db2
b77ac47
ec51a48
1288dcf
8bc778c
3398e5a
eea3590
5235bb6
8f3e51c
306b2ec
ec49724
dc4df1c
9285ef9
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,11 +1,14 @@ | ||
Changes | ||
======= | ||
|
||
0.13.3, 2016-09-26 | ||
|
||
* Fixed issue #743 , In word2vec's n_similarity method if atleast one empty list is passed ZeroDivisionError is raised, added test cases in test/test_word2vec.py(@pranay360, #883) | ||
|
||
0.13.2, 2016-08-19 | ||
* export_phrases in Phrases model changed. Fixed issue #794 and added test cases in test/test_phrases.py(@AadityaJ, | ||
[#879](https://github.com/RaRe-Technologies/gensim/pull/879)) | ||
- bigram construction can now support multiple bigrams within one sentence | ||
* wordtopics has changed to word_topics in ldamallet, and fixed issue #764. (@bhargavvader, [#771](https://github.com/RaRe-Technologies/gensim/pull/771)) | ||
|
||
* export_phrases in Phrases model changed. Fixed issue #794 and added test cases in test/test_phrases.py(@AadityaJ, #879) bigram construction can now support multiple bigrams within one sentence | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Please add to future release 0.13.3 There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is this change necessary? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. are you talking about fixing the issue? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This pr is about #743. There is no need to change other parts of changelog.md |
||
* wordtopics has changed to word_topics in ldamallet, and fixed issue #764. (@bhargavvader, [#771](https://github.com/RaRe-Technologies/gensim/pull/771)) | ||
- assigning wordtopics value of word_topics to keep backward compatibility, for now | ||
* topics, topn parameters changed to num_topics and num_words in show_topics() and print_topics()(@droudy, [#755](https://github.com/RaRe-Technologies/gensim/pull/755)) | ||
- In hdpmodel and dtmmodel | ||
|
@@ -47,7 +50,7 @@ Changes | |
* Control whether to use lowercase for computing word2vec accuracy. (@alantian, #607) | ||
* Easy import of GloVe vectors using Gensim (Manas Ranjan Kar, #625) | ||
- Allow easy port of GloVe vectors into Gensim | ||
- Standalone script with command line arguments, compatible with Python>=2.6 | ||
- Standalone script with command line arguments, compatible with Python>=2.6 | ||
- Usage: python -m gensim.scripts.glove2word2vec -i glove_vectors.txt -o output_word2vec_compatible.txt | ||
* Add `similar_by_word()` and `similar_by_vector()` to word2vec (@isohyt, #381) | ||
* Convenience method for similarity of two out of training sentences to doc2vec (@ellolo, #707) | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1539,9 +1539,13 @@ def n_similarity(self, ws1, ws2): | |
True | ||
|
||
""" | ||
if not(len(ws1) and len(ws2)): | ||
raise ZeroDivisionError('Atleast one of the passed list is empty.') | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If input is a list, idiomatic Python is: |
||
v1 = [self[word] for word in ws1] | ||
v2 = [self[word] for word in ws2] | ||
return dot(matutils.unitvec(array(v1).mean(axis=0)), matutils.unitvec(array(v2).mean(axis=0))) | ||
return dot(matutils.unitvec(array(v1).mean(axis=0)), | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We don't use vertical indent in gensim -- change to normal hanging indent (see PEP8: all statements on new lines, one level of indent). |
||
matutils.unitvec(array(v2).mean(axis=0))) | ||
|
||
|
||
def init_sims(self, replace=False): | ||
""" | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please don't delete from changelog of phrases fix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't delete it, I had a previous build in which that entry was not in the changelog, anyway I added it. Thank You for marking out the changes.