-
-
Notifications
You must be signed in to change notification settings - Fork 4.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix Issue #743: Added exception handling for n_similarity method #882
Conversation
Fix: Word2vec n_similarity returning numpy matrix instead of float with empty list issue #743 in n_similarity() method, if atleast one empty list is passed, it raises ZeroDivisionError.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a test and changelog as well
v2 = [self[word] for word in ws2] | ||
return dot(matutils.unitvec(array(v1).mean(axis=0)), matutils.unitvec(array(v2).mean(axis=0))) | ||
try: | ||
assert(len(ws1)>0 and len(ws2)>0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use pep8
v2 = [self[word] for word in ws2] | ||
return dot(matutils.unitvec(array(v1).mean(axis=0)), matutils.unitvec(array(v2).mean(axis=0))) | ||
except AssertionError: | ||
raise ZeroDivisionError("Atleast one of the passed list is empty.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please raise ZeroDivisionError directly above. No need for assert.
replace try and catch AssertionError with simple if and else.
Added new test cases in testSimilarities method which makes sure whether ZeroDivisionError is raised if atleast one empty list is passed to word2vec.n_similarities method
@@ -1,8 +1,9 @@ | |||
Changes | |||
======= | |||
|
|||
0.13.2, 2016-08-19 | |||
0.13.2, 2016-09-25 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't look correct -- past release dates cannot change retroactively.
v1 = [self[word] for word in ws1] | ||
v2 = [self[word] for word in ws2] | ||
return dot(matutils.unitvec(array(v1).mean(axis=0)), matutils.unitvec(array(v2).mean(axis=0))) | ||
if len(ws1) > 0 and len(ws2) > 0: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In Python, emptiness of collections is tested simply with if x:
if len(ws1) > 0 and len(ws2) > 0: | ||
v1 = [self[word] for word in ws1] | ||
v2 = [self[word] for word in ws2] | ||
return dot(matutils.unitvec(array(v1).mean(axis=0)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't use vertical indent; please use hanging indent (see PEP8 example).
matutils.unitvec(array(v2).mean(axis=0))) | ||
else: | ||
raise ZeroDivisionError('Atleast one of the passed list is empty.') | ||
return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return
makes no sense after a raise
...
@@ -370,9 +370,12 @@ def testSimilarities(self): | |||
model = word2vec.Word2Vec(size=2, min_count=1, sg=0, hs=0, negative=2) | |||
model.build_vocab(sentences) | |||
model.train(sentences) | |||
|
|||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No trailing whitespace.
Fix: Word2vec n_similarity returning numpy matrix instead of float with empty list issue #743
in n_similarity() method, if atleast one empty list is passed, it raises ZeroDivisionError.
https://patch-diff.githubusercontent.com/raw/RaRe-Technologies/gensim/pull/882.diff