Correctly process empty documents in AuthorTopicModel
#2133
Merged
Changes from 5 commits
6 commits
accc625
test for #1589
probinso e3e47ef
bugfix #1589
probinso 7b7633d
Merge branch 'develop' into fix_1589
probinso db74531
ignore unused assigned varaible
probinso 8aa04b2
PR review
probinso ddf8dec
Update test_atmodel.py
probinso
@@ -35,7 +35,6 @@
# increases the bound.
# Test that models are compatiple across versions, as done in LdaModel.


# Assign some authors randomly to the documents above.
author2doc = {
    'john': [0, 1, 2, 3, 4, 5, 6],

@@ -110,6 +109,17 @@ def testBasic(self):
        jill_topics = matutils.sparse2full(jill_topics, model.num_topics)
        self.assertTrue(all(jill_topics > 0))

    def testEmptyDocument(self):
        local_texts = common_texts + [['only_occurs_once_in_corpus_and_alone_in_doc']]
        dictionary = Dictionary(local_texts)
        dictionary.filter_extremes(no_below=2)
        corpus = [dictionary.doc2bow(text) for text in local_texts]
        a2d = author2doc.copy()
        a2d['joaquin'] = [len(local_texts) - 1]

        _ = self.class_(corpus, author2doc=a2d, id2word=dictionary, num_topics=2)
        assert(_)

Better to retrieve vector for any document or corpus (instead of assertion) as "sanity check" action, because

    def testAuthor2docMissing(self):
        # Check that the results are the same if author2doc is constructed automatically from doc2author.
        model = self.class_(
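The way `testEmptyDocument` ends up with an empty document can be sketched in plain Python. This is only an illustration of the mechanism, not gensim's implementation: `build_vocab` and `to_bow` are hypothetical stand-ins for `Dictionary.filter_extremes(no_below=2)` and `Dictionary.doc2bow`, the sample texts are made up rather than taken from `common_texts`, and the sketch ignores the other `filter_extremes` parameters.

```python
from collections import Counter


def build_vocab(texts, no_below=2):
    """Hypothetical helper: keep tokens that occur in at least `no_below` documents."""
    doc_freq = Counter(token for text in texts for token in set(text))
    kept = sorted(token for token, df in doc_freq.items() if df >= no_below)
    return {token: idx for idx, token in enumerate(kept)}


def to_bow(text, vocab):
    """Hypothetical helper: map tokens to sorted (token_id, count) pairs, dropping unknown tokens."""
    counts = Counter(vocab[token] for token in text if token in vocab)
    return sorted(counts.items())


# Made-up sample corpus; the last document holds a single token that appears nowhere else.
texts = [
    ["human", "computer", "interface"],
    ["computer", "survey", "user"],
    ["user", "interface", "system"],
    ["only_occurs_once_in_corpus_and_alone_in_doc"],
]

vocab = build_vocab(texts, no_below=2)
corpus = [to_bow(text, vocab) for text in texts]

# The unique token was filtered out of the vocabulary, so the last document is empty.
print(corpus[-1])  # prints []
```

Because the only token of the last document falls below the document-frequency threshold, its bag-of-words is an empty list, which is the input the model has to handle for the author assigned to that document.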
I'm not familiar with this `np.integer` type. How does it differ from normal `np.int`? What's the difference, why use one or the other?
No difference in our case
all of it "casted" to `int64` on my x64 linux
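For readers with the same question, here is a small sketch of how `np.integer` behaves, assuming a reasonably recent NumPy. `np.integer` is the abstract parent class of NumPy's integer scalar types, whereas `np.int` was only an alias for Python's built-in `int` and was removed in NumPy 1.24, so the sketch uses `int` directly. The helper `is_integer_scalar` is a hypothetical name for illustration and does not come from the PR.

```python
import numpy as np


def is_integer_scalar(value):
    """Hypothetical helper: True for Python ints and NumPy integer scalars, False for bools."""
    if isinstance(value, (bool, np.bool_)):
        return False
    return isinstance(value, (int, np.integer))


# np.integer is the abstract parent of every fixed-width NumPy integer type.
for scalar_type in (np.int8, np.int32, np.int64, np.uint16):
    assert issubclass(scalar_type, np.integer)

# Floating-point types are not part of that hierarchy.
assert not issubclass(np.float64, np.integer)

# An element taken from an integer array is a NumPy scalar, so checking
# against np.integer accepts it whatever its width is on the platform.
element = np.arange(3)[0]
print(type(element).__name__, is_integer_scalar(element))
```

Checking against `(int, np.integer)` avoids depending on which concrete width the platform picks for array elements, such as the `int64` reported above.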