[WIP] HDP #1055
@tmylk, only Python 2.6 has failed with
That test error will be fixed once #1056 is merged in
@@ -56,6 +56,21 @@ def dirichlet_expectation(alpha):
    return(sp.psi(alpha) - sp.psi(np.sum(alpha, 1))[:, np.newaxis])


def get_random_state(seed):
Why copy-paste this from LdaModel? Should it be moved to utils?
I'll move the common `ldamodel` and `hdpmodel` methods to the respective `utils` and `matutils` files after this PR is merged.
I need to make a few other changes to `utils` and `matutils` as well.
This should be done as a part of this PR. Duplicate code will not be merged.
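To illustrate the de-duplication requested in this thread, here is a minimal sketch of a single shared helper that both models could import instead of each carrying a copy. The integer and `RandomState` branches mirror the diff quoted in this PR. The `None` branch, the tail of the error message, and the `LdaLike` / `HdpLike` classes are assumptions made purely for illustration, not gensim's actual code.

```python
import numbers

import numpy


def get_random_state(seed):
    """Turn `seed` into a numpy.random.RandomState instance (shared helper sketch)."""
    if seed is None:
        # Assumption: with no seed given, hand back a fresh, unseeded state.
        return numpy.random.RandomState()
    if isinstance(seed, (numbers.Integral, numpy.integer)):
        return numpy.random.RandomState(seed)
    if isinstance(seed, numpy.random.RandomState):
        return seed
    raise ValueError(
        '%r cannot be used to seed a numpy.random.RandomState instance' % (seed,)
    )


class LdaLike:
    """Hypothetical stand-in for a model that needs a random state."""

    def __init__(self, random_state=None):
        self.random_state = get_random_state(random_state)


class HdpLike:
    """Second hypothetical model reusing the same helper instead of a copy."""

    def __init__(self, random_state=None):
        self.random_state = get_random_state(random_state)
```

With one definition, a fix to the helper (such as the indentation issue raised later in this review) only has to be made once.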
@tmylk tests pass!
@tmylk what else would you want done on this PR?
Please remove duplicate code
    def suggested_lda_model(self):
        """
        Returns the closest ldamodel object corresponding to the current hdp model.
        The num_topics is m_T (default is 150) so as to preserve the matrix shapes when we assign alpha and beta.
How is it different from `hdp_to_lda`? Add a comment.
Removed duplicate code, added comment.
I've moved the
@tmylk I've addressed all your comments.
Thanks for the PR!
        return numpy.random.RandomState(seed)
    if isinstance(seed, numpy.random.RandomState):
        return seed
    raise ValueError('%r cannot be used to seed a numpy.random.RandomState'
No vertical indent in gensim, please use hanging indent.
@tmylk this keeps happening over and over -- watch out for this in reviews.
My bad, I had just copy-pasted this from the existing `ldamodel` code and missed this. Fixing it in a new PR where I make some changes to `utils`.
@bhargavvader Thanks!
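For readers unfamiliar with the two styles contrasted in this thread, the sketch below shows the same wrapped expression with a vertical (aligned) indent and with a hanging indent. The function names are hypothetical, and the tail of the message string is an assumption, since the quoted diff cuts off before the continuation line.

```python
def message_vertical(seed):
    # Vertical indent: the continuation line is aligned under the opening bracket.
    return ('%r cannot be used to seed a '
            'numpy.random.RandomState instance' % (seed,))


def message_hanging(seed):
    # Hanging indent: break right after the opening bracket and indent
    # the contents by one level, with the closing bracket on its own line.
    return (
        '%r cannot be used to seed a '
        'numpy.random.RandomState instance' % (seed,)
    )
```

Both functions build the same string; only the layout differs. The hanging form does not need re-aligning when the text before the bracket changes length.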
* Added print methods, lda_model
* Added HDP tests
* Changelog
* Removed duplicate code
* Removed duplicate code
* Added import
* Fixed Changelog
This addresses issues #901 and #952, and goes towards fixing #945: basically, an attempt to clean up HDP as much as possible.
This includes only the HDP changes from the closed #996.
I'll make the other cosmetic changes, and move the appropriate methods to `utils` and `matutils`, in a different PR.