Fix docstrings for lsi-related code (#1892)
* Added numpy style docstrings to all functions, methods and classes in the LsiModel module. Types need to be checked before merging
* Fixed generic container type (stream/list -> iterable) and added expected shape for sparse matrix arguments
* Fix PEP-8
* Applied corrections mentioned in code review.
* General class level remarks moved to class docstring from `__init__`
* References to `gensim` classes are now using `sphinx` notation.
* Numpy parameters annotated with `np.type` instead of `type`.
* Added docstrings for `lsi_worker` and `lsi_dispatcher`
* added argument parsing and fixed __doc__
* update configs with new extension sphinxcontrib.programoutput
* added blank link in __doc__
* sphinx indentation fix
* chmod revert
* fix lsimodel[1]
* fix lsimodel[2]
* fix lsimodel[3]
* fix lsimodel[4]
* fix basemodel
* fixes
* fix lsi_worker & missing fields in .rst
* last fixes for worker & dispatcher
* add missing link
1 parent 0659c10, commit 0db8796
Showing 6 changed files with 908 additions and 484 deletions.
@@ -5,4 +5,5 @@
:synopsis: Dispatcher for distributed LSI
:members:
:inherited-members:

:undoc-members:
:show-inheritance:
@@ -5,4 +5,5 @@
:synopsis: Worker for distributed LSI
:members:
:inherited-members:

:undoc-members:
:show-inheritance:
@@ -1,24 +1,51 @@
class BaseTopicModel(object):
    def print_topic(self, topicno, topn=10):
        """
        Return a single topic as a formatted string. See `show_topic()` for parameters.
        """Get a single topic as a formatted string.
        Parameters
        ----------
        topicno : int
            Topic id.
        topn : int
            Number of words from topic that will be used.
        >>> lsimodel.print_topic(10, topn=5)
        '-0.340 * "category" + 0.298 * "$M$" + 0.183 * "algebra" + -0.174 * "functor" + -0.168 * "operator"'
        Returns
        -------
        str
            String representation of topic, like '-0.340 * "category" + 0.298 * "$M$" + 0.183 * "algebra" + ... '.
        """
        return ' + '.join(['%.3f*"%s"' % (v, k) for k, v in self.show_topic(topicno, topn)])

    def print_topics(self, num_topics=20, num_words=10):
        """Alias for `show_topics()` that prints the `num_words` most
        probable words for `topics` number of topics to log.
        Set `topics=-1` to print all topics."""
        """Get the most significant topics (alias for `show_topics()` method).
        Parameters
        ----------
        num_topics : int, optional
            The number of topics to be selected, if -1 - all topics will be in result (ordered by significance).
        num_words : int, optional
            The number of words to be included per topics (ordered by significance).
        Returns
        -------
        list of (int, list of (str, float))
            Sequence with (topic_id, [(word, value), ... ]).
        """
        return self.show_topics(num_topics=num_topics, num_words=num_words, log=True)

    def get_topics(self):
        """
        Returns:
        np.ndarray: `num_topics` x `vocabulary_size` array of floats which represents
        the term topic matrix learned during inference.
        """Get words X topics matrix.
        Returns
        --------
        numpy.ndarray:
            The term topic matrix learned during inference, shape (`num_topics`, `vocabulary_size`).
        Raises
        ------
        NotImplementedError
        """
        raise NotImplementedError
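
To make the string format produced by `print_topic` concrete, here is a minimal sketch. It assumes a hypothetical `ToyTopicModel` class (not part of gensim) whose `show_topic` returns `(word, weight)` pairs; only the `print_topic` body is taken from the diff above, and the toy weights reuse values from the docstring example.

```python
class ToyTopicModel:
    """Hypothetical stand-in for a BaseTopicModel subclass (not part of gensim)."""

    def __init__(self, topics):
        # topics: one list of (word, weight) pairs per topic (toy data)
        self.topics = topics

    def show_topic(self, topicno, topn=10):
        # Assumption for this sketch: rank words by absolute weight.
        ranked = sorted(self.topics[topicno], key=lambda kv: abs(kv[1]), reverse=True)
        return ranked[:topn]

    def print_topic(self, topicno, topn=10):
        # Same expression as the method body shown in the diff.
        return ' + '.join(['%.3f*"%s"' % (v, k) for k, v in self.show_topic(topicno, topn)])


model = ToyTopicModel([[("category", -0.340), ("algebra", 0.183), ("$M$", 0.298)]])
formatted = model.print_topic(0, topn=2)
print(formatted)  # -0.340*"category" + 0.298*"$M$"
```

With this format string there are no spaces around `*`, so the output differs slightly from the spaced form (`-0.340 * "category"`) quoted in the docstring.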