Refactor API Reference for gensim.parsing. Fix piskvorky#1664 (piskvorky#1684)

* Added/fixed docstrings for strip_tags in preprocessing.py

* Added docstrings for strip_numeric, strip_non_alphanum & strip_multiple_whitespaces

* small fixes

* Added docstrings for split_alphanum, stem_text, need additional check for preprocess_string & preprocess_documents

* Fix for old docstrings and even more!

* Additional changes for preprocessing.py and some refactoring for porter.py

* Added references for functions + some common refactoring

* Added annotations for porter.py & preprocessing.py

* Fixes for annotations

* Refactoring for Attributes and Notes fields

* Reduced some extra large docstrings

* porter.py, function _ends: changed return type from (int) to (bool)

* small fix for sections

* Cleanup porter.py

* Resolve last review

* finish with porter, yay!

* Fix preprocessing

* small changes

* Fix review comments
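
The docstring work listed above can be pictured with a minimal sketch. This is not gensim's code: `strip_numeric_sketch` and `_DIGITS` are hypothetical stand-ins built only on the standard `re` module, and the sectioned docstring layout (Parameters, Returns, Examples) is an assumption about the style the commit message refers to when it mentions sections and Notes fields.

```python
import re

# Hypothetical stand-in for illustration only; not gensim's implementation.
_DIGITS = re.compile(r"[0-9]+")


def strip_numeric_sketch(s):
    """Remove runs of ASCII digits from `s`.

    Parameters
    ----------
    s : str
        Input text.

    Returns
    -------
    str
        Text with every run of digits removed.

    Examples
    --------
    >>> strip_numeric_sketch("0text24gensim365test")
    'textgensimtest'

    """
    # Replace each run of digits with the empty string.
    return _DIGITS.sub("", s)
```

Keeping a short example inside the docstring lets a tool such as `doctest` check that the documented behaviour still matches the code.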
CLearERR authored and KMarie1 committed Nov 26, 2017
1 parent 8d5515e commit f14b431
Showing 3 changed files with 541 additions and 105 deletions.
5 changes: 1 addition & 4 deletions gensim/parsing/__init__.py
@@ -1,8 +1,5 @@
-"""
-This package contains functions to preprocess raw text
-"""
+"""This package contains functions to preprocess raw text"""

# bring model classes directly into package namespace, to save some typing
from .porter import PorterStemmer # noqa:F401
from .preprocessing import (remove_stopwords, strip_punctuation, strip_punctuation2, # noqa:F401
strip_tags, strip_short, strip_numeric,