Refactor line2doc methods of LowCorpus and MalletCorpus #2269
docid, doclang, words = splited_line[0], splited_line[1], splited_line[2:]
split_line = utils.to_unicode(line).strip().split(None, 2)
docid, doclang = split_line[0], split_line[1]
words = split_line[2] if len(split_line) >= 3 else ''
Why >= and not ==? I ask because "If maxsplit is given, at most maxsplit splits are done".
Just a habit of writing more flexible code in case of future changes.
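For reference, a minimal sketch of the behaviour being discussed. `parse_line` is a hypothetical helper written for this illustration, not the actual `line2doc` method, and it leaves out `utils.to_unicode` so that it runs without gensim. It shows that `split(None, 2)` never returns more than three items, so with the current code `>= 3` and `== 3` behave the same; the `>=` form would only matter if `maxsplit` were raised later.

```python
def parse_line(line):
    """Hypothetical stand-in for the parsing step in the diff above."""
    # At most 2 splits are performed, so the result has at most 3 items.
    split_line = line.strip().split(None, 2)
    docid, doclang = split_line[0], split_line[1]
    # Guard against lines that contain only an id and a language.
    words = split_line[2] if len(split_line) >= 3 else ''
    return docid, doclang, words


if __name__ == "__main__":
    print(parse_line("doc1 en the quick brown fox"))
    print(parse_line("doc2 en"))
    # However many tokens follow, the split result is capped at 3 items.
    print(len("a b c d e".split(None, 2)))
```

Note that the remainder of the line comes back as a single string here, whereas the removed line produced a list via `splited_line[2:]`.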