Major Documentation Revamp #134

Merged on Mar 24, 2019 (9 commits)

Member

@Ayushk4 Ayushk4 commented Mar 15, 2019

This PR aims to improve the docs in the following ways:

  • Adding examples
  • Making them easier to understand
  • Improving the styling
  • Fixing typos and grammatical errors

I am also testing for possible errors in the package.

Progress

  • Documents
    • Creating documents
    • Basic functions for working with Documents
    • Document Metadata
    • Preprocessing
  • Corpus
    • Creating a Corpus
    • Standardizing a Corpus
    • Processing a Corpus
    • Corpus Statistics
    • Converting a DataFrame from a Corpus
    • Corpus Metadata
  • Features
    • Creating a Document Term Matrix
    • Creating Individual Rows of a DTM
    • Hash Trick
    • TF-IDF
  • Semantic
    • LSA
    • LDA

Ayushk4 changed the title from "[WIP] Major Documentation Revamps" to "[WIP] Major Documentation Revamp" on Mar 16, 2019
Member Author

Ayushk4 commented Mar 16, 2019

I am also replacing the deprecated functions in the documentation. So far, I have finished documents.md.

Member Author

Ayushk4 commented Mar 17, 2019

@aviks I was going through the docs and found that names (used for setting metadata: the names of the documents in a corpus) conflicts with names in Base; refer to (v0.7) and (v1). What do you suggest? Should I open an issue?

Member

aviks commented Mar 17, 2019 via email

Member Author

Ayushk4 commented Mar 18, 2019

A correction: author/authors is already used for the document's author. Should I rename the metadata field containing the document name to documentName? Also, I am sending a separate PR for this.

Member Author

Ayushk4 commented Mar 18, 2019

The metadata functions are: names, for setting the documents' names, and authors, for setting the authors' names.

These are also missing from the documentation on accessing corpus metadata; I am adding them.


zgornel commented Mar 18, 2019

  • At this point, Documenter.jl seems to be a better choice, since all code snippets are (re-)evaluated whenever the branch is modified and therefore reflect changes in the code. Is there any reason for not using it?

  • A more flexible approach to the metadata field names is to allow a metadata type hierarchy (i.e. define an AbstractMetadata type): https://zgornel.github.io/StringAnalysis.jl/dev/doc_extensions/

Member Author

Ayushk4 commented Mar 18, 2019

Member Author

Ayushk4 commented Mar 18, 2019

I think we could start by opening an issue listing the major changes to be made to TextAnalysis.jl, taking into account the points raised in the conversation above.
Thank you for your review of the situation.

Member Author

Ayushk4 commented Mar 18, 2019

@aviks Could you please share your views on this so that I can proceed accordingly?

Two review threads on docs/src/corpus.md (outdated, resolved)
Member

aviks commented Mar 18, 2019

Thanks Ayush, this looks good. Let me know once this is ready to merge. I'm happy to merge things over from StringAnalysis if Corneliu is ok with that.


zgornel commented Mar 18, 2019

@aviks I'm perfectly fine with this; feel free to port to TextAnalysis whatever you see fit

Member

aviks commented Mar 18, 2019 via email

Member Author

Ayushk4 commented Mar 20, 2019

@aviks I am almost done with the PR. All that remains is for #135 to be merged so that I can resolve any merge conflicts that arise.

Also, lsa() is not working. I will get more details about it and file a detailed issue.

Member Author

Ayushk4 commented Mar 24, 2019

@aviks Do you think BM25 and a co-occurrence matrix would be good additions to TextAnalysis.jl?

aviks merged commit 02ab2cd into JuliaText:master on Mar 24, 2019
Member

aviks commented Mar 24, 2019

> @aviks Do you think BM25 and a co-occurrence matrix would be good additions to TextAnalysis.jl?

Yes please!
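
For readers unfamiliar with the two features proposed above, here is a minimal, language-agnostic sketch in Python of Okapi BM25 scoring and windowed co-occurrence counting. It is not TextAnalysis.jl code and says nothing about how the package does or would implement either feature. The function names `bm25_scores` and `cooccurrence_counts`, the defaults `k1=1.5` and `b=0.75`, the smoothed IDF variant, and the symmetric forward-looking window are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document in `docs` against a tokenized `query`.

    Hypothetical helper: `docs` is a non-empty list of token lists.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for d in docs for term in set(d))

    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Smoothed IDF; the "+ 1.0" inside the log keeps it non-negative.
            idf = math.log(
                (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1.0
            )
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def cooccurrence_counts(tokens, window=2):
    """Count unordered token pairs that appear within `window` positions."""
    counts = Counter()
    for i, left in enumerate(tokens):
        # Look only forward so each pair of positions is counted once.
        for right in tokens[i + 1 : i + 1 + window]:
            if left != right:
                counts[tuple(sorted((left, right)))] += 1
    return counts
```

With these definitions, `bm25_scores(["text"], [["text", "analysis"], ["topic", "model"]])` gives a positive score for the first document and `0.0` for the second, and `cooccurrence_counts(["a", "b", "c"], window=1)` counts the pairs `("a", "b")` and `("b", "c")` once each.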

Ayushk4 changed the title from "[WIP] Major Documentation Revamp" to "Major Documentation Revamp" on May 18, 2019
3 participants