BM25, Co-occurrence Matrix, faster ROUGE, Fixing LSA. #165

Merged: 18 commits, Aug 10, 2019

Ayushk4
Member

@Ayushk4 Ayushk4 commented Jul 11, 2019

I am porting several implementations from StringAnalysis.jl and fixing some existing ones.

  • Co-Occurrence Matrix

    • Code
    • Test
    • Docstrings
    • Documentation
  • BM25

    • Code
    • Test
    • Docstrings
    • Documentation
  • Speeding up Rouge.jl

  • Docstrings and Docs for Evaluation Metrics (Rouge)

  • Fixing lsa

  • Docs and tests for lsa

As per the discussion in #164, I prefer to port the co-occurrence matrix (COOM) from StringAnalysis.jl, for the advantages discussed there.
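For context, here is a minimal sketch of what a co-occurrence matrix counts. It is written in Python for illustration only and is not the StringAnalysis.jl code being ported; the function name `cooccurrence_counts`, the `window` parameter, and the unweighted counting are assumptions made for this example.

```python
from collections import defaultdict


def cooccurrence_counts(tokens, window=2):
    """Count symmetric co-occurrences of tokens at most `window` positions apart.

    Returns (vocab, counts): `vocab` maps each token to a row/column index,
    and `counts` maps (token_a, token_b) pairs to the number of times they
    co-occur. This is a sparse representation of the matrix.
    """
    # Assign indices in order of first appearance.
    vocab = {}
    for tok in tokens:
        vocab.setdefault(tok, len(vocab))

    counts = defaultdict(int)
    for i, tok in enumerate(tokens):
        # Look only to the right; symmetry is added explicitly below.
        for j in range(i + 1, min(i + window + 1, len(tokens))):
            other = tokens[j]
            counts[(tok, other)] += 1
            counts[(other, tok)] += 1
    return vocab, dict(counts)


vocab, counts = cooccurrence_counts("the cat sat on the mat".split(), window=1)
```

Storing only the non-zero pairs keeps memory proportional to the number of observed co-occurrences instead of the square of the vocabulary size.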

There also seem to be performance bottlenecks in rouge.jl caused by abstractly typed containers; this needs work as well.

@Ayushk4 Ayushk4 changed the title Stringanalysis patch Porting from StringAnalysis.jl [WIP[ Jul 11, 2019
@Ayushk4 Ayushk4 changed the title Porting from StringAnalysis.jl [WIP[ Porting from StringAnalysis.jl [WIP] Jul 11, 2019
@Ayushk4 Ayushk4 changed the title Porting from StringAnalysis.jl [WIP] BM25, Co-occurrence Matrix, faster ROUGE, Fixing LSA. Jul 13, 2019
@Ayushk4
Member Author

Ayushk4 commented Jul 13, 2019

I have ported BM25 and the co-occurrence matrix from StringAnalysis.jl. The co-occurrence matrix runs 10-15x faster than the one in #164, uses less space, and supports operations over the Document and Corpus types.
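As a reminder of what BM25 computes, below is a minimal Python sketch of the textbook Okapi BM25 ranking formula. It is not the function ported in this PR and may differ from it in form; the name `bm25_scores`, the defaults `k1=1.5` and `b=0.75`, and the `log(1 + ...)` IDF variant are assumptions chosen for the example.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document in `docs` against the tokenized `query`.

    Assumes `docs` is a non-empty list of token lists.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs

    # Document frequency: number of documents containing each term.
    df = Counter()
    for d in docs:
        df.update(set(d))

    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Non-negative IDF variant (an assumption of this sketch).
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            denom = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / denom
        scores.append(score)
    return scores


docs = [["cat", "sat"], ["dog", "barked", "loudly"], ["cat", "cat", "purred"]]
scores = bm25_scores(["cat"], docs)
```

Here `k1` controls how quickly repeated occurrences of a term stop adding to the score, and `b` controls how strongly longer documents are penalized.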

LSA has been fixed. ROUGE-N has been re-implemented: it now supports languages and shows a 15-20% improvement in speed and memory use.
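For readers unfamiliar with the metric, the following is a minimal Python sketch of ROUGE-N in its recall form (clipped n-gram overlap divided by the number of reference n-grams). It is illustrative only and does not reproduce the rouge.jl implementation or its language handling; the function names `ngrams` and `rouge_n_recall` are hypothetical.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in `tokens`."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(reference, candidate, n=1):
    """Fraction of reference n-grams that also appear in the candidate.

    Matches are clipped: an n-gram counts at most as often as it occurs
    in the candidate.
    """
    ref = ngrams(reference, n)
    cand = ngrams(candidate, n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total


reference = "the cat sat on the mat".split()
candidate = "the cat lay on the mat".split()
unigram_recall = rouge_n_recall(reference, candidate, n=1)
bigram_recall = rouge_n_recall(reference, candidate, n=2)
```

Using concretely typed counters keyed by n-gram tuples is also the kind of structure that avoids the overhead of loosely typed containers, although the actual changes made in rouge.jl are not shown here.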

Tests, docstrings, and online documentation have been added for all of these.

@aviks, please review.

@aviks
Member

aviks commented Aug 10, 2019

I've fixed the merge conflicts and added explicit license attribution to zgornel in coom.jl.

@aviks aviks merged commit 32e8789 into JuliaText:master Aug 10, 2019