Test and refactor WikiCorpus #1821

Merged: 9 commits, Jan 11, 2018

Commits on Dec 28, 2017

  1. 574134e
  2. Created test data in legitimate compressed XML format (.xml.bz2) for the WikiCorpus class.

    * Used the same raw data found for other sources (9 articles).

    * Added various wiki markup to test the parsing regular expressions.
    steremma committed Dec 28, 2017
    952e8d5
  3. Added test class for the WikiCorpus source.

    * Follows the same inheritance schema as the source: TestWikiCorpus > TestTextCorpus > CorpusTestCase.

    * Testing methods are overridden where necessary to reflect logic changes.

    * All existing functionality is tested (markup handling, minimum article length, etc.).
    steremma committed Dec 28, 2017
    7ddce6c
  4. 836c3c2
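
Commit 952e8d5 adds a compressed XML fixture with wiki markup. As a rough illustration of what building such a fixture involves, the sketch below writes a tiny bz2-compressed XML file and reads it back with the Python standard library only. The `SAMPLE_DUMP` content, its simplified `<mediawiki>` layout, and the helper names `write_bz2_fixture` and `read_titles` are assumptions made for illustration; they are not the test data or code from this pull request.

```python
import bz2
import os
import tempfile
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for a wiki dump. A real MediaWiki export
# uses XML namespaces and carries many more fields per page.
SAMPLE_DUMP = """<mediawiki>
  <page>
    <title>Anarchism</title>
    <ns>0</ns>
    <revision>
      <text>'''Anarchism''' is a [[political philosophy]] {{citation needed}}.</text>
    </revision>
  </page>
</mediawiki>
"""


def write_bz2_fixture(path, xml_text=SAMPLE_DUMP):
    """Write xml_text to path as bz2-compressed UTF-8 text."""
    with bz2.open(path, "wt", encoding="utf-8") as handle:
        handle.write(xml_text)


def read_titles(path):
    """Decompress the fixture, parse it as XML, and return all page titles."""
    with bz2.open(path, "rb") as handle:
        root = ET.parse(handle).getroot()
    return [page.findtext("title") for page in root.iter("page")]


fixture = os.path.join(tempfile.mkdtemp(), "sample.xml.bz2")
write_bz2_fixture(fixture)
print(read_titles(fixture))
```

Reading the file back through an XML parser is a cheap sanity check that the fixture is well-formed before any corpus class consumes it.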

Commits on Dec 30, 2017

  1. code review corrections

    steremma committed Dec 30, 2017
    43a48f5
  2. Moved WikiCorpus tests from test/test_wikicorpus.py into its own class within the test_corpora.py file.
    
    * Adapted all old tests to the new class
    
    * Current Test class schema ensures that WikiCorpus also passes tests defined in parents
    
    * Deleted test_wikicorpus.py since it is now redundant
    steremma committed Dec 30, 2017
    8b7a1d5
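
The commits above rely on test-class inheritance: a subclass of a test case also runs every test defined in its parents, and overrides only the tests whose expected behaviour differs. The sketch below shows that pattern with `unittest`. All names here (`ToyCorpus`, `ToyMarkupCorpus`, `BaseCorpusTests`, `PlainCorpusTests`, `MarkupCorpusTests`) are hypothetical stand-ins that mirror the three-level chain described in the commit messages; this is not the code from test_corpora.py.

```python
import unittest


class ToyCorpus:
    """Hypothetical corpus: yields one list of tokens per document."""

    def __init__(self, docs):
        self.docs = docs

    def __iter__(self):
        for doc in self.docs:
            yield doc.split()

    def __len__(self):
        return len(self.docs)


class ToyMarkupCorpus(ToyCorpus):
    """Hypothetical corpus that strips [[link]] brackets before tokenizing."""

    def __iter__(self):
        for doc in self.docs:
            yield doc.replace("[[", "").replace("]]", "").split()


class BaseCorpusTests(unittest.TestCase):
    corpus_class = None  # set by concrete subclasses
    docs = ["human interface", "graph minors survey"]

    def setUp(self):
        if self.corpus_class is None:
            self.skipTest("abstract base class")
        self.corpus = self.corpus_class(self.docs)

    def test_len(self):
        self.assertEqual(len(self.corpus), len(self.docs))

    def test_yields_token_lists(self):
        for tokens in self.corpus:
            self.assertIsInstance(tokens, list)


class PlainCorpusTests(BaseCorpusTests):
    corpus_class = ToyCorpus

    def test_first_document(self):
        self.assertEqual(next(iter(self.corpus)), ["human", "interface"])


class MarkupCorpusTests(PlainCorpusTests):
    corpus_class = ToyMarkupCorpus
    docs = ["[[human]] interface", "graph minors survey"]

    # Overridden because the input now contains markup that must be stripped.
    def test_first_document(self):
        self.assertEqual(next(iter(self.corpus)), ["human", "interface"])
```

Loading `MarkupCorpusTests` runs three tests: the two inherited from `BaseCorpusTests` plus the overridden `test_first_document`. Tests that live in a separate module miss the inherited checks, which is one reason to keep the most specific class in the same hierarchy as its parents.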

Commits on Jan 10, 2018

  1. eeea748
  2. Discarded the empty input test for the WikiCorpus since an empty file is not legitimate XML
    steremma committed Jan 10, 2018
    b5976c4
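
The reasoning in commit b5976c4 can be checked independently of WikiCorpus: an XML document needs at least a root element, so empty input is rejected by a standard parser. The sketch below demonstrates this with Python's `xml.etree.ElementTree`. The helper names `write_bz2` and `is_well_formed_xml_bz2` are hypothetical, and the sketch says nothing about how WikiCorpus itself reacts to such a file.

```python
import bz2
import os
import tempfile
import xml.etree.ElementTree as ET


def write_bz2(path, text):
    """Write text to path as bz2-compressed UTF-8."""
    with bz2.open(path, "wt", encoding="utf-8") as handle:
        handle.write(text)


def is_well_formed_xml_bz2(path):
    """Return True if the decompressed content parses as XML."""
    try:
        with bz2.open(path, "rb") as handle:
            ET.parse(handle)
    except ET.ParseError:
        return False
    return True


workdir = tempfile.mkdtemp()
empty_path = os.path.join(workdir, "empty.xml.bz2")
minimal_path = os.path.join(workdir, "minimal.xml.bz2")

write_bz2(empty_path, "")                            # no root element
write_bz2(minimal_path, "<mediawiki></mediawiki>")   # smallest valid document

print(is_well_formed_xml_bz2(empty_path))    # False
print(is_well_formed_xml_bz2(minimal_path))  # True
```

A generic "empty input" test inherited from a text-corpus parent therefore does not map cleanly onto an XML-based corpus; the closest equivalent would be a document with a root element and no pages.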

Commits on Jan 11, 2018

  1. Added 2 more tests

    steremma committed Jan 11, 2018
    78f2870