Tests passing, merge with devel_python3 #4

troycomi · 2019-04-09T18:21:21Z

Had to revert some argument parsing to get the unit tests passing faster. Will work on replacing arg parsing with click/yaml.

Filter 1 runs and matches the original output, but is ~10x slower, likely due to seeking through the file.

Changed the implementation of Region_Reader to yield headers and seqs. This manages to cut the memory (somehow from 60 to 1 MB) and the runtime from 2 minutes to 10 seconds (last commit at 13 minutes).
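A minimal sketch of the yield-based reading pattern, not the actual Region_Reader. The file layout (a header line starting with `#`, followed by that region's sequence lines) and the name `read_regions` are assumptions for illustration only. Only one region is held at a time, which is the likely reason memory stays flat.

```python
import io
from typing import Iterator, List, TextIO, Tuple


def read_regions(handle: TextIO) -> Iterator[Tuple[str, List[str]]]:
    """Yield (header, seqs) one region at a time instead of loading the file.

    Assumed layout (hypothetical): a header line starting with '#',
    followed by the sequence lines that belong to that region.
    """
    header = None
    seqs: List[str] = []
    for line in handle:
        line = line.rstrip("\n")
        if line.startswith("#"):
            # A new header closes the previous region, so hand it out now.
            if header is not None:
                yield header, seqs
            header, seqs = line[1:].strip(), []
        elif line and header is not None:
            seqs.append(line)
    # Emit the final region once the file is exhausted.
    if header is not None:
        yield header, seqs


# Tiny in-memory stand-in for a region file.
example = io.StringIO("#r1\nACGT\nAC-T\n#r2\nGGCC\n")
regions = list(read_regions(example))
```

The same function should work on a handle from `gzip.open(path, "rt")`, since it only iterates over lines.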

Changed filter2 to use numpy and the new yielded regions. Execution is ~10s and uses 1.5 MB memory. In addition to floating point precision differences, noticed a difference in the sorting of alt ids when the values are equal going from Python 2 to 3. Handled the differences in the comparison scripts, which format to 10 digits and sort ids when values are equal.
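Rough sketch of the kind of normalization the comparison scripts could apply before diffing the old and new output. The function name, the (id, value) layout, the reading of "10 digits" as 10 significant digits, and the descending sort are all assumptions; the real scripts may differ.

```python
from typing import List, Tuple


def normalize(pairs: List[Tuple[str, float]]) -> List[Tuple[str, str]]:
    """Format values to 10 significant digits, then order ties by id."""
    formatted = [(name, f"{value:.10g}") for name, value in pairs]
    # Sort by descending value; the id breaks ties so order is deterministic.
    return sorted(formatted, key=lambda p: (-float(p[1]), p[0]))


# Same data as two runs might report it: tiny float noise, different tie order.
run_a = [("b", 0.30000000000000004), ("a", 0.3), ("c", 0.1)]
run_b = [("a", 0.3), ("b", 0.3), ("c", 0.1)]

same = normalize(run_a) == normalize(run_b)
```

After formatting, the float noise disappears and the two tied ids always come out in the same order, so a plain text diff of the two outputs is meaningful.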

Changed operation of threshold scan to limit number of read throughs of the region files.
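One way to limit read-throughs, sketched under assumptions: read each region once and test it against every threshold in the inner loop, rather than re-reading the region file per threshold. The (region id, score) layout and the `score >= threshold` rule are placeholders, not the actual filter criteria.

```python
from typing import Dict, Iterable, List, Tuple


def scan_thresholds(
    regions: Iterable[Tuple[str, float]], thresholds: List[float]
) -> Dict[float, int]:
    """Count regions passing each threshold while reading regions once."""
    passed = {t: 0 for t in thresholds}
    for _region_id, score in regions:  # single pass, so a generator works
        for t in thresholds:
            if score >= t:
                passed[t] += 1
    return passed


# An iterator can only be consumed once, which mimics a single file read.
scored = iter([("r1", 0.99), ("r2", 0.95), ("r3", 0.80)])
counts = scan_thresholds(scored, [0.9, 0.98])
```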

Limited changes to summarize strain, as the original method was fairly fast.

Conflicts:
code/analyze/filter_1_main.py
code/analyze/filter_2_main.py
code/analyze/filter_helpers.py
code/analyze/id_regions_main.py
code/analyze/predict.py
code/analyze/predict_main.py
code/analyze/summarize_region_quality.py

Unit tests running, but failing on new implementation of args/gp

Mocked and encoded constants enough to get things passing

Added a yaml version of the config to replace global params and setup args files. Helper script clean_config performs lookups of referenced entries in config. Have not added to main methods yet.
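Illustrative sketch of resolving referenced entries in a loaded config. The `${key}` reference syntax and the names `resolve_references` and `_lookup` are hypothetical and not taken from clean_config. A plain dict stands in for the result of loading the yaml file.

```python
import re
from typing import Any, Dict

# Hypothetical reference syntax: ${key} points at a top-level string entry.
REFERENCE = re.compile(r"\$\{(\w+)\}")


def _lookup(match: "re.Match[str]", top: Dict[str, Any]) -> str:
    target = top.get(match.group(1))
    # Unknown or non-string targets are left untouched.
    return target if isinstance(target, str) else match.group(0)


def resolve_references(value: Any, top: Dict[str, Any]) -> Any:
    """Recursively replace ${key} with the matching top-level entry."""
    if isinstance(value, dict):
        return {k: resolve_references(v, top) for k, v in value.items()}
    if isinstance(value, list):
        return [resolve_references(v, top) for v in value]
    if isinstance(value, str):
        # Bounded repeats let references to references resolve without
        # looping forever on a cycle.
        for _ in range(10):
            updated = REFERENCE.sub(lambda m: _lookup(m, top), value)
            if updated == value:
                break
            value = updated
    return value


config = {
    "root": "/data",
    "analysis": "${root}/analysis",
    "paths": {"regions": "${analysis}/regions.txt.gz"},
}
cleaned = resolve_references(config, config)
```

Resolving everything up front means the main methods would only ever see final paths.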

Started moving predict methods into a Predictor class to clean up the long argument lists for performing a prediction. Need to test the new code and get the old tests passing again with a minimal object.

Changed the implementation of predict_main.py into click with support for the new yaml configuration file. Refactored predict into two main objects to simplify the main code. Added a README.md

Modified behavior with missing files to match previous implementation (continue). Currently matching original implementation on chromosome 1.

Added log file option and config. When set, a progress bar is displayed on the console with click.

Added class for adding region ids and integrated with the main click method. Created new class to hold configuration and handle setting logic, as that was heavily reused between main methods.

Moved the validation code to the corresponding main objects. Seeing that positions is required for summarize, went back and made that required.

Refactored summarize region quality main and supporting module with cleaner implementation. Added to main click method and all supporting unit tests.

Combined two filter steps, helper functions, and threshold sweep into a single file for further refactoring.

Continued refactoring of main methods, moving on to filtering. Part of the changes saw a modification to the configuration object to simplify setting code into a more uniform interface. Have started checking flake8 on the entire project, fixing issues occasionally.

Finished refactor and testing of summarize_strain_states. Finished formatting code consistent with flake8.

troycomi added 30 commits March 20, 2019 08:46

Filter tests migrated to python 3

00700bb

Filter 1 working with single gz file using seek

4fe85ec

Filter 1 running, matching output

1a94214

But is ~10x slower, likely due to seeking through the file

Filter 1 with region yield

f77e1ef

Changed the implementation of Region_Reader to yield headers and seqs. This manages to cut the memory (somehow from 60 to 1 MB) and the runtime from 2 minutes to 10 seconds (last commit at 13 minutes).

Adding docstrings and return types

a1c6155

Filter 2 thresholds refactor

8844007

Changed operation of threshold scan to limit number of read throughs of the region files.

Summarize Strain under test, refactored

7c82f95

Limited changes as original method was fairly fast

Working on documentation

3267582

Suppress log warnings

58634eb

Tests passing

96d6cfb

Mocked and encoded constants enough to get things passing

Config yaml

4be5be7

Added a yaml version of the config to replace global params and setup args files. Helper script clean_config performs lookups of referenced entries in config. Have not added to main methods yet.

Working on predict main

9102a16

Started Predict refactor

0ee0ecf

Started moving predict methods into a Predictor class to clean up the long argument lists for performing a prediction. Need to test the new code and get the old tests passing again with a minimal object.

Translated predict_main

dfc0846

Changed the implementation of predict_main.py into click with support for the new yaml configuration file. Refactored predict into two main objects to simplify the main code. Added a README.md

Fixed readme

85eaac8

Tested predict main

29ae638

Modified behavior with missing files to match previous implementation (continue). Currently matching original implementation on chromosome 1.

Log file, progress bar

9438339

Added log file option and config. When set, a progress bar is displayed on the console with click.

Refactor id_regions

3fef252

Added class for adding region ids and integrated with the main click method. Created new class to hold configuration and handle setting logic, as that was heavily reused between main methods.

Moved validate code, required position

1bd8cf2

Moved the validation code to the corresponding main objects. Seeing that positions is required for summarize, went back and made that required.

Refactor summarize region quality

c42b6fd

Refactored summarize region quality main and supporting module with cleaner implementation. Added to main click method and all supporting unit tests.

Combined filter methods

1ac85ed

Combined two filter steps, helper functions, and threshold sweep into a single file for further refactoring.

Filter Regions Refactor

2667a6a

Continued refactoring of main methods, moving on to filtering. Part of the changes saw a modification to the configuration object to simplify setting code into a more uniform interface. Have started checking flake8 on the entire project, fixing issues occasionally.

Flake8 passing, summarize_strains refactored

b00140a

Finished refactor and testing of summarize_strain_states. Finished formatting code consistent with flake8.

Updated readme, removed unused argument parser

21c52ef

Update readme

df69936

Working on setting up travisci and codecov

d113279

Travis debugging

8b266e5

Travis debugging

deaea99

troycomi added 4 commits June 10, 2019 10:34

Travis debugging

000385d

Travis debugging

a56b5a0

Travis debugging

1cfd268

Badges in readme

3c2fa43
