switch to using gather exclusively, away from LCA methods. #110

Merged 36 commits on Aug 26, 2020

Commits on Jun 1, 2020

  1. a1cf40e

Commits on Jun 8, 2020

  1. 1464a50

Commits on Jun 9, 2020

  1. more/better reporting

    ctb committed Jun 9, 2020
    6bec2d1

Commits on Jul 9, 2020

  1. 5dfbe74
  2. 6bb1b38

Commits on Jul 11, 2020

  1. 5eccf44
  2. 14b2a49
  3. 412e48c
  4. do clever gather things

    ctb committed Jul 11, 2020
    bff2d04

Commits on Jul 12, 2020

  1. improve report.txt output

    ctb committed Jul 12, 2020
    8a0ea99
  2. cleanup on aisle 2

    ctb committed Jul 12, 2020
    ef02e7e
  3. remove unused bad_hashes

    ctb committed Jul 12, 2020
    94c2c8a
  4. move gather_at_rank to utils

    ctb committed Jul 12, 2020
    e57949e
  5. 189f33c
  6. try evaluating GTDB contamination (#122)

    * add explicit moltype to config
    
    * add protein conf file
    
    * first full successful run
    
    * add gather_scaled to do faster searches
    
    * adjust thresholds for protein
    
    * use search rather than gather so we can look at GTDB 25k genomes
    
    * upd
    
    * refactor exact removal a bit
    
    * improve reporting
    
    * improve reporting
    
    * silence lineages file warnings
    
    * add gtdb conf
    
    * require provided lineage to remove exact match
    
    * flag identical matches removed
    
    * clean up reporting etc
    ctb authored Jul 12, 2020
    f823001

Commits on Jul 15, 2020

  1. adjust lineages to just eukaryota

    ctb committed Jul 15, 2020
    99dd1b2
  2. sourmash 3.4, yay

    ctb committed Jul 15, 2020
    4f7b36d

Commits on Jul 17, 2020

  1. simple post-summary script

    ctb committed Jul 17, 2020
    44d6089

Commits on Aug 22, 2020

  1. Add snakemake tests that actually run snakemake (#129)

    * initial attempt to add snakemake tests
    
    * snakemake tests working
    
    * add test on protein side, too
    ctb authored Aug 22, 2020
    ad040a5

Commits on Aug 23, 2020

  1. misc

    ctb committed Aug 23, 2020
    adf410f
  2. 216bbe9

Commits on Aug 25, 2020

  1. Refactor whole genome gather into a multistage pipeline (#130)

    * attempted refactor try 1
    
    * look at many genomes at once
    
    * tentatively working?
    
    * start generating summaries
    
    * do progress elimination/reporting
    
    * add summarize rule
    
    * add provided lin to ibd2
    
    * output CSV
    
    * recover comment
    
    * add f_major and f_ident to output
    
    * update for tara-delmont
    
    * remove lineage split
    
    * get protein working
    
    * fix provided lineage bug
    
    * fix self-matching
    
    * fix snakemake tests
    
    * write actual genome cleaning code
    
    * make min_f_ident and min_f_major configurable
    
    * swizzle f_match and f_ident rows to later; sort output by total_bad_bp
    
    * clean up imports a bit
    
    * 'fix' tests by commenting many of them out
    
    * pyflakes cleanup
    
    * refactor and cleanup
    
    * refactor and cleanup
    
    * refactor and cleanup
    
    * require genus-level match rank, for now.
    
    * eliminate match_rank as a configurable parameter
    
    * make use of gather threshold in contig classification
    
    * more refactoring and cleanup
    
    * rescue contigs that still have significant matches to correct lineage
    ctb authored Aug 25, 2020
    fa22536
  2. move contigs loading into utils

    ctb committed Aug 25, 2020
    f1686e0
  3. move contigs loading into utils

    ctb committed Aug 25, 2020
    72c1d5d

Commits on Aug 26, 2020

  1. fix mistake around 'found'

    ctb committed Aug 26, 2020
    f2fd307
  2. misc cleanup of Snakefile

    ctb committed Aug 26, 2020
    6b57827
  3. 670af49
  4. 00b3854
  5. c938dcb
  6. typo

    ctb committed Aug 26, 2020
    d9e6d67
  7. pyflakes cleanup

    ctb committed Aug 26, 2020
    9714527
  8. 90951e2
  9. a77bb2e
  10. minor cleanup

    ctb committed Aug 26, 2020
    0d0e096
  11. revamp docs a bit

    ctb committed Aug 26, 2020
    3d391b1
  12. a bit more info

    ctb committed Aug 26, 2020
    b0756ea