sourmash 3.x protein ksizes must be divisible by 3 #1019

Closed
24 commits
cfec037
start adding moltype into LCA DB construction
ctb Jun 5, 2020
c7fb21f
save/load moltype in LCA database JSON
ctb Jun 6, 2020
c158531
support all the moltypes
ctb Jun 6, 2020
c91df57
fix test <shakes head>
ctb Jun 6, 2020
ee0d0d9
cleaned up reporting; fixed used-lineage tracking logic
ctb Jun 6, 2020
b55a4b6
test output ksize/scaled/moltype
ctb Jun 6, 2020
7d4c535
rename bug 781 sig filename
ctb Jun 6, 2020
c479a37
fix sourmash sig describe moltype output
ctb Jun 6, 2020
4391f50
remove unnecessary tempdir use
ctb Jun 6, 2020
0a931c1
LCA_Database.insert now returns number of hashes inserted
ctb Jun 6, 2020
e7275b1
test lca database creation, search, etc. for protein, hp, dayhoff
ctb Jun 6, 2020
c1f1f17
command line tests for lca indexing with protein, hp, dayhoff
ctb Jun 6, 2020
81d67b8
bump LCA database version number
ctb Jun 6, 2020
045d627
test command line search and gather of prot/hp/dayhoff LCA databases
ctb Jun 6, 2020
21ddfc2
remove requirement for --no-dna lca index
ctb Jun 6, 2020
32c90e9
refactor calculate_moltype and test
ctb Jun 6, 2020
5c33231
add sbt search/index tests for protein/hp/dayhoff
ctb Jun 6, 2020
bd1b320
add full tests for sig describe on hp, dayhoff, and protein
ctb Jun 6, 2020
e9d333f
test behavior of lca_db.insert better
ctb Jun 6, 2020
ed36010
resolved comment (I did a bad copy paste)
ctb Jun 6, 2020
3ec04b5
protein ksizes must be divisible by 3
bluegenes Jun 9, 2020
49eb2a5
Merge branch 'master' into div-3
ctb Jun 14, 2020
a58564c
change test to assert failure with bad ksize
bluegenes Jun 15, 2020
9c46e9c
use status instead of presence of file
bluegenes Jun 15, 2020
6 changes: 3 additions & 3 deletions sourmash/command_compute.py
@@ -57,7 +57,7 @@ def compute(args):
if args.num_hashes != 0:
notify('setting num_hashes to 0 because --scaled is set')
args.num_hashes = 0

notify('computing signatures for files: {}', ", ".join(args.filenames))

if args.randomize:
@@ -95,7 +95,7 @@ def compute(args):
'signatures.')
num_sigs = len(ksizes)

-if (args.protein or args.dayhoff or args.hp) and not args.input_is_protein:
+if (args.protein or args.dayhoff or args.hp):
bad_ksizes = [ str(k) for k in ksizes if k % 3 != 0 ]
if bad_ksizes:
error('protein ksizes must be divisible by 3, sorry!')
@@ -236,7 +236,7 @@ def _compute_individual(args):
siglist = []

assert not siglist # juuuust checking.


def _compute_merged(args):
# make minhashes for the whole file
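The ksize check in the hunk above can be tried in isolation. The following is a minimal sketch, not sourmash's code: the helper names `find_bad_protein_ksizes` and `check_protein_ksizes` are hypothetical, and only the list comprehension and the error text are taken from the diff. It illustrates the effect of dropping the `and not args.input_is_protein` condition: the check depends only on whether a protein-style moltype (protein, dayhoff, or hp) was requested, not on the type of the input.

```python
def find_bad_protein_ksizes(ksizes):
    """Return the ksizes (as strings) that are not divisible by 3.

    Mirrors the list comprehension shown in the diff.
    """
    return [str(k) for k in ksizes if k % 3 != 0]


def check_protein_ksizes(ksizes, wants_protein_moltype):
    """Return an error message, or None if the ksizes are acceptable.

    `wants_protein_moltype` is a hypothetical stand-in for
    `args.protein or args.dayhoff or args.hp` in the diff.
    """
    if not wants_protein_moltype:
        # Nucleotide-only signatures are not subject to the check.
        return None
    bad_ksizes = find_bad_protein_ksizes(ksizes)
    if bad_ksizes:
        return 'protein ksizes must be divisible by 3, sorry!'
    return None


if __name__ == '__main__':
    print(find_bad_protein_ksizes([21, 30, 31]))
    print(check_protein_ksizes([21, 30, 31], wants_protein_moltype=True))
```

With ksizes 21, 30 and 31, only 31 is reported as bad, because 21 and 30 are both multiples of 3.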
5 changes: 3 additions & 2 deletions tests/test_sourmash_compute.py
@@ -576,7 +576,7 @@ def test_do_sourmash_compute_multik_only_protein(c):
assert 30 in ksizes


-def test_do_sourmash_compute_multik_protein_input_non_div3_ksize():
+def test_do_sourmash_compute_multik_protein_input_bad_ksize():
with utils.TempDirectory() as location:
testdata1 = utils.get_test_data('short-protein.fa')
status, out, err = utils.runscript('sourmash',
@@ -587,7 +587,8 @@ def test_do_sourmash_compute_multik_protein_input_non_div3_ksize():
in_directory=location,
fail_ok=True)
outfile = os.path.join(location, 'short-protein.fa.sig')
-assert os.path.exists(outfile)
+assert status != 0
+assert 'protein ksizes must be divisible by 3' in err


@utils.in_tempdir
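The updated test above asserts on the exit status and the error text instead of on whether an output file exists. A self-contained sketch of that pattern follows. It does not call sourmash: `FAKE_CLI` is a hypothetical stand-in script that fails on a ksize not divisible by 3, and `run_fake_cli` is an assumed helper that returns `(status, out, err)` in the same shape the test unpacks from `utils.runscript`.

```python
import subprocess
import sys

# Hypothetical stand-in for the real command: exits non-zero and writes
# to stderr when any requested ksize is not divisible by 3.
FAKE_CLI = r"""
import sys
ksizes = [int(k) for k in sys.argv[1].split(',')]
if any(k % 3 != 0 for k in ksizes):
    sys.stderr.write('protein ksizes must be divisible by 3, sorry!\n')
    sys.exit(1)
"""


def run_fake_cli(ksize_arg):
    """Run the stand-in script and return (status, out, err)."""
    proc = subprocess.run(
        [sys.executable, '-c', FAKE_CLI, ksize_arg],
        capture_output=True,
        text=True,
    )
    return proc.returncode, proc.stdout, proc.stderr


def test_bad_ksize_fails():
    status, out, err = run_fake_cli('20,30')
    # Check the failure directly rather than inferring it from side effects.
    assert status != 0
    assert 'protein ksizes must be divisible by 3' in err


if __name__ == '__main__':
    test_bad_ksize_fails()
```

Checking the status and message ties the test to the behavior being changed; a file-existence check can pass or fail for unrelated reasons.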