mashmap does not give results if input sequences are too short #31

Open
b-brankovics opened this issue Oct 1, 2020 · 1 comment

@b-brankovics

Dear Developers,

I am using mashmap to mine homologous regions for my reference genes from genomes, and I have encountered a bug in the program. If one or both of the input sequences are shorter than a specific length, the program appears to run and exits with exit code 0, but does not produce any results.

When using the following command:

mashmap -s 500 -r ref.fas -q target.fas -o output.mash

ref.fas had to be at least 16100 bp long and target.fas had to be at least 510 bp long; otherwise the output was empty.

For me it would already be great if mashmap returned an exit code other than 0 in this case, because then I would know that it failed because of the input requirements, and not because there are no homologous sequences.
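Until mashmap reports this case itself, a caller-side check before the run is one possible workaround. The following is a minimal sketch, not part of mashmap: the function names `fasta_lengths` and `short_records` are hypothetical, and the minimum lengths to test against (for example the 16100 bp and 510 bp observed above) are empirical values from this one command line, not documented limits.

```python
def fasta_lengths(lines):
    """Return {record_name: sequence_length} for FASTA-formatted lines."""
    lengths = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the name.
            parts = line[1:].split()
            name = parts[0] if parts else ""
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


def short_records(lengths, min_len):
    """Return the sorted names of records shorter than min_len (hypothetical helper)."""
    return sorted(name for name, length in lengths.items() if length < min_len)
```

A wrapper script could open ref.fas and target.fas, pass the file handles to `fasta_lengths`, and exit with a non-zero code of its own choosing whenever `short_records` returns a non-empty list, so that an empty mashmap output is only interpreted as "no homologous sequences" when the inputs were long enough.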

@cjain7
Contributor

cjain7 commented Oct 24, 2020

Yes, that is governed more or less by the algorithm. -s 500 indicates that it will look for mappings of 500 bp-long fragments from the read to the reference, using the Jaccard similarity of the k-mers within them. It will not work if either the query or the reference is shorter than that.
Because this is an approximate method, it is a bit tricky to differentiate between the two scenarios (input too short versus no homologous sequence found).
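To make the "Jaccard similarity of k-mers" idea concrete, here is a minimal illustration. It computes the exact Jaccard similarity on full k-mer sets, which is a simplification and not mashmap's actual implementation; the function names and the choice of `k` are assumptions made for the example.

```python
def kmers(seq, k):
    """Return the set of all k-mers in seq (empty if seq is shorter than k)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b, k):
    """Exact Jaccard similarity of the k-mer sets of a and b.

    Returns None when neither sequence contains a k-mer, because the
    similarity is undefined in that case.
    """
    ka, kb = kmers(a, k), kmers(b, k)
    union = ka | kb
    if not union:
        return None
    return len(ka & kb) / len(union)
```

For example, `jaccard("ACGT", "CGTA", 2)` compares {AC, CG, GT} with {CG, GT, TA} and gives 2 / 4 = 0.5. A sequence shorter than `k` contributes no k-mers at all, which mirrors at small scale why an input shorter than the fragment length leaves nothing to compare.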
