Given the work we've been doing with maximum containment --> genome distance metrics, there are some places where it might be useful to return the maximum containment between signatures, rather than directional containment. This could be enabled with a --max-containment option for certain commands.
Cases I can think of:
compare
search -- e.g. my current desired use case is sourmash search --max-containment --best-only to find the best match for an input sig (or list of input sigs) to a list of cluster founders.
--max-containment is likely not useful for gather-style metagenome applications, primarily because the direction of containment used there (intersection/reference genome hashes) is already ideal.
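To make the distinction concrete, here is a minimal sketch of directional versus maximum containment. It uses plain Python sets of integers as stand-ins for signature hashes; the function names are illustrative and are not meant to represent sourmash's API.

```python
def containment(a: set, b: set) -> float:
    """Directional containment: fraction of a's hashes also present in b."""
    if not a:
        return 0.0
    return len(a & b) / len(a)


def max_containment(a: set, b: set) -> float:
    """Maximum containment: intersection over the smaller set.

    Equal to max(containment(a, b), containment(b, a)), so it is symmetric.
    """
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


# Toy data (hypothetical): 100 query hashes, 20 founder hashes, 10 shared.
query = set(range(0, 100))
founder = set(range(90, 110))

print(containment(query, founder))      # 10 / 100
print(containment(founder, query))      # 10 / 20
print(max_containment(query, founder))  # 10 / 20
```

The directional value depends on which signature is treated as the query, while the maximum containment is the same either way, which is why it would suit compare and a best-match search against cluster founders.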
it may not be possible to do this efficiently on SBTs. Thinking out loud:
SBTs provide Jaccard similarity and containment searches for a query Q in a database D
can there be a match S in D that has low Jaccard similarity and containment but high max containment? this would be a match that cannot be found using current SBT, but would be reported for high max containment.
I can't think about that clearly, but maybe a way to rephrase it is to ask about doing containment searches in both directions (Q against D, and every d in D against Q) and taking the best match?
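A toy example suggests the answer to the question above is yes, at least in principle. The sketch below is a brute-force linear scan over plain Python sets, not an SBT search, and all names and data are hypothetical. It shows a small match that sits entirely inside a large query: Jaccard and the Q-in-S containment are both low, but max containment is 1.0.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: intersection over union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def containment(a: set, b: set) -> float:
    """Directional containment: fraction of a's hashes also present in b."""
    return len(a & b) / len(a) if a else 0.0


def best_by_max_containment(query: set, database: dict):
    """Score every entry in both directions and keep the best (linear scan)."""
    best_name, best_score = None, 0.0
    for name, hashes in database.items():
        score = max(containment(query, hashes), containment(hashes, query))
        if score > best_score:
            best_name, best_score = name, score
    return best_name, best_score


# Hypothetical data: a large query and two candidate matches.
query = set(range(100))
database = {
    "small_subset": set(range(10)),          # entirely inside the query
    "partial_overlap": set(range(80, 130)),  # 20 of its 50 hashes shared
}

for name, hashes in database.items():
    print(name, jaccard(query, hashes), containment(query, hashes))

print(best_by_max_containment(query, database))
```

Here "partial_overlap" ranks first by both Jaccard and Q-in-S containment, yet "small_subset" wins on max containment. This says nothing about whether an SBT can be made to find such matches efficiently; it only illustrates that the rankings can disagree.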
perhaps this issue on reverse containment is related? #1198