fix a few things, and add more details on sourmash
ctb committed Mar 8, 2022
1 parent a7d4209 commit 4a106df
Showing 1 changed file with 10 additions and 7 deletions.
17 changes: 10 additions & 7 deletions doc/command-line.md
@@ -8,13 +8,19 @@ From the command line, sourmash can be used to create
[MinHash sketches][0] from DNA and protein sequences, compare them to
each other, and plot the results; these sketches are saved into
"signature files". These signatures allow you to estimate sequence
-similarity quickly and accurately in large collections, among other
-capabilities.
+similarity and containment quickly and accurately in large
+collections, among other capabilities.
+
+sourmash also provides a suite of metagenome functionality. This
+includes genome search in metagenomes, metagenome decomposition into a
+list of genomes from a database, and taxonomic classification
+functionality.

Please see the [mash software][1] and the
[mash paper (Ondov et al., 2016)][2] for background information on
-how and why MinHash sketches work.
-
+how and why MinHash sketches work. The [FracMinHash preprint (Irber et al,
+2022)](https://www.biorxiv.org/content/10.1101/2022.01.11.475838) describes
+the metagenome-focused features of sourmash.

sourmash uses a subcommand syntax, so all commands start with
`sourmash` followed by a subcommand specifying the action to be
@@ -102,9 +108,6 @@ Finally, there are a number of utility and information commands:
Please use the command line option `--help` to get more detailed usage
information for each command.

-Note that as of sourmash v3.4, all commands should load signatures from
-indexed databases (the SBT and LCA formats) as well as from signature files.
-
### `sourmash sketch` - make sourmash signatures from sequence data

Most of the commands in sourmash work with **signatures**, which contain information about genomic or proteomic sequences. Each signature contains one or more **sketches**, which are compressed versions of these sequences. Using sourmash, you can search, compare, and analyze these sequences in various ways.
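
The difference between the "similarity" and "containment" estimates mentioned above can be illustrated with a toy example in plain Python. This is only a sketch under stated assumptions, not sourmash's implementation or API: the function names, the SHA-1-based hash, the `scaled` threshold rule, and the example sequences are all hypothetical and chosen for illustration.

```python
import hashlib


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def hash_kmer(kmer):
    """Map a k-mer to a 64-bit integer (toy hash, not the one sourmash uses)."""
    digest = hashlib.sha1(kmer.encode()).digest()
    return int.from_bytes(digest[:8], "big")


def sketch(seq, k=5, scaled=1):
    """Keep only hashes below 2**64 // scaled, roughly 1/scaled of all k-mers.

    With scaled=1 (the default here) every hash is kept.
    """
    threshold = 2**64 // scaled
    return {h for h in map(hash_kmer, kmers(seq, k)) if h < threshold}


def similarity(a, b):
    """Jaccard similarity: size of the intersection over size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def containment(a, b):
    """Fraction of the hashes in a that are also present in b."""
    return len(a & b) / len(a) if a else 0.0


# Hypothetical sequences: the "metagenome" contains the whole "genome".
genome = "ACGTTGCATGGCCATTAGC"
metagenome = genome + "TTTTTGGGGGCCCCCAAAAATTTGGGCCCAAA"

g = sketch(genome)
m = sketch(metagenome)

print("similarity: ", similarity(g, m))
print("containment:", containment(g, m))
```

Because the toy genome is fully embedded in the toy metagenome, its containment in the metagenome is complete, while the Jaccard similarity is lower because the metagenome has additional k-mers of its own.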