Skip to content

Commit

Permalink
Merge pull request #199 from HadrienG/dev
Browse files Browse the repository at this point in the history
1.5.2
  • Loading branch information
HadrienG authored Feb 1, 2021
2 parents 52ad1fb + 9662b8a commit fc46082
Show file tree
Hide file tree
Showing 3 changed files with 35 additions and 6 deletions.
17 changes: 16 additions & 1 deletion doc/iss/generate.rst
Original file line number Diff line number Diff line change
Expand Up @@ -135,13 +135,28 @@ Coverage distribution

In the context of InSilicoSeq, the `abundance` is the proportion of reads in a sample, which since it does not acount for the length of the genome, does not necessarily reflect the number of organisms present in a sample.

The `coverage`and `coverage_file` options allow for simulating reads according to a coverage distribution instead of abundance.
The ``coverage`` and ``coverage_file`` options allow for simulating reads according to a coverage distribution instead of abundance.

.. code-block:: bash
iss generate --ncbi bacteria -U 50 --coverage lognormal -n 25M \
--model novaseq --output reads
The ``coverage_file`` option works similarly to the ``abundance_file`` option.
For two genomes A and B:

.. code-block:: bash
iss generate --genomes genomes.fasta --coverage_file coverage.txt \
--model novaseq --output reads
with, for a coverage of 20x for genome_A and 100x for genome_B, the coverage file `coverage.txt` will be:

.. code-block:: bash
genome_A 20
genome_B 100
GC bias
-------

Expand Down
22 changes: 18 additions & 4 deletions iss/app.py
Original file line number Diff line number Diff line change
Expand Up @@ -229,7 +229,8 @@ def generate_reads(args):
f = open(genome_file, 'r') # re-opens the file
with f:
fasta_file = SeqIO.parse(f, 'fasta')

total_reads_generated = 0
total_reads_generated_unrouned = 0
for record in fasta_file:
# generate reads for records
try:
Expand All @@ -252,9 +253,22 @@ def generate_reads(args):
err_mod.read_length,
genome_size
)
n_pairs = int(round(
(coverage *
len(record.seq)) / err_mod.read_length) / 2)
n_pairs_unrounded = (
(coverage * len(record.seq)) /
err_mod.read_length) / 2
n_pairs = round(n_pairs_unrounded)

# check that the rounding does not cause to drop
# read pairs
total_reads_generated_unrouned += n_pairs_unrounded
total_reads_generated += n_pairs
if round(total_reads_generated_unrouned) > \
total_reads_generated:
logger.debug(
"Adding a pair to correct rounding error")
n_pairs += 1
total_reads_generated += 1

# skip record if n_reads == 0
if n_pairs == 0:
continue
Expand Down
2 changes: 1 addition & 1 deletion iss/version.py
Original file line number Diff line number Diff line change
@@ -1 +1 @@
__version__ = '1.5.1'
__version__ = '1.5.2'

0 comments on commit fc46082

Please sign in to comment.