Crash on 2.8 GB data: "EXCEPTION: Pool allocation failed" #9

KirillKryukov · 2019-11-14T03:49:30Z

Leon compression crashes on some data. Example data:

leon-repro-1.fa.gz (784 MB archive, inside is a 2.8 GB file).

Command to reproduce (after decompressing the gzipped data):

leon -seq-only -file leon-repro-1.fa -c -kmer-size 3

This command crashes with the following console output:

        Input format: Fasta
[DSK: Pass 1/1, Step 2: counting kmers   ]  70.5 %   elapsed:   0 min 28 sec   remaining:   0 min 12 sec   cpu: 472.6 %   mem: [  66,   66,   66] MB EXCEPTION: Pool allocation failed for 3012690144 bytes (kmers alloc). Current usage is 16 and capacity is 2097152000

Also, after the crash Leon leaves 85 temporary files in the current directory, totaling 21 GB.

I noticed that the Leon paper mentions using Leon on 733 GB of data, so I assumed that a comparatively small input of 2.8 GB should be no problem.

rchikhi · 2019-11-14T16:04:16Z

Hi Kirill, you could try increasing the default -max-memory value and see if it still crashes. Having a k-mer size of 3 also seems problematic. Was that a typo?
Rayan

KirillKryukov · 2019-11-18T02:03:30Z

@rchikhi , how do I use -max-memory? Is it a command line option of leon? Is it set in bytes, kilobytes, megabytes or gigabytes? It's not mentioned in the leon console output:

[leon options]
       -file         (1 arg) :    input file (e.g. FASTA/FASTQ for compress or .leon file for decompress)
       -c            (0 arg) :    compression
       -d            (0 arg) :    decompression
       -nb-cores     (1 arg) :    number of cores (default is the available number of cores)  [default '0']
       -verbose      (1 arg) :    verbosity level  [default '1']
       -lossless     (0 arg) :    switch to lossless compression for qualities (default is lossy. lossy has much higher compression rate, and the loss is in fact a gain. lossy is better!)

   [compression options]
          -kmer-size               (1 arg) :    size of a kmer  [default '31']
          -abundance               (1 arg) :    abundance threshold for solid kmers (default inferred)  [default '']
          -seq-only                (0 arg) :    store dna seq only, header and quals are discarded, will decompress to fasta (same as -noheader -noqual)
          -noheader                (0 arg) :    discard header
          -noqual                  (0 arg) :    discard quality scores

3 is not a typo. Is there a known problem with this setting?

rchikhi · 2019-11-18T16:18:31Z

I'm sorry, I thought leon exposed this parameter, but it does not. If it still accepts it, it's in megabytes. So try e.g. -max-memory 10000.
Kmer size of 3 is very problematic. What's your rationale for it? Leon should perform well with specific kmers, e.g. likely above 12 or 15, preferably in the range [20;50].
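
To relate the byte counts in the exception to a value in megabytes, here is a small arithmetic sketch. It assumes one megabyte means 1048576 bytes, which fits the reported capacity of 2097152000 bytes being exactly 2000 of those units; whether Leon computes its pool capacity this way is an assumption, and the helper names are made up for illustration.

```python
import math

# Assumption: the pool accounting uses 1 MB = 1024 * 1024 bytes.
MIB = 1024 * 1024

# Both numbers are copied from the exception message in the report.
requested_bytes = 3012690144
capacity_bytes = 2097152000


def to_mib(n_bytes):
    """Convert a byte count to (fractional) MiB."""
    return n_bytes / MIB


def covering_mib(n_bytes):
    """Smallest whole number of MiB that is at least n_bytes."""
    return math.ceil(n_bytes / MIB)


print(f"capacity:  {to_mib(capacity_bytes):.1f} MiB")
print(f"requested: {to_mib(requested_bytes):.1f} MiB")
print(f"smallest covering value: {covering_mib(requested_bytes)} MiB")
```

The single failed allocation is larger than the whole pool, so any -max-memory value would have to exceed it by some margin; the suggested 10000 leaves plenty of headroom, but this says nothing about whether a k-mer size of 3 then works.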

KirillKryukov · 2019-11-20T13:13:11Z

Thank you @rchikhi , I will try this parameter and let you know if it helped.

As for the rationale, I am doing a parameter sweep to find the optimal kmer size for various kinds of data. Since Leon accepts kmer sizes starting from 2, and since until now I haven't seen any recommendation to avoid small kmer sizes (maybe I missed it?), I am testing the entire range. Now, thanks to your very helpful advice, I can probably ignore kmer sizes smaller than 12, if I understand you correctly?

rchikhi · 2019-11-20T13:47:47Z

yes
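
A parameter sweep restricted to the recommended range could be scripted roughly as below. This is only a sketch: the flags -seq-only, -file, -c and -kmer-size are taken from the command in the report, -max-memory is passed only on request because Leon may not accept it, and the function names and the idea of treating a non-zero exit code as a crash are assumptions.

```python
import subprocess


def build_leon_command(fasta_path, kmer_size, max_memory_mb=None):
    """Build the argument list for one Leon compression run."""
    cmd = ["leon", "-seq-only", "-file", fasta_path, "-c",
           "-kmer-size", str(kmer_size)]
    if max_memory_mb is not None:
        # Assumption: Leon forwards this option; it is not in its help text.
        cmd += ["-max-memory", str(max_memory_mb)]
    return cmd


def sweep_kmer_sizes(fasta_path, kmer_sizes, max_memory_mb=None,
                     runner=subprocess.run):
    """Run Leon once per kmer size and record each exit code.

    `runner` is injectable so the loop can be tested without Leon installed.
    """
    exit_codes = {}
    for k in kmer_sizes:
        completed = runner(build_leon_command(fasta_path, k, max_memory_mb))
        exit_codes[k] = completed.returncode
    return exit_codes
```

For example, `sweep_kmer_sizes("leon-repro-1.fa", range(12, 51), max_memory_mb=10000)` would cover kmer sizes 12 through 50. Since a crash can leave temporary files behind, it may be worth running each sweep in a scratch directory.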
