I'm not sure if this is an intended use case for Neptune, but I attempted to run the program with ~150 inclusion genomes (450 MB) and ~8000 exclusion genomes (32 GB), and it crashed before completion. Here is the log from my console:
Estimating k-mer size ...
k = 25
k-mer Counting...
Submitted 8164 jobs.
44.61319 seconds
k-mer Aggregation...
Submitted 65 jobs.
Traceback (most recent call last):
  File "/home/dussaultf/miniconda3/envs/neptune/bin/neptune-conda", line 11, in <module>
    load_entry_point('neptune==1.2.5', 'console_scripts', 'neptune')()
  File "/home/dussaultf/miniconda3/envs/neptune/lib/python2.7/site-packages/neptune/Neptune.py", line 986, in main
    parse(parameters)
  File "/home/dussaultf/miniconda3/envs/neptune/lib/python2.7/site-packages/neptune/Neptune.py", line 765, in parse
    executeParallel(parameters)
  File "/home/dussaultf/miniconda3/envs/neptune/lib/python2.7/site-packages/neptune/Neptune.py", line 749, in executeParallel
    execute(execution)
  File "/home/dussaultf/miniconda3/envs/neptune/lib/python2.7/site-packages/neptune/Neptune.py", line 662, in execute
    aggregateKMers(execution, inclusionKMerLocations, exclusionKMerLocations)
  File "/home/dussaultf/miniconda3/envs/neptune/lib/python2.7/site-packages/neptune/Neptune.py", line 290, in aggregateKMers
    inclusionKMerLocations, exclusionKMerLocations)
  File "/home/dussaultf/miniconda3/envs/neptune/lib/python2.7/site-packages/neptune/Neptune.py", line 356, in aggregateMultipleFiles
    execution.jobManager.runJobs(jobs)
  File "/home/dussaultf/miniconda3/envs/neptune/lib/python2.7/site-packages/neptune/JobManagerParallel.py", line 138, in runJobs
    self.synchronize(jobs)
  File "/home/dussaultf/miniconda3/envs/neptune/lib/python2.7/site-packages/neptune/JobManagerParallel.py", line 178, in synchronize
    job.get() # get() over wait() to propagate excetions upwards
  File "/home/dussaultf/miniconda3/envs/neptune/lib/python2.7/multiprocessing/pool.py", line 572, in get
    raise self._value
IOError: [Errno 24] Too many open files: '/mnt/scratch/Forest/neptune_analysis/output_debug/kmers/exclusion/GCF_001642675.1_ASM164267v1_genomic.fna.kmers.AAA'
What's happening is that each aggregation job opens a temporary file associated with each input file (~150 + ~8000, so roughly 8150 files). Errno 24 (EMFILE) means the process hit the operating system's per-process limit on open file descriptors, which commonly defaults to 1024 on Linux, so opening ~8150 files simultaneously raises this error.
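As a quick diagnostic, you can check (and, up to the hard limit, raise) the descriptor limit from Python with the standard `resource` module. This is a minimal sketch for POSIX systems, not part of Neptune itself:

```python
import resource

# Soft and hard per-process limits on open file descriptors.
# On many Linux systems the soft limit defaults to 1024.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print("soft=%d hard=%d" % (soft, hard))

# The soft limit can be raised as far as the hard limit without root.
resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))
```

Equivalently, running `ulimit -n <N>` in the shell before launching Neptune raises the soft limit for that session, which may be enough of a stopgap if the system's hard limit is above the number of input files.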
The problem is that there is currently no input parameter that avoids this. The number of aggregation jobs can be changed, but each job will still try to open as many files simultaneously as there are inputs.
The short-term solution would be to run Neptune with fewer input files; the largest run we've done is with approximately 800 total input files. The long-term solution (on my end) might involve making the software perform aggregation in iterative batches, with a bounded number of files open simultaneously (see the sketch below).
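For illustration only, here is a minimal sketch of what that batched aggregation could look like: merging sorted k-mer files in rounds so that at most a fixed number of files are ever open at once. The function names and the `MAX_OPEN` value are hypothetical, and this ignores the inclusion/exclusion bookkeeping Neptune's real aggregation performs:

```python
import heapq
import os
import tempfile

MAX_OPEN = 500  # stay well under a typical 1024-descriptor soft limit

def merge_batch(paths, out_path):
    """Merge a batch of sorted k-mer files into one sorted output file."""
    handles = [open(p) for p in paths]
    try:
        with open(out_path, "w") as out:
            # heapq.merge performs a k-way merge of already-sorted inputs.
            for line in heapq.merge(*handles):
                out.write(line)
    finally:
        for handle in handles:
            handle.close()

def aggregate(paths):
    """Repeatedly merge files in batches of MAX_OPEN until one file remains."""
    while len(paths) > 1:
        next_round = []
        for i in range(0, len(paths), MAX_OPEN):
            batch = paths[i:i + MAX_OPEN]
            fd, merged = tempfile.mkstemp(suffix=".kmers")
            os.close(fd)
            merge_batch(batch, merged)
            next_round.append(merged)
        paths = next_round
    return paths[0]
```

Merging in rounds like this keeps descriptor usage bounded regardless of the number of inputs, at the cost of rewriting intermediate data once per round (logarithmically many rounds in the batch size).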