swag*': No such file or directory #2

Open
jnarayan81 opened this issue Aug 28, 2018 · 3 comments
@jnarayan81

Some swag* files seem to be missing!

➜  ELECTOR git:(master) ✗ python3 elector.py -perfect /media/urbe/ARCgenomic/toyGenome/toy.fasta -corrected /home/urbe/Tools/canu/Linux-amd64/bin/simTest20x/simTest20x.correctedReads.fasta -uncorrected /media/urbe/ARCgenomic/toyGenome/toyLongReads.fasta
- Mean that a large amount of nuc has been handled: 100000000
**-rm: cannot remove '/home/urbe/Tools/ELECTOR/swag*': No such file or directory**

Traceback (most recent call last):
  File "elector.py", line 168, in <module>
    main()
  File "elector.py", line 136, in main
    nbReads, throughput, precision, recall, correctBaseRate, errorRate, smallReads, wronglyCorReads, percentGCRef, percentGCCorr, numberSplit, meanMissing, numberExtended, meanExtension, minLength, indelsubsUncorr, indelsubsCorr , homoInsU, homoDeleU, homoInsC,  homoDeleC, homoInsUMean,  homoDeleUMean, homoInsCMean, homoDeleCMean = computeStats.outputRecallPrecision(sortedCorrectedFileName, outputDirPath, logFile, smallReads, wronglyCorReads, reportedHomopolThreshold, size_corrected_read_threshold, 0, 0, soft)
  File "/home/urbe/Tools/ELECTOR/computeStats.py", line 159, in outputRecallPrecision
    nbReads, throughput, precision, recall, corBasesRate, errorRate, extendedBases, missingSize,  GCRateRef, GCRateCorr, indelsubsUncorr, indelsubsCorr, numberHomopolymersInserInCorrected, numberHomopolymersDeleInCorrected , numberHomopolymersInserInUncorrected ,	numberHomopolymersDeleInUncorrected,	meanLengthDeleHomopolymersInUncorrected , meanLengthInserHomopolymersInUncorrected , 	meanLengthInserHomopolymersInCorrected ,	meanLengthDeleHomopolymersInCorrected  = computeMetrics(outDir + "/msa.fa", outMetrics, correctedFileName, reportedHomopolThreshold)
  File "/home/urbe/Tools/ELECTOR/computeStats.py", line 448, in computeMetrics
    upperCasePositions = getUpperCasePositions(correctedReadsList, lines)
  File "/home/urbe/Tools/ELECTOR/computeStats.py", line 624, in getUpperCasePositions
    upperCasePositions[-1] = [False] * len(correctedMsa)
**IndexError: list assignment index out of range**
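
For readers hitting the same traceback: the last statement fails because assigning to index -1 only works when the list already holds at least one element. The sketch below reproduces that behaviour in isolation. Only the failing statement comes from the traceback; the helper `set_last` and its arguments are hypothetical and are not ELECTOR's actual code.

```python
# Hypothetical reproduction: only the assignment itself is taken from the traceback.
def set_last(positions, msa_row):
    """Overwrite the last entry of positions with a False mask as long as msa_row."""
    positions[-1] = [False] * len(msa_row)
    return positions


# Works when the list already has an entry to overwrite.
filled = set_last([None], "ACGT")

# Fails on an empty list, which is the situation the traceback reports.
try:
    set_last([], "ACGT")
    raised = False
except IndexError:
    raised = True
```

Why the list is empty at that point in ELECTOR is not established by the traceback alone.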

I tried this as well, but it ended with the following error:

➜ ELECTOR git:(master) ✗ python3 elector.py -perfect /media/urbe/ARCgenomic/toyGenome/toy.fasta -uncorrected /media/urbe/ARCgenomic/toyGenome/toyLongReads.fasta -corrected /home/urbe/Tools/canu/Linux-amd64/bin/simTest20x/simTest20x.correctedReads.fasta -threads 40 -split -corrector canu

  • Mean that a large amount of nuc has been handled: 100000000
    -rm: cannot remove '/home/urbe/Tools/ELECTOR/swag*': No such file or directory

Traceback (most recent call last):
  File "elector.py", line 168, in <module>
    main()
  File "elector.py", line 136, in main
    nbReads, throughput, precision, recall, correctBaseRate, errorRate, smallReads, wronglyCorReads, percentGCRef, percentGCCorr, numberSplit, meanMissing, numberExtended, meanExtension, minLength, indelsubsUncorr, indelsubsCorr , homoInsU, homoDeleU, homoInsC, homoDeleC, homoInsUMean, homoDeleUMean, homoInsCMean, homoDeleCMean = computeStats.outputRecallPrecision(sortedCorrectedFileName, outputDirPath, logFile, smallReads, wronglyCorReads, reportedHomopolThreshold, size_corrected_read_threshold, 0, 0, soft)
  File "/home/urbe/Tools/ELECTOR/computeStats.py", line 155, in outputRecallPrecision
    nbReads, throughput, precision, recall, corBasesRate, errorRate, extendedBases, missingSize, GCRateRef, GCRateCorr, indelsubsUncorr, indelsubsCorr, numberHomopolymersInserInCorrected, numberHomopolymersDeleInCorrected , numberHomopolymersInserInUncorrected , numberHomopolymersDeleInUncorrected, meanLengthDeleHomopolymersInUncorrected , meanLengthInserHomopolymersInUncorrected , meanLengthInserHomopolymersInCorrected , meanLengthDeleHomopolymersInCorrected = computeMetrics(outDir + "/msa_" + soft + ".fa", outMetrics, correctedFileName, reportedHomopolThreshold )
  File "/home/urbe/Tools/ELECTOR/computeStats.py", line 444, in computeMetrics
    msa = open(fileName, 'r')
FileNotFoundError: [Errno 2] No such file or directory: '/home/urbe/Tools/ELECTOR/msa_canu.fa'

@jnarayan81 changed the title from "cannot remove '/home/urbe/Tools/ELECTOR/swag*" to "swag*': No such file or directory" on Aug 28, 2018
@morispi self-assigned this on Sep 10, 2018
@morispi
Collaborator

morispi commented Sep 10, 2018

Hi,

Sorry for the late answer.

I can see you used "-perfect /media/urbe/ARCgenomic/toyGenome/toy.fasta". Is your "toy.fasta" file an actual genome? If so, this is most probably why you are getting this error.

When using the "-perfect" option, ELECTOR assumes the input file contains "perfect / reference" reads, that is, reads without sequencing errors. If you wish to provide a genome directly to ELECTOR, and let it find the perfect / reference reads itself, you should use the "-reference" option instead.

Please tell me if that helps!

Pierre

@jnarayan81
Author

Hi @morispi
Thanks for the reply.
I tried -reference on the example data:
python3 elector.py -reference example/example_reference.fasta -corrected example/corrected_reads.fasta -uncorrected Sim -threads 46 -simulator simlord -output test2

but it has been busy doing something for the last 12 hours, which I think is not normal!
Is that command right? Or did I miss something?

@morispi
Collaborator

morispi commented Sep 13, 2018

Hey,

If you wish to use the SimLoRD simulated reads from the toy example along with the -reference switch, you should run this command, as indicated in the README:

python3 elector.py -reference example/example_reference.fasta -corrected example/Simlord/correctedReads.fasta -uncorrected example/Simlord/simulatedReads -simulator simlord

Indeed, the reads given to the -corrected and -uncorrected switches must have matching headers. It seems that what you provided to the -uncorrected switch (Sim) is not an existing file, which is why the program never stops.
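
The matching-header requirement can be checked before launching a long run. The sketch below compares read identifiers between two FASTA files. It assumes that the identifier is the first whitespace-delimited token after ">" and that identifiers must match exactly; both are assumptions made for illustration, since correctors often rewrite headers and ELECTOR's own matching rules are not described here. The function names are hypothetical.

```python
def fasta_headers(path):
    """Return the set of read identifiers (first token after '>') in a FASTA file."""
    headers = set()
    with open(path) as handle:
        for line in handle:
            if line.startswith(">"):
                tokens = line[1:].split()
                if tokens:
                    headers.add(tokens[0])
    return headers


def unmatched_headers(corrected_path, uncorrected_path):
    """Return corrected-read identifiers with no counterpart in the uncorrected file."""
    return fasta_headers(corrected_path) - fasta_headers(uncorrected_path)
```

An empty result from `unmatched_headers` means every corrected read has a counterpart under these assumptions. Opening a path that does not exist raises FileNotFoundError immediately, so the check also catches a missing input file.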

Quick use guide of the parameters:

For simulated reads:

-corrected LR.fasta: here you must provide a file of corrected long reads in fasta format
-uncorrected SimLRPrefix: here you must provide the prefix of the simulated reads files. They must be the original reads for which the correction has been provided to the -corrected switch. For example, if you simulated a set of reads with SimLoRD in the test/simulation/ directory, this directory will contain the following files: simReads.h5, simReads.fastq, and simReads.fastq.sam. You must thus provide test/simulation/simReads (the common prefix of all the files) to the -uncorrected switch.
-reference ref.fasta: here you must provide a file containing the reference genome in fasta format
-simulator name: here you must provide the name of the simulator that was used to simulate the reads (we currently only support SimLoRD and NanoSim)
-corrector name: here you must provide the name of the correction method that was used to correct the reads (see list of supported correctors in the README)
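
Since -uncorrected takes a prefix rather than a file name for simulated reads, a quick pre-flight check can confirm that at least one file starts with that prefix. This is a minimal sketch using only the Python standard library; the function names are hypothetical, and it does not verify that the specific files a given simulator produces are all present.

```python
import glob


def files_for_prefix(prefix):
    """Return the sorted list of existing paths that start with prefix."""
    # glob.escape keeps special characters in the prefix from acting as wildcards.
    return sorted(glob.glob(glob.escape(prefix) + "*"))


def check_uncorrected_prefix(prefix):
    """Raise FileNotFoundError early if no file starts with the given prefix."""
    matches = files_for_prefix(prefix)
    if not matches:
        raise FileNotFoundError(f"No files found with prefix: {prefix}")
    return matches
```

For the example above, `check_uncorrected_prefix("test/simulation/simReads")` would list the simulated reads files if they exist, and fail immediately otherwise.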

For real reads:

-corrected LR.fasta: here you must provide a file of corrected long reads in fasta format
-uncorrected rawReads.fasta: here you must provide a file of uncorrected long reads in fasta format. They must be the original reads for which the correction has been provided to the -corrected switch.
-reference ref.fasta: here you must provide a file containing the reference genome in fasta format
-corrector name: here you must provide the name of the correction method that was used to correct the reads (see list of supported correctors in the README)
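
The switches listed above can be assembled programmatically, which makes it easier to validate inputs before a run. The sketch below builds the argument list from the switches named in this thread and reports any input files that do not exist. The helper names `build_elector_command` and `missing_inputs` are hypothetical, and the sketch assumes elector.py is in the current directory.

```python
import os


def build_elector_command(reference, corrected, uncorrected,
                          corrector=None, simulator=None):
    """Assemble the argument list for elector.py from the switches described above."""
    cmd = ["python3", "elector.py",
           "-reference", reference,
           "-corrected", corrected,
           "-uncorrected", uncorrected]
    if simulator:
        cmd += ["-simulator", simulator]
    if corrector:
        cmd += ["-corrector", corrector]
    return cmd


def missing_inputs(paths):
    """Return the subset of paths that are not existing regular files."""
    return [p for p in paths if not os.path.isfile(p)]
```

For real reads, all three inputs are files, so `missing_inputs([reference, corrected, uncorrected])` should return an empty list before the command is passed to something like `subprocess.run(cmd, check=True)`.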

Please tell me if that helps, or if it's clear enough.
If you don't manage to run ELECTOR on the datasets you want, just describe what you want to do, and I'll help by providing the command line.

Cheers,
P
