ciri failing even with 3 retries #128

Open
kelly-sovacool opened this issue Oct 28, 2024 · 2 comments · May be fixed by #130
Labels
bug Something isn't working HighPriority

Comments

@kelly-sovacool
Member

The ciri job for sample GI1_T failed on all 3 attempts:

grep ciri.sample=GI1_T logs/snakemake.log.jobby.short 
FAILED  /vf/users/CCBR/charlie_test_wil/charlie2/logs/39273613.39292144.ciri.sample=GI1_T.err
FAILED  /vf/users/CCBR/charlie_test_wil/charlie2/logs/39273613.39292229.ciri.sample=GI1_T.err
FAILED  /vf/users/CCBR/charlie_test_wil/charlie2/logs/39273613.39292243.ciri.sample=GI1_T.err
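For runs with many samples, the same tally can be scripted. The following is a minimal sketch, not part of CHARLIE: the function name `failed_attempts` is hypothetical, and it assumes each line of the short jobby log is a status word followed by a log path named `<id>.<id>.<rule>.<wildcards>.err`, as inferred from the three lines above.

```python
import re
from collections import defaultdict

# Assumed log-name layout, inferred from the paths above:
#   <first id>.<second id>.<rule>.<wildcards>.err
# The first id is shared by all three attempts; the second differs per attempt.
LOG_NAME = re.compile(
    r"(?P<parent>\d+)\.(?P<job>\d+)\.(?P<rule>[^.]+)\.(?P<wildcards>.+)\.err$"
)


def failed_attempts(lines):
    """Group the per-attempt ids of FAILED lines by (rule, wildcards)."""
    attempts = defaultdict(list)
    for line in lines:
        fields = line.split()
        # Skip blank lines and any status other than FAILED.
        if len(fields) < 2 or fields[0] != "FAILED":
            continue
        match = LOG_NAME.search(fields[-1])
        if match:
            key = (match["rule"], match["wildcards"])
            attempts[key].append(match["job"])
    return dict(attempts)
```

Called on the lines of `logs/snakemake.log.jobby.short`, any key with three ids has used up all of its attempts.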

1st time -- uninformative error message:

Activating singularity image /vf/users/CCBR/charlie_test_wil/charlie2/.snakemake/singularity/db86f96b6c7474c17f67edda4e3fa07b.simg
[Fri Oct 25 15:28:38 2024]
Error in rule ciri:
    jobid: 0
    input: /data/CCBR/charlie_test_wil/charlie2/ref/ref.bwt, /data/CCBR/charlie_test_wil/charlie2/results/GI1_T/trim/GI1_T.R1.trim.fastq.gz, /data/CCBR/charlie_test_wil/charlie2/results/GI1_T/trim/GI1_T.R2.trim.fastq.gz, /data/CCBR/charlie_test_wil/charlie2/ref/ref.fixed.gtf
    output: /data/CCBR/charlie_test_wil/charlie2/results/GI1_T/ciri/GI1_T.ciri.log, /data/CCBR/charlie_test_wil/charlie2/results/GI1_T/ciri/GI1_T.bwa.log, /data/CCBR/charlie_test_wil/charlie2/results/GI1_T/ciri/GI1_T.ciri.bam, /data/CCBR/charlie_test_wil/charlie2/results/GI1_T/ciri/GI1_T.ciri.out, /data/CCBR/charlie_test_wil/charlie2/results/GI1_T/ciri/GI1_T.ciri.out.filtered
    shell:

last line in .out file:

[Fri Oct 25 15:28:15 2024] Extracting info from temporary files

2nd time - same message as first

3rd time - file not found:

Traceback (most recent call last):
  File "/gpfs/gsfs10/users/CCBR_Pipeliner/Pipelines/CHARLIE/.v0.11.1/workflow/scripts/filter_ciriout.py", line 110, in <module>
    infile  = open(args.ciriout,'r')
FileNotFoundError: [Errno 2] No such file or directory: '/data/CCBR/charlie_test_wil/charlie2/results/GI1_T/ciri/GI1_T.ciri.out'
@kelly-sovacool
Member Author

kelly-sovacool commented Nov 6, 2024

Another CHARLIE run where ciri failed 3 times: /home/sovacoolkl/data/charlie_test_11-05

grep -i 'error exec' snakemake.log

Error executing rule star_circrnafinder on cluster (jobid: 36, external: 40036616, jobscript: /gpfs/gsfs12/users/sovacoolkl/charlie_test_11-05/.snakemake/tmp.4udzu_t6/snakejob.star_circrnafinder.36.sh). For error details see the cluster log and the log files of the involved rule(s).
Error executing rule ciri on cluster (jobid: 24, external: 40036618, jobscript: /gpfs/gsfs12/users/sovacoolkl/charlie_test_11-05/.snakemake/tmp.4udzu_t6/snakejob.ciri.24.sh). For error details see the cluster log and the log files of the involved rule(s).
Error executing rule ciri on cluster (jobid: 24, external: 40036623, jobscript: /gpfs/gsfs12/users/sovacoolkl/charlie_test_11-05/.snakemake/tmp.4udzu_t6/snakejob.ciri.24.sh). For error details see the cluster log and the log files of the involved rule(s).
Error executing rule ciri on cluster (jobid: 24, external: 40037074, jobscript: /gpfs/gsfs12/users/sovacoolkl/charlie_test_11-05/.snakemake/tmp.4udzu_t6/snakejob.ciri.24.sh). For error details see the cluster log and the log files of the involved rule(s).

The ciri jobs seem to be dying partway through, and the cause is not yet clear.

First attempt (job 40036618), last lines of logs/40032921.40036618.ciri.sample=GI1_N.out:

[Tue Nov  5 14:59:44 2024] Extracting info from temporary files
 Additional candidate reads found: 12
 Additional candidate reads with PEM signals: 11
[Tue Nov  5 14:59:45 2024] Summarizing
 Number of circular RNAs found: 14
[Tue Nov  5 15:05:58 2024] CIRI finished its work. Please see output file /data/sovacoolkl/charlie_test_11-05/results/GI1_N/ciri/GI1_N.ciri.out for detail.

Last attempt (job 40037074), last lines of logs/40032921.40037074.ciri.sample=GI1_N.out:

Cannot split GI1_N.bwa.sam into 56 (55) pieces with size of 12446116 and named them as /data/sovacoolkl/charlie_test_11-05/results/GI1_N/ciri/GI1_N.bwa.sam.
Fatal error. Aborted.
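The two excerpts end differently: the first log ends with CIRI reporting that it finished, the last with a fatal error. A rough sketch for sorting every attempt's .out log into those cases is below. The marker strings are copied from the excerpts in this issue; the function name `classify_attempt` and the three category labels are hypothetical.

```python
def classify_attempt(log_text):
    """Classify one CIRI .out log by the markers seen in this issue.

    Returns 'fatal' if CIRI aborted, 'finished' if CIRI reported
    completion, and 'incomplete' if neither marker is present.
    """
    if "Fatal error. Aborted." in log_text:
        return "fatal"
    if "CIRI finished its work" in log_text:
        return "finished"
    # For example, a log that stops at "Extracting info from temporary files".
    return "incomplete"
```

A 'finished' log for an attempt that Snakemake still reported as failed would suggest the failure happened after CIRI itself completed.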

@kelly-sovacool
Member Author

kelly-sovacool commented Nov 6, 2024

Maybe this is a latency issue? Trying a new run with --latency-wait 300 at /data/sovacoolkl/charlie_test_11-06_latency-wait
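For context, Snakemake's --latency-wait option sets how many seconds it waits for a job's output files to appear after the job finishes before treating them as missing. The sketch below only illustrates that wait-with-timeout idea and is not Snakemake's implementation. The function name `wait_for_file` and the `poll_interval` parameter are hypothetical.

```python
import os
import time


def wait_for_file(path, timeout=300.0, poll_interval=1.0):
    """Return True once `path` exists, or False if `timeout` seconds pass first."""
    deadline = time.monotonic() + timeout
    while True:
        if os.path.exists(path):
            return True
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        # Never sleep past the deadline.
        time.sleep(min(poll_interval, remaining))
```

If the missing GI1_T.ciri.out from the third attempt were only slow to become visible on the shared filesystem, a wait like this would succeed. If the file was never written, it would time out, which points away from latency as the cause.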

@kelly-sovacool kelly-sovacool linked a pull request Nov 7, 2024 that will close this issue
2 tasks