convert to platform-agnostic pipeline #99
Test run command to modify:
/data/Ziegelbauer_lab/Pipelines/circRNA/v0.10.1/charlie \
-w=/data/Ziegelbauer_lab/circRNADetection/circRNA_daq_v0.10.x/samples_15 \
-m=init \
-g=hg38 \
-v=NC_009333.1,KT899744.1,NC_006273.2 \
-s /data/Ziegelbauer_lab/circRNADetection/circRNA_daq_v0.10.x/samples_15.tsv |
Created a new /data/Ziegelbauer_lab/Pipelines/circRNA/v0.10.1/charlie \
-w=/data/Ziegelbauer_lab/circRNADetection/sovacoolkl_charlie/charlie_v0.10.1 \
-m=init -g=hg38 -v=NC_009333.1,KT899744.1,NC_006273.2 \
-s=/data/Ziegelbauer_lab/circRNADetection/sovacoolkl_charlie/samples.tsv
Currently running on biowulf with the latest release so we can compare outputs to the containerized version.
/data/Ziegelbauer_lab/Pipelines/circRNA/v0.10.1/charlie \
-w=/data/Ziegelbauer_lab/circRNADetection/sovacoolkl_charlie/charlie_v0.10.1 \
-m=run |
Testing containerized version:
/data/Ziegelbauer_lab/Pipelines/circRNA/charlie-dev-sovacool/charlie \
-w=/data/Ziegelbauer_lab/circRNADetection/sovacoolkl_charlie/charlie_dev \
-m=init -g=hg38 -v=NC_009333.1,KT899744.1,NC_006273.2 \
-s=/data/Ziegelbauer_lab/circRNADetection/sovacoolkl_charlie/samples.tsv
/data/Ziegelbauer_lab/Pipelines/circRNA/charlie-dev-sovacool/charlie \
-w=/data/Ziegelbauer_lab/circRNADetection/sovacoolkl_charlie/charlie_dev \
-m=run -g=hg38 -v=NC_009333.1,KT899744.1,NC_006273.2 \
-s=/data/Ziegelbauer_lab/circRNADetection/sovacoolkl_charlie/samples.tsv |
/usr/bin/bash: line 32: fastq-filter: command not found
Need to add fastq-filter to the cutadapt docker container.
Edit: fixed and renamed the container |
|
Test on FRCE:
/home/sovacoolkl/CHARLIE/charlie \
-w=/scratch/cluster_scratch/sovacoolkl/charlie_dev_test/charlie_iss-99 \
-m=init -g=hg38 -v=NC_009333.1,KT899744.1,NC_006273.2 \
-s=/scratch/cluster_scratch/sovacoolkl/charlie_dev_test/samples.tsv
/home/sovacoolkl/CHARLIE/charlie \
-w=/scratch/cluster_scratch/sovacoolkl/charlie_dev_test/charlie_iss-99 \
-m=run -g=hg38 -v=NC_009333.1,KT899744.1,NC_006273.2 \
-s=/scratch/cluster_scratch/sovacoolkl/charlie_dev_test/samples.tsv |
Error in rule DCC:
Activating singularity image /vf/users/Ziegelbauer_lab/circRNADetection/sovacoolkl_charlie/charlie_dev/.snakemake/singularity/b688737477c8cf86b329e4227da72916.simg
+ '[' -d /lscratch/25273199 ']'
+ TMPDIR=/lscratch/25273199/09975c64-8e35-4c64-bd19-c0afbf581a78
+ '[' '!' -d /lscratch/25273199/09975c64-8e35-4c64-bd19-c0afbf581a78 ']'
+ mkdir -p /lscratch/25273199/09975c64-8e35-4c64-bd19-c0afbf581a78
++ dirname /vf/users/Ziegelbauer_lab/circRNADetection/sovacoolkl_charlie/charlie_dev/results/G1_Normal/DCC/CircRNACount
+ cd /vf/users/Ziegelbauer_lab/circRNADetection/sovacoolkl_charlie/charlie_dev/results/G1_Normal/DCC
+ '[' PE == PE ']'
+ DCC @/vf/users/Ziegelbauer_lab/circRNADetection/sovacoolkl_charlie/charlie_dev/results/G1_Normal/DCC/samplesheet.txt \
--temp /lscratch/25273199/09975c64-8e35-4c64-bd19-c0afbf581a78/DCC --threads 4 --detect --gene \
--bam /vf/users/Ziegelbauer_lab/circRNADetection/sovacoolkl_charlie/charlie_dev/results/G1_Normal/STAR2p/G1_Normal_p2.bam \
-ss \
--annotation /vf/users/Ziegelbauer_lab/circRNADetection/sovacoolkl_charlie/charlie_dev/ref/ref.fixed.gtf \
--chrM -G --rep_file /data/CCBR_Pipeliner/db/PipeDB/charlie/fastas_gtfs/hg38.repeats.gtf \
--refseq /vf/users/Ziegelbauer_lab/circRNADetection/sovacoolkl_charlie/charlie_dev/ref/ref.fa \
--PE-independent \
-mt1 @/vf/users/Ziegelbauer_lab/circRNADetection/sovacoolkl_charlie/charlie_dev/results/G1_Normal/DCC/mate1.txt \
-mt2 @/vf/users/Ziegelbauer_lab/circRNADetection/sovacoolkl_charlie/charlie_dev/results/G1_Normal/DCC/mate2.txt
[W::hts_idx_load3] The index file is older than the data file: /vf/users/Ziegelbauer_lab/circRNADetection/sovacoolkl_charlie/charlie_dev/results/G1_Normal/STAR2p/G1_Normal_p2.bam.csi
Traceback (most recent call last):
File "/usr/local/bin/DCC", line 11, in <module>
load_entry_point('DCC==0.5.0', 'console_scripts', 'DCC')()
File "/usr/local/lib/python3.8/dist-packages/DCC-0.5.0-py3.8.egg/DCC/main.py", line 490, in main
File "/usr/local/lib/python3.8/dist-packages/DCC-0.5.0-py3.8.egg/DCC/main.py", line 679, in findCircSkipJunction
File "/usr/local/lib/python3.8/dist-packages/DCC-0.5.0-py3.8.egg/DCC/Circ_nonCirc_Exon_Match.py", line 281, in findcircAdjacent
File "/usr/local/lib/python3.8/dist-packages/DCC-0.5.0-py3.8.egg/DCC/Circ_nonCirc_Exon_Match.py", line 222, in getAdjacent
ValueError: invalid literal for int() with base 10: '3"'
[Tue Apr 30 00:44:26 2024]
Error in rule dcc:
jobid: 0
input: /vf/users/Ziegelbauer_lab/circRNADetection/sovacoolkl_charlie/charlie_dev/results/G1_Normal/DCC/samplesheet.txt, /vf/users/Ziegelbauer_lab/circRNADetection/sovacoolkl_charlie/charlie_dev/results/G1_Normal/DCC/mate1.txt, /vf/users/Ziegelbauer_lab/circRNADetection/sovacoolkl_charlie/charlie_dev/results/G1_Normal/DCC/mate2.txt, /vf/users/Ziegelbauer_lab/circRNADetection/sovacoolkl_charlie/charlie_dev/results/G1_Normal/STAR2p/G1_Normal_p2.bam, /vf/users/Ziegelbauer_lab/circRNADetection/sovacoolkl_charlie/charlie_dev/ref/ref.fixed.gtf
output: /vf/users/Ziegelbauer_lab/circRNADetection/sovacoolkl_charlie/charlie_dev/results/G1_Normal/DCC/CircRNACount, /vf/users/Ziegelbauer_lab/circRNADetection/sovacoolkl_charlie/charlie_dev/results/G1_Normal/DCC/CircCoordinates, /vf/users/Ziegelbauer_lab/circRNADetection/sovacoolkl_charlie/charlie_dev/results/G1_Normal/DCC/LinearCount, /vf/users/Ziegelbauer_lab/circRNADetection/sovacoolkl_charlie/charlie_dev/results/G1_Normal/DCC/G1_Normal.dcc.counts_table.tsv, /vf/users/Ziegelbauer_lab/circRNADetection/sovacoolkl_charlie/charlie_dev/results/G1_Normal/DCC/G1_Normal.dcc.counts_table.tsv.filtered
shell:
This worked with the previous charlie version.
Checking for differences in input files for this rule between the two runs:
The bam files: samtools stats summaries are identical
samtools stats charlie_v0.10.1/results/G1_Tumor/STAR2p/G1_Tumor_p2.bam > G1_Tumor_p2.bam.stat.old
samtools stats charlie_dev/results/G1_Tumor/STAR2p/G1_Tumor_p2.bam > G1_Tumor_p2.bam.stat.new
diff G1_Tumor_p2.bam.stat.*
The gtf files are identical
md5sum charlie_dev/ref/ref.fixed.gtf charlie_v0.10.1/ref/ref.fixed.gtf
The chimera files are all equal
library(tidyverse)
# paired paths: dev (containerized) run vs. v0.10.1 release run
files <- tibble(
  dev = c('charlie_dev/results/G1_Tumor/STAR1p/G1_Tumor_p1.Chimeric.out.junction',
          'charlie_dev/results/G1_Tumor/STAR1p/mate1/G1_Tumor_mate1.Chimeric.out.junction',
          'charlie_dev/results/G1_Tumor/STAR1p/mate2/G1_Tumor_mate2.Chimeric.out.junction'),
  rel = c('charlie_v0.10.1/results/G1_Tumor/STAR1p/G1_Tumor_p1.Chimeric.out.junction',
          'charlie_v0.10.1/results/G1_Tumor/STAR1p/mate1/G1_Tumor_mate1.Chimeric.out.junction',
          'charlie_v0.10.1/results/G1_Tumor/STAR1p/mate2/G1_Tumor_mate2.Chimeric.out.junction')
)
# compare each pair of junction tables
files %>% pmap(\(dev, rel) all_equal(read_tsv(dev), read_tsv(rel)))
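The checks above compare the same relative path in the two run directories. A minimal sketch of that pattern as a reusable shell function follows; `compare_runs` is a hypothetical helper name, not part of CHARLIE. It uses a byte-wise `cmp`, which suits text inputs such as the GTF and junction files. BAM files can differ byte-wise (for example in header lines) even when the alignments match, so a summary comparison like the samtools one above is probably the better check for them.

```shell
#!/usr/bin/env bash
# compare_runs OLD_DIR NEW_DIR REL_PATH...
# Hypothetical helper: reports whether each relative path is byte-identical
# in the two run directories. Returns non-zero if any file differs or is missing.
compare_runs() {
  local old_dir=$1 new_dir=$2
  shift 2
  local rel status=0
  for rel in "$@"; do
    if cmp -s "$old_dir/$rel" "$new_dir/$rel"; then
      echo "identical: $rel"
    else
      echo "DIFFERENT or missing: $rel"
      status=1
    fi
  done
  return "$status"
}
```

With the directory names used in this thread, a call might look like `compare_runs charlie_v0.10.1 charlie_dev ref/ref.fixed.gtf results/G1_Tumor/STAR1p/G1_Tumor_p1.Chimeric.out.junction`.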
Checking DCC & python version in conda env vs Docker
release version used conda env: CHARLIE/workflow/rules/findcircrna.smk Lines 722 to 723 in e19cd66
now using docker: Lines 12 to 16 in fbdb664
Both use v0.5.0. According to the release notes, DCC 0.5.0 requires python 3.5 and no longer supports python 2.7. I tried having the docker container install DCC via conda, but the rule still failed with the same error.
still failing...
After rebuilding the docker container to install DCC 0.5.0 from conda, the rule still fails with the same error as before:
On further inspection, it looks like the DCC conda env on biowulf was built with python 2.7: |
errors on FRCE:
Will need to edit |
Looks like the DCC devs are aware of the issue and fixed it in the master branch -- https://www.github.com/dieterich-lab/DCC/issues/103
Edited the docker container to use the dev version. It worked! |
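The literal `3"` in the `ValueError` looks like a numeric GTF attribute with a stray quote still attached. This thread does not say which attribute DCC was parsing, so the sketch below assumes it is `exon_number` (an assumption, not confirmed here). It only counts how many records quote that value and how many leave it bare, as a quick way to inspect an annotation such as `charlie_dev/ref/ref.fixed.gtf`. The function name `count_exon_number_styles` is hypothetical.

```shell
#!/usr/bin/env bash
# count_exon_number_styles GTF
# Hypothetical diagnostic: counts GTF records whose exon_number attribute value
# is quoted (exon_number "3";) versus bare (exon_number 3;).
# Assumption: exon_number is the attribute involved; the thread does not confirm this.
count_exon_number_styles() {
  awk -F'\t' '
    /^#/ { next }                          # skip header/comment lines
    match($9, /exon_number [^;]+/) {       # column 9 holds the attributes
      # drop the 12-character prefix "exon_number " to keep only the value
      value = substr($9, RSTART + 12, RLENGTH - 12)
      if (value ~ /"/) quoted++; else bare++
    }
    END { printf "quoted=%d bare=%d\n", quoted, bare }
  ' "$1"
}
```

A result with only quoted or only bare values does not prove or rule out the cause. It just shows which style a parser has to handle for that file.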
First run-through on biowulf completed successfully after several bug fixes. Re-run from start to finish completed successfully on biowulf. Test in progress on frce. |
more problems on FRCE:
need to reduce threads for FRCE, but I can't find how many are available per node on the just switched jobs that requested
edit: found the FRCE hardware config here: https://ncifrederick.cancer.gov/staff/frce/documentation/frce-hardware-capabilities |
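One way to keep a thread request within what a node offers is to clamp it before writing it into the cluster config. This is a minimal sketch only: `clamp_threads` is a hypothetical helper, and the numbers in the usage note are illustrative, not FRCE's actual limits. When no limit is passed, it falls back to `nproc`, which reports the CPUs available to the current process.

```shell
#!/usr/bin/env bash
# clamp_threads REQUESTED [AVAILABLE]
# Hypothetical helper: prints REQUESTED, capped at AVAILABLE.
# AVAILABLE defaults to the CPU count reported by nproc on the current machine.
clamp_threads() {
  local requested=$1
  local available=${2:-$(nproc)}
  if [ "$requested" -gt "$available" ]; then
    echo "$available"
  else
    echo "$requested"
  fi
}
```

For example, `clamp_threads 56 32` prints `32`, while `clamp_threads 4 32` prints `4`.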
Currently running on FRCE with improved handling of config & cluster templates |
error on FRCE:
even though the file does exist 🤔
file /mnt/projects/CCBR-Pipelines/db/charlie/fastas_gtfs/hg38.fa
is
Edit: this seems to be a FRCE regression -- tried to submit a RENEE job and that failed for the same reason
Submitted a help ticket |
Upgraded snakemake in the shared conda env on FRCE to v7:
conda activate /mnt/projects/CCBR-Pipelines/conda/envs/snakemake
mamba install -c bioconda snakemake=7.32.4 |
on FRCE, |
development in progress here:
/data/CCBR_Pipeliner/Pipelines/CHARLIE/charlie-dev-sovacool