Trouble with using 'circ_quant' function (CLEAR with STAR Alignment) #23

jennynuyirs · 2024-06-16T04:46:16Z

Hello! I am having some trouble getting the circ_quant function to work. My code is as follows:

circ_quant -c "$name/circRNA_out/circularRNA_known.txt" -b "$name/Aligned.sortedByCoord.out.bam" -r "$ref_genome.ref.txt" -o "$name.circRNA_quant.txt"

It produces the error AttributeError: 'list' object has no attribute 'split' (line 83 of circ_quant.py). My guess is that something derived from the BAM input reaches that line as a list rather than a string, so the split call fails. I'm skeptical that this is actually the cause, though, because fixing it would require changing the source code (probably not a good idea).

I am fairly new to bioinformatics and only somewhat experienced with coding, so I'm unsure how to proceed from here. Any potential solutions or suggestions for debugging would be immensely helpful.
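One generic first step, sketched here under the assumption that the failure might depend on the inputs rather than on the installation, is to confirm that every file passed to circ_quant exists and is non-empty before the call. The helper name check_inputs is hypothetical and not part of CLEAR; it only uses standard shell tests and says nothing about what line 83 of circ_quant.py actually does.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight helper (not part of CLEAR).
# Returns non-zero if any argument is a missing or empty file.
check_inputs() {
    local status=0 f
    for f in "$@"; do
        if [ ! -s "$f" ]; then          # -s: file exists and has size > 0
            echo "missing or empty: $f" >&2
            status=1
        fi
    done
    return "$status"
}

# Example use inside the pipeline loop (paths taken from the command above):
#   check_inputs "$name/circRNA_out/circularRNA_known.txt" \
#                "$name/Aligned.sortedByCoord.out.bam" \
#                "$ref_genome.ref.txt" || continue
```

If all three inputs pass, running circ_quant once without the log redirection prints the full traceback to the terminal, which shows which call on line 83 receives the list.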

I've included the full pipeline below, which is a slightly modified version of @bounlu's CLEAR with STAR Alignment pipeline. I've tested each step separately, and all of them work as they should except the final circ_quant step.

# define parameters
file_extension="_R1_001.fastq.gz"
read_length=100
ref_genome="hg38"

# make output directories
mkdir -p "STAR_$ref_genome/$read_length"

# download reference files
fetch_ucsc.py "$ref_genome" fa "$ref_genome.fa"
fetch_ucsc.py "$ref_genome" ref "$ref_genome.ref.txt"
cut -f2-11 "$ref_genome.ref.txt" | genePredToGtf file stdin "$ref_genome.ref.gtf"

# generate genome index file
STAR --runMode genomeGenerate --genomeDir "STAR_$ref_genome/$read_length" --limitIObufferSize 1000000000 --runThreadN 16 --genomeFastaFiles "$ref_genome.fa" --outFileNamePrefix ./ --sjdbGTFfile "$ref_genome.ref.gtf" --sjdbOverhang "$(($read_length-1))"

# run pipeline
for read1 in *"$file_extension"
do
        name="${read1%"$file_extension"}"
        read2="${name}_R2_001.fastq.gz"
        mkdir -p "$name"
        STAR --chimSegmentMin 20 --runThreadN 16 --genomeLoad LoadAndRemove --limitBAMsortRAM 50000000000 --limitIObufferSize 1000000000 --outSAMtype BAM SortedByCoordinate --readFilesCommand zcat --outFileNamePrefix "$name/" --genomeDir "STAR_$ref_genome/$read_length" --readFilesIn "$read1" "$read2" > "$name/$name.circRNA_alignment.log" 2>&1
        samtools index "$name/Aligned.sortedByCoord.out.bam"
        fast_circ.py parse -r "$ref_genome.ref.txt" -g "$ref_genome.fa" -t STAR -o "$name/circRNA_out" "$name/Chimeric.out.junction" > "$name/$name.circRNA_parse.log" 2>&1
        circ_quant -c "$name/circRNA_out/circularRNA_known.txt" -b "$name/Aligned.sortedByCoord.out.bam" -r "$ref_genome.ref.txt" -o "$name.circRNA_quant.txt" > "$name/$name.circRNA_quant.log" 2>&1
done
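
The sample-name logic in the loop above can be checked on its own before any alignment runs. The sketch below is a dry run under stated assumptions: list_pairs is a hypothetical helper, it iterates over a glob instead of parsing ls output (which splits file names containing spaces), and it skips any R1 file whose R2 mate is missing. It prints one tab-separated line per sample and runs no tools.

```shell
#!/usr/bin/env bash
file_extension="_R1_001.fastq.gz"

# Hypothetical helper: print "name<TAB>read1<TAB>read2" for each complete pair
# found in the given directory (default: current directory).
list_pairs() {
    local dir="${1:-.}" read1 stem name read2
    for read1 in "$dir"/*"$file_extension"; do
        [ -e "$read1" ] || continue             # glob matched nothing
        stem="${read1%"$file_extension"}"       # strip the R1 suffix
        name="$(basename "$stem")"
        read2="${stem}_R2_001.fastq.gz"
        if [ ! -f "$read2" ]; then
            echo "skipping $name: mate file not found" >&2
            continue
        fi
        printf '%s\t%s\t%s\n' "$name" "$read1" "$read2"
    done
}

# Example: list_pairs .    # dry run over the current directory
```

Running this before the real loop confirms that each sample name and its pair of FASTQ files are what the later STAR, fast_circ.py and circ_quant calls will receive.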
