Number of barcodes doesn't match with cellranger output. #23

NKleinenkuhnen · 2020-01-10T20:33:33Z

Hey,

Thank you very much for your awesome tool! I recently ran into a problem while trying to create a snap file from the output of cellranger. So far I have tried two entry points: 1. the position-sorted bam file 2. the fragment tsv file.
However, with both approaches I ended up with far more barcodes in my snap file than the 10x result report lists. In scenario 1 I get 40k barcodes and in scenario 2 20k. According to the 10x summary, the dataset should contain 8199 cells.
I followed your excellent step-by-step tutorial (https://github.com/r3fang/SnapATAC/wiki/FAQs#10X_snap) and just copied the commands and changed the filenames. I worked with Python 3.7 and the latest version of SnapTools on my Mac.
Importing the snap file into R and processing it works like a charm but I couldn't solve the barcode issue myself.
I should note that in scenario 1 I had two samples, which I processed separately with the same commands and then merged via createSnap. I hope I have provided enough information. If you need more, just let me know.
Thanks in advance!
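
One way to check whether the surplus barcodes are simply barcodes that were not called as cells is to count the distinct barcodes in the fragment file and compare them with the barcodes flagged as cells in the per-barcode CSV. The following is only a sketch: the column names `barcode` and `is__cell_barcode` are assumptions about the CSV layout, and the function names and the tiny in-memory sample data are made up for illustration.

```python
import csv
import io


def fragment_barcodes(fragments_handle):
    """Collect the distinct barcodes from a fragment file (4th tab-separated column)."""
    barcodes = set()
    for line in fragments_handle:
        # Skip blank lines and comment/header lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        barcodes.add(fields[3])
    return barcodes


def called_cells(singlecell_handle, flag_column="is__cell_barcode"):
    """Return the barcodes flagged as cells in a per-barcode CSV.

    The column names "barcode" and "is__cell_barcode" are assumptions;
    adjust them to match the header of your own file.
    """
    reader = csv.DictReader(singlecell_handle)
    return {row["barcode"] for row in reader if row[flag_column] == "1"}


# Made-up stand-ins for the real files (which would be opened with
# open(...) or gzip.open(..., "rt") instead).
fragments = io.StringIO(
    "chr1\t100\t200\tAAAC-1\t2\n"
    "chr1\t300\t400\tAAAG-1\t1\n"
    "chr2\t500\t600\tAAAC-1\t1\n"
    "chr2\t700\t800\tAAAT-1\t1\n"
)
singlecell = io.StringIO(
    "barcode,is__cell_barcode\n"
    "AAAC-1,1\n"
    "AAAG-1,0\n"
    "AAAT-1,0\n"
)

all_barcodes = fragment_barcodes(fragments)
cells = called_cells(singlecell)
not_cells = all_barcodes - cells

print("distinct barcodes in fragments:", len(all_barcodes))
print("barcodes called as cells:", len(cells))
print("barcodes present but not called as cells:", sorted(not_cells))
```

If the snap file's barcode count is close to the first number rather than the second, the difference may come from barcodes that were never called as cells, not from a parsing problem.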

mej54 · 2020-01-27T16:02:39Z

Hi there,

I've also noticed a difference between the barcodes based on CellRanger output. I've been using snap-pre with possorted_bam.bam from CellRanger to create snap files as outlined:

cat <( cat $DIR/$SAMPLE.header.sam ) \
<( samtools view $BAM | awk '{for (i=12; i<=NF; ++i) { if ($i ~ "^CB:Z:"){ td[substr($i,1,2)] = substr($i,6,length($i)-5); } }; printf "%s:%s\n", td["CB"], $0 }' ) \
| samtools view -bS - > $DIR/$SAMPLE.snap.bam

samtools sort -n -@ 10 -m 1G $DIR/$SAMPLE.snap.bam -o $DIR/$SAMPLE.snap.nsrt.bam

When I was looking into the promoter ratio using the single_cell.csv files, I noticed that there were barcodes in the snap files that were not in the single_cell.csv files (which I believe should contain all fragments). I looked into this further, and I'm wondering whether at some point snap-pre takes the barcode from the raw "CR:Z" tag in the bam instead of the error-corrected "CB:Z" tag. When I search for the snap-file barcodes that weren't found in the single_cell.csv file, they match the barcodes under CR:Z, not CB:Z (even though the CB:Z barcode was added to the read name as outlined above).

Thanks!
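
To make the CR versus CB comparison reproducible, here is a minimal sketch that reads SAM-formatted text lines (for example the output of `samtools view`), collects the values of both tags, and reports which tag each barcode from the snap file matches. The function names and the sample records are hypothetical, and the "-1" suffix on the CB values in the sample data is an assumption about how the corrected barcodes are written.

```python
def barcode_tags(sam_line):
    """Return the CR and CB tag values of one SAM record as a dict."""
    fields = sam_line.rstrip("\n").split("\t")
    tags = {}
    # Optional tags start after the 11 mandatory SAM columns.
    for field in fields[11:]:
        tag, _type, value = field.split(":", 2)
        if tag in ("CR", "CB"):
            tags[tag] = value
    return tags


def classify_barcodes(snap_barcodes, sam_lines):
    """Label each barcode as matching 'CB', 'CR', or 'neither'."""
    raw, corrected = set(), set()
    for line in sam_lines:
        if line.startswith("@"):  # skip SAM header lines
            continue
        tags = barcode_tags(line)
        if "CR" in tags:
            raw.add(tags["CR"])
        if "CB" in tags:
            corrected.add(tags["CB"])
    labels = {}
    for barcode in snap_barcodes:
        if barcode in corrected:
            labels[barcode] = "CB"
        elif barcode in raw:
            labels[barcode] = "CR"
        else:
            labels[barcode] = "neither"
    return labels


# Made-up records: the second read carries only a raw barcode (no CB tag).
records = [
    "@HD\tVN:1.6\tSO:coordinate",
    "r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tFFFF\tCR:Z:AAAC\tCB:Z:AAAC-1",
    "r2\t0\tchr1\t200\t60\t4M\t*\t0\t0\tACGT\tFFFF\tCR:Z:AATC",
]

labels = classify_barcodes(["AAAC-1", "AATC", "GGGG"], records)
for barcode, label in labels.items():
    print(barcode, label)
```

A large share of "CR" labels among the barcodes missing from the CSV would support the suspicion above; a large share of "neither" would point somewhere else.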
