Barcode Sequencing Errors #12

Open
Austin-s-h opened this issue May 23, 2019 · 2 comments

Comments

@Austin-s-h

Hello again, I was wondering if you had a more elegant way of handling barcode sequencing errors. Currently, my snap object reports 85,228 unique barcodes, but I know the true number of barcodes is 670.
Since I thought this was the result of sequencing errors in the barcodes, I wrote a custom Python script that adds the barcode to the front of each read header (because I'm starting from demultiplexed FASTQ files) and renames any invalid barcode (one with more than 1 bp of mismatch) to INVALID.
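For illustration, the correction step described above could be sketched roughly as follows. This is not the original script: the function names, the `:` separator between barcode and read name, and the choice to mark a barcode INVALID when it is within one mismatch of more than one known barcode are all assumptions.

```python
from typing import Iterable


def hamming(a: str, b: str) -> int:
    """Count mismatching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def correct_barcode(observed: str, whitelist: Iterable[str], max_mismatch: int = 1) -> str:
    """Map an observed barcode to a known one, or return 'INVALID'.

    Assumption: an observed barcode that is within max_mismatch of
    several known barcodes is ambiguous and is also marked INVALID.
    """
    known = set(whitelist)
    if observed in known:
        return observed
    hits = [
        bc for bc in known
        if len(bc) == len(observed) and hamming(observed, bc) <= max_mismatch
    ]
    return hits[0] if len(hits) == 1 else "INVALID"


def tag_header(header: str, barcode: str) -> str:
    """Prepend the barcode to a FASTQ header line.

    Assumption: the barcode and the read name are joined with ':'.
    """
    if not header.startswith("@"):
        raise ValueError("FASTQ header must start with '@'")
    return "@" + barcode + ":" + header[1:]
```

Called on each read, `tag_header(header, correct_barcode(observed, whitelist))` would produce the renamed header; with 670 known barcodes the linear scan over the whitelist is slow but workable, and a precomputed lookup of all one-mismatch variants would be faster.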
When I perform simple filtering in SnapATAC (R) with a minimum UMI count of 500 and a mit.ratio of <0.3, I lose almost all of these invalid barcodes (since there is a low chance of a sequencing-error barcode having 500+ unique reads) and end up with ~1500 barcodes.
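As a rough illustration of that filtering logic outside of R, the same two cutoffs can be applied to a per-barcode summary table. This is plain Python and not the SnapATAC API; the `filter_barcodes` name and the `(umi_count, mito_ratio)` input layout are hypothetical, and dropping the INVALID label explicitly is an added assumption.

```python
from typing import Dict, List, Tuple


def filter_barcodes(
    stats: Dict[str, Tuple[int, float]],
    min_umi: int = 500,
    max_mito_ratio: float = 0.3,
) -> List[str]:
    """Return barcodes that pass both cutoffs, sorted alphabetically.

    stats maps barcode -> (umi_count, mito_ratio); this layout is
    hypothetical and chosen only for the example.
    """
    return sorted(
        bc
        for bc, (umi, mito) in stats.items()
        if bc != "INVALID" and umi >= min_umi and mito < max_mito_ratio
    )
```

Raising `min_umi` is the knob that removes low-coverage barcodes, which is where most sequencing-error barcodes are expected to fall.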
Is there a better way to handle these errors? Should I perform more stringent filtering? Here is what my current distribution looks like.

image

Thanks!

@Austin-s-h
Author

Actually, I followed the tutorial a little further and found that the UMI cutoff should be much higher. With the higher cutoff I get 580 barcodes, which is exactly in the range I expected. Thanks!

@r3fang
Owner

r3fang commented May 23, 2019 via email
