Analysis of MiSeq and iSeq fastq files using DADA2 #1083
Comments
Wow, awesome set of analyses and GitHub repository, thanks for that work! Based on what you show there, I think it confirms what I expected, which is that DADA2 will largely work OK with iSeq-type quality scores. That said, there is some concern that denoising error rates might be moderately higher in iSeq data; in particular, there might be a higher number of false-positive rare ASVs. This is for two reasons: first, the binned quality scores carry less information, which makes accurate denoising more difficult; second, DADA2's error model fitting procedure was built for "normal" MiSeq quality score distributions and can be non-ideal for binned quality scores. This has been discussed before, and there is quite a bit of useful information in some other threads on this issue: #791. I do think two additional simple diagnostics could be useful, what does the output of
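To make the "less information" point concrete, here is a minimal sketch that converts the three iSeq Q-scores mentioned in this thread into nominal error probabilities using the standard Phred definition, P = 10^(-Q/10). The function name is only illustrative; it is not part of DADA2.

```python
def phred_to_error_prob(q):
    """Nominal per-base error probability for a Phred quality score q."""
    return 10 ** (-q / 10)

# The three Q-scores reported for iSeq fastq files in this thread.
for q in (11, 25, 37):
    print(f"Q{q}: P(error) = {phred_to_error_prob(q):.5f}")
```

With only three bins, every base is assigned one of three nominal error probabilities (roughly 7.9%, 0.32%, and 0.02%), whereas unbinned scores distinguish many intermediate levels.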
Thank you so much for your reply. The outputs of plotErrors are as follows (these are also available at "03_SeqAnalysisDADA2_xxxOut" in the repository): [Figure: MiSeq error plot] [Figure: iSeq error plot] As in #791, the estimated error rates decrease sharply at around Q=30-35. I have also checked histograms of ASV read counts and of ASV relative abundance. There is no big difference between the MiSeq and (simulated) iSeq data. Slightly more rare ASVs are found in the MiSeq data in terms of relative abundance (bottom panel), but this is probably because the MiSeq data yielded greater read counts for relatively abundant taxa (top panel). Analyzing iSeq data with DADA2 looks fine, at least when we are interested in obtaining a general overview of microbial communities.
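The reasoning about rare ASVs can be sketched with made-up numbers: if abundant taxa receive more reads while a rare ASV keeps the same read count, its relative abundance drops and it can fall below a rarity cutoff. The read counts and the 0.1% threshold below are hypothetical and are not taken from the repository.

```python
def relative_abundance(read_counts):
    """Convert a dict of ASV read counts to relative abundances."""
    total = sum(read_counts.values())
    if total == 0:
        raise ValueError("no reads in sample")
    return {asv: n / total for asv, n in read_counts.items()}

def count_rare(read_counts, threshold=0.001):
    """Number of ASVs whose relative abundance is below threshold."""
    rel = relative_abundance(read_counts)
    return sum(1 for value in rel.values() if value < threshold)

# Hypothetical samples: ASV3 has 5 reads in both, but the abundant
# taxa have ten times more reads in the second sample.
fewer_reads = {"ASV1": 950, "ASV2": 45, "ASV3": 5}
more_reads = {"ASV1": 9500, "ASV2": 450, "ASV3": 5}

print(count_rare(fewer_reads), count_rare(more_reads))
```

Here ASV3 counts as rare only in the second sample, even though its absolute read count is unchanged, which is why a relative-abundance histogram alone can suggest more rare ASVs.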
Yeah, given the analyses you've shown here, I feel pretty good about that conclusion as well.
Hi DADA2 developers,
I have been using MiSeq so far, but my group recently bought an iSeq and we are trying to analyze iSeq sequence data with DADA2. iSeq generates basically the same outputs as MiSeq does, but I found that the quality scores (Q-scores) are very different. A MiSeq fastq file contains Q-scores from 0 to 39, but an iSeq fastq file contains only three Q-scores (11, 25, 37).
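One quick way to check which Q-scores a fastq file contains is to tabulate the decoded quality characters. The following is a minimal sketch, assuming uncompressed four-line fastq records with Phred+33 encoding; the function name and the tiny in-memory record are made up for illustration.

```python
import io
from collections import Counter

def qscore_counts(handle, offset=33):
    """Count Phred quality scores across all reads in a fastq stream.

    Assumes each record spans exactly four lines (no wrapped sequences).
    """
    counts = Counter()
    for i, line in enumerate(handle):
        if i % 4 == 3:  # the fourth line of each record is the quality string
            counts.update(ord(ch) - offset for ch in line.rstrip("\r\n"))
    return counts

# Made-up record whose quality characters decode to 37, 25, 11, 37.
example = io.StringIO("@read1\nACGT\n+\nF:,F\n")
print(sorted(qscore_counts(example).items()))
```

For a real file, the same function could be called on an open text handle, for example one returned by `open(path)` or by `gzip.open(path, "rt")` for compressed files.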
DADA2 can run with the iSeq fastq files, but I am wondering whether analyzing iSeq data using DADA2 is appropriate or not. To briefly examine the effects of the different Q-scores, I have performed several analyses using my own sequence data (the scripts and results are a bit long, so I posted them in my GitHub repository: https://github.com/ong8181/random-scripts/tree/master/04_MiSeq_vs_iSeq_DADA2)
The general procedure of my test is as follows:
I guess that, based on the results of my analysis and the algorithm of DADA2, analyzing iSeq data should be fine, but I would be glad if you could give me your thoughts on this issue.
Best regards,
Ushio