
set minLen during filterAndTrim #30

Open
taramclancy opened this issue Mar 26, 2019 · 3 comments

Comments

@taramclancy (Contributor)

When I ran my data through the pipeline, some reads made it through even though they were quite short (as low as 145 bp, which was my truncLen on the forward read).

The default minLen is 20, which seems far too low. I'm thinking something like +/- 2 bp (maybe up to 5 bp) around the expected merged length might be better, given that we don't expect much length variation (and for most of the reads we don't actually see much).
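For reference, dada2's filterAndTrim() does expose a minLen argument, enforced after trimming and truncation. A minimal sketch of setting it alongside truncLen; the file paths and the specific length values below are hypothetical and would need to match the run:

```r
library(dada2)

# Hypothetical paths and parameters; adjust to the actual run.
filterAndTrim(
  fwd = "run2_R1.fastq.gz", filt = "filtered/run2_R1_filt.fastq.gz",
  rev = "run2_R2.fastq.gz", filt.rev = "filtered/run2_R2_filt.fastq.gz",
  truncLen = c(145, 145),  # per-read truncation lengths
  minLen   = 145,          # discard reads shorter than this AFTER truncation
  maxN = 0, maxEE = c(2, 2),
  compress = TRUE, multithread = TRUE
)
```

Note that minLen applies to each read individually before merging, so on its own it cannot enforce a window around the expected merged length.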

@taramclancy (Contributor, Author)

I just realized that minLen "is enforced after trimming and truncation" but NOT after merging... so I think we probably need to set a minimum length on the sequence lengths in the seqtab instead?
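One way to do that, since dada2 stores the sequences as the column names of the sequence table, is to subset the table by sequence length (the 250:256 window here is hypothetical; it would be the expected merged-length range for this primer set):

```r
# seqtab is a dada2 sequence table (samples x sequences).
# Inspect the length distribution first:
table(nchar(colnames(seqtab)))

# Keep only ESVs within the expected merged-length window (hypothetical range).
seqtab_lenfilt <- seqtab[, nchar(colnames(seqtab)) %in% seq(250, 256), drop = FALSE]
```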

@taramclancy (Contributor, Author)

This might be a helpful way to trim the final lengths:
colnames(seqtab_showerhead_run2_nodust_240) <- substr(colnames(seqtab_showerhead_run2_nodust_240), 1, 240)
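One caveat (my assumption, worth checking): truncating the sequences with substr() can leave duplicate column names in the table, since two ESVs that differed only past position 240 become identical. Something like dada2's collapseNoMismatch() could merge those afterwards:

```r
library(dada2)

# After truncating the sequence (column) names to a fixed length,
# identical sequences can appear more than once; collapse their counts.
seqtab_trunc <- seqtab_showerhead_run2_nodust_240
colnames(seqtab_trunc) <- substr(colnames(seqtab_trunc), 1, 240)
seqtab_trunc <- collapseNoMismatch(seqtab_trunc)
```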

@hhollandmoritz (Collaborator)

This is something we should discuss and decide how we want to approach it. The old pipeline didn't bother trimming things, but the output did keep track of the length. Below is some code to figure out statistics about the length of each ESV and plot a histogram.

library(ggplot2)  # qplot() comes from ggplot2

mean(nchar(colnames(seqtab)))              # mean ESV length
qplot(nchar(colnames(seqtab)), bins = 50)  # histogram of ESV lengths
