Problem interpreting table #217
Comments
Hello @MarioRinBarr, Sorry for the delayed response. Let me see if I can take on your questions one at a time.
This is the correct interpretation. One note: if you want to avoid getting these records in the future, you can use
This is somewhat difficult to answer. One interpretation is that this is a systematic basecalling error. In some contexts, basecalling errors bias the base modification calls; for example, miscalled bases are sometimes more likely to be called as modified. But I wouldn't consider 16% modification a strong indicator of this error mode. If you think it's possible that there is a sub-population of molecules that do have an
If you still see high levels of m6A, then this is probably a decent estimate of the actual frequency in the sample.
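To get a feel for how precise such a frequency estimate is at a given coverage, here is a minimal sketch that puts a Wilson score interval around a modified fraction. The counts are illustrative assumptions: 856 is the Nvalid_cov value quoted in the question for site 9654, and 137 is only an approximation of "over 16%" of that, not a number read from the table. The function name `wilson_interval` is hypothetical and not part of any tool discussed here.

```python
import math


def wilson_interval(n_mod, n_valid, z=1.96):
    """Wilson score interval for the fraction n_mod / n_valid.

    n_mod   -- number of calls classified as modified (assumed input)
    n_valid -- number of valid calls at the site (e.g. Nvalid_cov)
    z       -- normal quantile; 1.96 corresponds to roughly 95% coverage
    """
    if n_valid <= 0:
        raise ValueError("n_valid must be positive")
    if not 0 <= n_mod <= n_valid:
        raise ValueError("n_mod must lie between 0 and n_valid")
    p = n_mod / n_valid
    z2 = z * z
    denom = 1 + z2 / n_valid
    center = (p + z2 / (2 * n_valid)) / denom
    half = z * math.sqrt(p * (1 - p) / n_valid + z2 / (4 * n_valid ** 2)) / denom
    return center - half, center + half


# Illustrative counts only: 137 approximates "over 16%" of 856 valid calls.
low, high = wilson_interval(137, 856)
print(f"modified fraction: {137 / 856:.3f}, interval: {low:.3f} to {high:.3f}")
```

An interval like this only reflects sampling noise at the site. It says nothing about systematic basecalling errors, which would shift the estimate itself.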
Are you using the DRACH m6A model or the all-context m6A model? The simplest explanation I have is that you're using the DRACH-motif model and the reference sequence at that position is not DRACH; however, 0.03% of the sequences have an error or mutation that changes it to DRACH, and those then get a modification call. Could you confirm which model you're using? Happy to continue to answer questions (and more quickly next time).
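To check whether a given reference position sits in a DRACH context, a small sketch like the following may help. It uses the standard IUPAC codes (D = A/G/T, R = A/G, H = A/C/T) with the candidate A in the middle of the 5-mer. The function names `is_drach` and `drach_at` and the 0-based position convention are assumptions for illustration; they are not part of modkit or of the basecaller.

```python
# Allowed bases at each position of the DRACH motif (IUPAC codes):
# D = A/G/T, R = A/G, A, C, H = A/C/T
DRACH = (
    {"A", "G", "T"},
    {"A", "G"},
    {"A"},
    {"C"},
    {"A", "C", "T"},
)


def is_drach(kmer):
    """Return True if the 5-mer matches DRACH (U is treated as T)."""
    kmer = kmer.upper().replace("U", "T")
    return len(kmer) == 5 and all(
        base in allowed for base, allowed in zip(kmer, DRACH)
    )


def drach_at(seq, pos):
    """Return True if the base at 0-based index pos is the A of a DRACH 5-mer.

    The window covers two bases upstream and two bases downstream of pos;
    positions too close to either end of seq cannot match.
    """
    if pos < 2 or pos + 3 > len(seq):
        return False
    return is_drach(seq[pos - 2 : pos + 3])


# Hypothetical sequence, not taken from the attached table.
example = "TTGGACTTT"
print(drach_at(example, 4))
```

If the reference 5-mer around a site fails this check, only reads carrying an error or mutation that creates a DRACH context would be eligible for a call under a DRACH-only model.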
Thank you for your answers. Thank you
Hello @MarioRinBarr,
The |
Art, just to clarify, how does this column interact with thresholding in |
No, if you pass |
Hello,
I am writing because I am having trouble understanding my results. I performed RNA sequencing with RNA004 and want to know which sites are m6A. In the table I get, which I attach to this message, some sites are treated as A even though the consensus sequence is not A at that position. For example, site 9942 has 3 As, of which one would be methylated. With such low numbers, I understand that this is simply a sequencing error, a point mutation, a basecalling error... in any case, something that should not be taken into account.
The problem is with sites like 9654, where neither the consensus sequence nor the original sequence has an A. However, Nvalid_cov says that there are 856 sequences with an A, about 10%. This could be due to a point mutation, since this is a virus. What is surprising is that over 16% of the As at this site are methylated. It does not make much sense to me that a site with a supposed point mutation would show so much methylation.
I also do not understand why some bases have such high Nnocall values, such as 9853, where the coverage is more than 10000 sequences and fewer than 0.03% of the bases fall outside this column. From what you explain on the page, this may be due to point mutations that prevent basecalling. However, the original sequence and the consensus sequence are the same here, an A, so this value does not make sense to me.
doubts.xlsx