Can dorado mistakenly identify m6A as a result of other methylation on A? #951

Closed
Salvobioinfo opened this issue Jul 18, 2024 · 3 comments
Labels
mods For issues related to modified base calling question Issue is a question

Comments

@Salvobioinfo

Hello,
I have KO samples for an enzyme responsible for m6A modification, and I have identified many m6A sites in these KO samples. I am currently analysing the entire sample library to quantify m6A sites in the KO samples.

Excluding KO-related issues and other enzymes that could also deposit m6A, is there any possibility that dorado mistakenly identifies m6A because of other methylation on A?

Run environment:

  • Dorado version: 0.7.0
  • kit SQK-RNA004
  • flow cell FLO-PRO004RA

Thanks in advance

@Theo-Nelson

Hi Salvobioinfo,

Not a developer of Dorado, but happy to add my two cents.

The parameters of the experiment are key when quantifying m6A sites. To achieve a robust decrease in m6A, the conserved catalytic domain (DPPF from https://www.nature.com/articles/s41589-018-0184-3) of most m6A writers needs to be removed. The most common writer is METTL3. In the knockout design, you need to specifically target the exonic region corresponding to this catalytic domain to have the intended effect. You can verify via mass spec whether the total quantity of m6A in your sample has decreased.

For our data, with the newest m6A model we observe more high-confidence signals that are present in WT and absent from knockdown samples. All-context m6A calling is challenging: you should examine what percentage of your hits fall within homopolymer regions, rRNA, or mitochondrial regions. You can then apply uniform filtering criteria to your KO and WT samples, compare the distributions, and see whether there are any differences.
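A minimal sketch of that comparison, in case it helps. It assumes each called site has already been reduced to a small record with a category label, a coverage, and a fraction modified. The field names, the category labels, the two cutoffs, and the toy records are all hypothetical placeholders, not anything produced by dorado or modkit directly.

```python
from collections import Counter

def category_fractions(sites, min_coverage=20, min_fraction_modified=0.2):
    """Apply one uniform filter, then return the share of passing sites per category.

    `sites` is a list of dicts with hypothetical keys:
    "category", "coverage", "fraction_modified".
    Both cutoffs are illustrative defaults, not recommended values.
    """
    kept = [
        s for s in sites
        if s["coverage"] >= min_coverage
        and s["fraction_modified"] >= min_fraction_modified
    ]
    counts = Counter(s["category"] for s in kept)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {category: n / total for category, n in counts.items()}

# Toy records, invented purely to show the call pattern.
wt_sites = [
    {"category": "mRNA", "coverage": 50, "fraction_modified": 0.6},
    {"category": "mRNA", "coverage": 40, "fraction_modified": 0.5},
    {"category": "rRNA", "coverage": 100, "fraction_modified": 0.3},
    {"category": "homopolymer", "coverage": 10, "fraction_modified": 0.9},  # fails coverage
]
ko_sites = [
    {"category": "rRNA", "coverage": 80, "fraction_modified": 0.3},
    {"category": "homopolymer", "coverage": 30, "fraction_modified": 0.4},
    {"category": "mRNA", "coverage": 60, "fraction_modified": 0.05},  # fails fraction cutoff
]

wt_fractions = category_fractions(wt_sites)
ko_fractions = category_fractions(ko_sites)
for name, fractions in (("WT", wt_fractions), ("KO", ko_fractions)):
    print(name, fractions)
```

The point of using one function for both samples is that the filter stays identical, so any shift in the category shares reflects the samples rather than the filtering.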

Best of luck!

Sincerely,
Theo

@Salvobioinfo
Author

Hi Theo-Nelson,

I greatly appreciate your response. The KO samples have been confirmed by both MS analysis and NGS.
I value your advice on examining the m6A distribution across RNA species, and I also want to break m6A down by mRNA region. We will perform this analysis as soon as possible, but overall the m6A situation remains really strange. The decrease in the knockouts seems to be only minimal compared with the MS result. It appears to be a common issue, based on what we are reading in the modkit issues section.

Best,
Salvo

@Theo-Nelson

Dear Salvo,

A thread that really helped me over in the modkit world is this one: nanoporetech/modkit#198

As ArtRand suggests, if you run modkit sample-probs ${modbam} --hist ${histogram_dir} --percentiles 0.1,0.8,0.85,0.9 you will get histograms (drawn sideways, with the bars running horizontally) that you can compare to find the probability threshold that distinguishes the two samples.

Taking the example in the thread, the IVT (KO) vs. WT histograms look like this side-by-side for code a (the m6A code). You will get two histograms: one for m6A and one for canonical A (if you run only the m6A basecalling model).

[Image: Screenshot 2024-08-01 at 9 16 17 AM, side-by-side IVT vs. WT histograms]

As you can see, for m6A it is the calls above 99% probability that are enriched in the mRNA vs. IVT comparison. This is how they arrive at the recommended filters --filter-threshold A:0.8 --mod-thresholds a:0.99
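To make the reasoning concrete, here is a rough sketch of how those two cutoffs could be read and how the high-probability tail of two samples might be compared. It is a simplified interpretation, not modkit's actual implementation, and the probability lists are invented toy values; only the 0.8 and 0.99 thresholds come from the thread.

```python
# Thresholds quoted in the thread: canonical A at 0.8, m6A (code "a") at 0.99.
THRESHOLDS = {"A": 0.8, "a": 0.99}

def passes(call_code, probability, thresholds=THRESHOLDS):
    """Simplified reading: keep a call only if its probability meets the cutoff for its code."""
    return probability >= thresholds[call_code]

def tail_fraction(probabilities, cutoff):
    """Share of calls whose probability is at or above `cutoff`."""
    if not probabilities:
        return 0.0
    return sum(p >= cutoff for p in probabilities) / len(probabilities)

# Toy m6A call probabilities, invented for illustration only.
wt_m6a_probs = [0.55, 0.70, 0.995, 0.999, 0.992, 0.60]
ivt_m6a_probs = [0.55, 0.60, 0.70, 0.65, 0.80, 0.991]

wt_tail = tail_fraction(wt_m6a_probs, THRESHOLDS["a"])
ivt_tail = tail_fraction(ivt_m6a_probs, THRESHOLDS["a"])
print(f"WT share of m6A calls >= 0.99:  {wt_tail:.3f}")
print(f"IVT share of m6A calls >= 0.99: {ivt_tail:.3f}")
```

If the share above the cutoff is clearly larger in WT than in the IVT or KO sample, that is the same enrichment the histograms show visually.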

If you wish to post your histogram results, I can also advise directly here or via email.

Sincerely,
Theo
