support of aligner ngmlr #450
Comments
We are currently in the process of adding minimap2 as a long-read aligner, which seems to be working in principle (feel free to clone the MM2 branch and give it a go). As such (and since I am completely unfamiliar with NGMLR), I am afraid there are no immediate plans to add yet another long-read aligner at the current time.
@FelixKrueger Thanks for your reply! I've used the MM2 branch on my data for a while. The alignment itself went well, but the branch seemed to produce an incorrect GCH methylation level with bismark_methylation_extractor and coverage2cytosine. I used Illumina bulk NOMe-seq data for benchmarking. Methylpy and scripts written by my colleagues gave closely matching results, while this branch reported a value nearly 15% lower. I hope this information helps the development of bismark_MM2~
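For readers comparing tools as described above, here is a minimal sketch of what a "GCH methylation level" means in NOMe-seq terms: a cytosine preceded by G and followed by H (A, C or T), with the level computed as methylated calls over all calls in that context. This is not the code used by Bismark, methylpy, or the colleagues' scripts; the function names and the `(context, methylated, unmethylated)` tuple layout are assumptions made only for illustration.

```python
def classify_nome_context(upstream, downstream):
    """Classify a cytosine by its flanking bases (hypothetical helper).

    upstream   -- the base immediately 5' of the cytosine
    downstream -- the base immediately 3' of the cytosine
    """
    up, down = upstream.upper(), downstream.upper()
    if len(up) != 1 or len(down) != 1 or up not in "ACGT" or down not in "ACGT":
        return "unknown"
    if up == "G" and down != "G":
        return "GCH"   # GpC not followed by G: accessibility read-out
    if up != "G" and down == "G":
        return "HCG"   # CpG not preceded by G: endogenous methylation
    if up == "G" and down == "G":
        return "GCG"   # ambiguous, usually excluded
    return "other"


def methylation_level(records, context):
    """Percent methylation over all calls in one context.

    records -- iterable of (context, methylated_count, unmethylated_count);
               this layout is an assumption, not a Bismark file format.
    Returns None when the context has no coverage.
    """
    meth = sum(m for ctx, m, _ in records if ctx == context)
    unmeth = sum(u for ctx, _, u in records if ctx == context)
    total = meth + unmeth
    return None if total == 0 else 100.0 * meth / total
```

When two pipelines disagree, it can help to check that both pool counts the same way (summing calls across sites, as here, versus averaging per-site percentages), since the two conventions can give different numbers on the same data.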
Quick question: you say you used Illumina NOMe-seq data and saw lower levels of methylation (i.e. accessibility) in GCH context, is that right? Illumina data doesn't really produce long reads, but I suppose it should nevertheless give results comparable to data that was processed with the standard pipeline (e.g. Trim Galore, clipping off the first 6-9bp, followed by non-directional single-end alignments). Did you trim the data at all when using minimap2 as the aligner? If not, can you do that? The latest dev versions have seen some developments, mainly regarding alignment speed (>100-fold speed increase for PacBio reads, for example; not sure how this would hold up for short-read data), so you may want to make sure you are on the latest version. Cheers, Felix
@FelixKrueger First, I'm sorry that I didn't make myself clear. To answer your quick question: 1) I used Illumina 150bp bulk NOMe-seq data for benchmarking, because I assumed that bismark_methylation_extractor was insensitive to read length, or that any BAM file produced by an aligner would be processed by bismark_methylation_extractor in the same way. 2) I checked the library structure before analyzing the data, and trimmed the reads to get clean inserts.
The methylation extractor is indeed insensitive to read length, but the data needs to have been processed with Bismark in the first place, and NOMe-seq is kind of special when it comes to trimming and mapping requirements (more here: https://github.com/FelixKrueger/Bismark/blob/master/Docs/README.md#optional-nome-seq-or-scnmt-seq). You can read up on the optimisation tests for minimap2 here; if you have any suggestions, I'm always happy to hear them!
Hello,
I wonder whether another long-read aligner, NGMLR, will be supported in a future version of Bismark, because some results from my colleague showed that NGMLR may have a lower type I error rate in long-read mapping.