How to use IDBA-UD to perform scaffolding on assembled contigs? #63

Open
rzhan186 opened this issue Apr 27, 2021 · 5 comments

Comments

@rzhan186

Dear IDBA-UD developers,

I came across some literature where people have used IDBA-UD to perform a scaffolding step on contigs assembled by other assemblers such as MEGAHIT. From the IDBA-UD help page, it is not clear how this is supposed to be done. I am thinking of using the following command, but I am not sure whether it is appropriate:
idba --read merged_raw_reads.fa --read_level_2 megahit_contigs.fa --out idba_scaffolds.fa

Could you help with this? Much appreciated!

Rui

@jarrodscott

Hi @rzhan186

Did you ever find an answer to this question? I came across this idea in a paper by He et al., where they state: "Assembled contigs were then scaffolded using the scaffolding function from IDBA-UD63 (v.1.1.3)." If you found a way of doing this, I would appreciate your insight :)

@rzhan186
Author

Hi @jarrodscott, I actually read the same paper 😂 and then emailed the author for clarification. The author said I have to use the scaffold script from IDBA-UD. So I cloned the whole IDBA repository, changed to the bin directory, and ran the following:

./scaffold -o $out_dir contigs.fa reads_paired.fa --num_threads 1

However, when I checked the resulting scaffolds' statistics, they were exactly the same as the contigs'. I was hoping scaffolding would improve the assembly quality, but it made no difference on my data. Feel free to try it out and let me know how it goes! (I might have just done something wrong...)
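One way to make the "exactly the same statistics" check concrete is to compute a few summary numbers for both FASTA files and compare them. The following is a minimal sketch, not part of IDBA-UD: the function names are hypothetical, and it assumes plain, uncompressed FASTA input.

```python
def read_fasta_lengths(path):
    """Return the length of every sequence in an uncompressed FASTA file."""
    lengths = []
    current = None  # running length of the record being read
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if current is not None:
                    lengths.append(current)
                current = 0
            elif current is not None:
                current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths):
    """Length L such that sequences of length >= L hold at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # empty input


def summarize(path):
    """Collect the statistics usually compared between contigs and scaffolds."""
    lengths = read_fasta_lengths(path)
    return {
        "count": len(lengths),
        "total_bp": sum(lengths),
        "n50": n50(lengths),
        "longest": max(lengths, default=0),
    }
```

Calling `summarize("contigs.fa")` and `summarize("scaffolds.fa")` (file names are placeholders) and comparing the two dictionaries shows whether any sequences were joined: successful scaffolding would normally lower `count` and raise `n50`.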

@jarrodscott
jarrodscott commented Aug 20, 2021

Thanks @rzhan186 !!! Out of curiosity, I just ran this test using the reads from a single sample (R1 and R2 FASTQ files merged with fq2fa) against a small co-assembly of 4 samples (generated with MEGAHIT), and the result was also the same. Perhaps I am missing something?
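For readers unfamiliar with the merging step mentioned above, the idea is to turn two paired FASTQ files into one interleaved FASTA file in which each R1 read is followed directly by its R2 mate. The sketch below illustrates that idea only. It is not fq2fa itself and does not reproduce any of its options; the function names are hypothetical, and it assumes uncompressed, four-lines-per-record FASTQ files with mates in the same order.

```python
from itertools import islice


def fastq_records(handle):
    """Yield (name, sequence) from a four-lines-per-record FASTQ stream."""
    while True:
        chunk = list(islice(handle, 4))
        if len(chunk) < 4:
            return  # end of file (or a truncated final record)
        header, seq = chunk[0].strip(), chunk[1].strip()
        yield header[1:], seq  # drop the leading '@'


def interleave_to_fasta(r1_path, r2_path, out_path):
    """Write mates alternately (R1, R2, R1, R2, ...) as FASTA; return the pair count."""
    pairs = 0
    with open(r1_path) as r1, open(r2_path) as r2, open(out_path, "w") as out:
        for (name1, seq1), (name2, seq2) in zip(fastq_records(r1), fastq_records(r2)):
            out.write(f">{name1}\n{seq1}\n>{name2}\n{seq2}\n")
            pairs += 1
    return pairs
```

Because `zip` stops at the shorter input, unpaired trailing reads are silently dropped; a stricter version would compare read names and raise an error on a mismatch.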

@rzhan186
Author

Hi @jarrodscott, thanks for sharing your result! Yeah, this is a bit strange. Have you tried running IDBA-UD from scratch on your raw reads? From what I know, the software outputs both contigs and scaffolds. Maybe you can run it on your raw reads, compare the output with the MEGAHIT results, and then move on with the better one.

I hope the author can pop up someday and clear up our confusion 😆

@jarrodscott
jarrodscott commented Aug 20, 2021

Hi @rzhan186. Good question. Indeed, I have tried running IDBA-UD from scratch on the raw reads. Once I filtered the scaffolds output by IDBA-UD to remove sequences shorter than 1 kb, the results were comparable to the MEGAHIT meta-sensitive assembly (also with the minimum length set to 1 kb). It certainly would be nice to find a hybrid approach to improve assemblies :)
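The length filter described above can be sketched in a few lines. This is an illustration under stated assumptions, not the tool actually used in the thread: the function names are hypothetical, the input is assumed to be uncompressed FASTA, and the 1,000 bp default mirrors the 1 kb cutoff mentioned in the comment.

```python
def iter_fasta(handle):
    """Yield (header_line, sequence) pairs from a FASTA stream."""
    header, chunks = None, []
    for line in handle:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line, []
        elif header is not None:
            chunks.append(line.strip())
    if header is not None:
        yield header, "".join(chunks)


def filter_fasta_by_length(in_path, out_path, min_length=1000):
    """Copy sequences of at least min_length bases; return how many were kept."""
    kept = 0
    with open(in_path) as src, open(out_path, "w") as dst:
        for header, seq in iter_fasta(src):
            if len(seq) >= min_length:
                dst.write(f"{header}\n{seq}\n")
                kept += 1
    return kept
```

Applying the same cutoff to both assemblies before computing statistics keeps the comparison fair, since short sequences can otherwise dominate the sequence count.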


2 participants