
QC outputs #11

Open
cjfields opened this issue Nov 11, 2020 · 2 comments

Comments

@cjfields
Contributor

No description provided.

@grendon
Collaborator

grendon commented Nov 30, 2020

Most assemblies have metrics similar to this one:

---------------- Information for assembly 'NA19028.final.megahit_results/final.contigs.fa' ----------------


                                         Number of scaffolds        119
                                     Total size of scaffolds      91508
                                            Longest scaffold       4962
                                           Shortest scaffold        310
                                 Number of scaffolds > 1K nt         22  18.5%
                                Number of scaffolds > 10K nt          0   0.0%
                               Number of scaffolds > 100K nt          0   0.0%
                                 Number of scaffolds > 1M nt          0   0.0%
                                Number of scaffolds > 10M nt          0   0.0%
                                          Mean scaffold size        769
                                        Median scaffold size        601
                                         N50 scaffold length        747
                                          L50 scaffold count         34
                                                 scaffold %A      29.98
                                                 scaffold %C      20.00
                                                 scaffold %G      20.43
                                                 scaffold %T      29.59
                                                 scaffold %N       0.00
                                         scaffold %non-ACGTN       0.00
                             Number of scaffold non-ACGTN nt          0

                Percentage of assembly in scaffolded contigs       0.0%
              Percentage of assembly in unscaffolded contigs     100.0%
                      Average number of contigs per scaffold        1.0
Average length of break (>25 Ns) between contigs in scaffold          0

                                           Number of contigs        119
                              Number of contigs in scaffolds          0
                          Number of contigs not in scaffolds        119
                                       Total size of contigs      91508
                                              Longest contig       4962
                                             Shortest contig        310
                                   Number of contigs > 1K nt         22  18.5%
                                  Number of contigs > 10K nt          0   0.0%
                                 Number of contigs > 100K nt          0   0.0%
                                   Number of contigs > 1M nt          0   0.0%
                                  Number of contigs > 10M nt          0   0.0%
                                            Mean contig size        769
                                          Median contig size        601
                                           N50 contig length        747
                                            L50 contig count         34
                                                   contig %A      29.98
                                                   contig %C      20.00
                                                   contig %G      20.43
                                                   contig %T      29.59
                                                   contig %N       0.00
                                           contig %non-ACGTN       0.00
                               Number of contig non-ACGTN nt          0
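
For anyone who wants to sanity-check the N50 and L50 values above without rerunning the stats tool, here is a minimal sketch. It assumes the common definition (N50 is the length of the contig at which the running total of lengths, sorted longest first, reaches half the assembly size, and L50 is how many contigs that takes); the tool that produced the report may break ties or round differently. The function names `fasta_lengths` and `n50_l50` are made up for this example.

```python
def fasta_lengths(lines):
    """Yield the sequence length of each record in an iterable of FASTA lines."""
    length = None  # None until the first header line is seen
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if length is not None:
                yield length
            length = 0
        elif line and length is not None:
            length += len(line)
    if length is not None:
        yield length


def n50_l50(lengths):
    """Return (N50, L50) for a collection of contig lengths.

    Assumed definition: walk the lengths from longest to shortest and stop
    at the first contig where the running total reaches half the total size.
    """
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if 2 * running >= total:
            return length, count
    return 0, 0  # empty input
```

Passing an open file handle for `final.contigs.fa` to `fasta_lengths` and the result to `n50_l50` should give values comparable to the report, assuming the same definition.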

I still see some scaffolds that could be removed because they look like artifacts, such as this one, which consists of low-complexity sequence:

>k141_37 flag=1 multi=5.0000 len=359
CGGGGAGAGGGGGGTAGAAGTGGGAGGAGGGAGAAACAGAAAAAAAGAGAGAGAAAAACAAAGAGGTGAGAGGGAGGAGAGAGACAGAGGGAGAGAGGTGAGGGGGAGAGAAACAGAGAAAATGGGAGGTGGAGGGGAGAGAGAGAGGAGAGAGAGAAACAGAGGGAGAGAGAGAGGTGGGGGAGAGACAGGAGAGAGAGGTAAGCGGGGAGAGAGAAAAACAGGGAGAGAGGTTGGGGGTTGAGGGAGAGACAGAGAAACAGGGAGAGAGAGGCGGGAAGAGGTGGGAGAAGACACAGAAAAAACAGAGAAAATGAGAAAGAAAAGAGACAGGGTGGGGGAGAGAGAGAGGGAGAGAG
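
One rough way to flag contigs like the one above is to score each sequence by the Shannon entropy of its overlapping k-mers and drop those that fall below a cutoff. The sketch below is only an illustration of that idea: it is not DUST or any established masker, it is not something the pipeline currently does, and the `min_bits=3.0` cutoff for dinucleotides (maximum 4 bits) is an arbitrary assumption that would need tuning on real data. The function names are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(seq, k=1):
    """Shannon entropy (in bits) of the overlapping k-mers in seq."""
    seq = seq.upper()
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    if not kmers:
        return 0.0  # sequence shorter than k
    counts = Counter(kmers)
    total = len(kmers)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def is_low_complexity(seq, k=2, min_bits=3.0):
    """Flag seq when its k-mer entropy is below min_bits.

    min_bits is an illustrative cutoff, not a validated threshold.
    """
    return shannon_entropy(seq, k) < min_bits
```

A purely alternating `AG` repeat scores about 1 bit at k=2 and is flagged, while a sequence that uses all 16 dinucleotides evenly scores 4 bits and is kept.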

@cjfields
Contributor Author

We will have a separate QC workflow for these steps.
