Add multiqc #14

rroutsong · 2023-10-25T21:18:51Z

Add MultiQC to the ngsqc pipeline.

Addresses #12 #7

rroutsong · 2023-10-25T21:34:00Z

@skchronicles , example report at /data/RTB_GRS/dev/Dmux/test_ngsqc2/GRS_0212_Bhasym/230907_NS500353_0215_AHLTNVBGXM/multiqc/Run-230907_NS500353_0215_AHLTNVBGXM-Project-GRS_0212_Bhasym_multiqc_report.html

skchronicles · 2023-10-30T18:09:46Z

src/Dmux/workflow/ngs_qaqc/fastq.smk

@@ -42,7 +41,6 @@ rule fastq_screen:
        subset              = 1000000,
        aligner             = "bowtie2",
        output_dir          = lambda w: config['out_to'] + "/" + w.project + "/" + config['run_ids'] + "/" + w.sid + "/fastq_screen/",
-    # container: "docker://rroutsong/dmux_ngsqc:0.0.1",
    containerized: "/data/OpenOmics/SIFs/dmux_ngsqc_0.0.1.sif"


We need to add an option to point to a SIF cache and dynamically resolve one of the following: a local SIF on the file system or a URI to pull an image from Docker Hub.

I have a solution to this issue in the upcoming PR. I have serialized the server-specific SIF directories and dynamically add the specific server configuration at initialization time.

Ends up like:

containerized: server_config["sif"] + "dmux_ngsqc_0.0.1.sif"

The SIF cache is always specified at execution time through environment variables and subprocess.
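The resolution described in this thread could look roughly like the sketch below: prefer a local SIF from a cache directory and fall back to a Docker Hub URI. This is an illustration only, not the code from the follow-up PR. The function name `resolve_image` and the environment variable `DMUX_SIF_CACHE` are hypothetical; the image name and URI are taken from the diff above.

```python
import os
from pathlib import Path


def resolve_image(sif_name, docker_uri, sif_cache=None):
    """Return a local SIF path when it exists in the cache, else the Docker URI.

    `sif_cache` defaults to the DMUX_SIF_CACHE environment variable, which is
    a hypothetical name used only for this sketch.
    """
    cache = sif_cache or os.environ.get("DMUX_SIF_CACHE")
    if cache:
        candidate = Path(cache) / sif_name
        # Only use the cached image when the file is really there.
        if candidate.is_file():
            return str(candidate)
    # No cache configured, or the image is missing from it: pull instead.
    return docker_uri


if __name__ == "__main__":
    print(resolve_image("dmux_ngsqc_0.0.1.sif",
                        "docker://rroutsong/dmux_ngsqc:0.0.1"))
```

The returned string could then be used wherever the rule currently hard-codes the SIF path, so the same Snakefile works on a server with a populated cache and on one without.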

bin/dmux.py

skchronicles · 2023-10-30T18:13:49Z

src/Dmux/workflow/ngs_qaqc/fastq.smk

@@ -98,7 +100,6 @@ rule kraken_annotation:
        kraken_log          = config['out_to'] + "/{project}/" + config['run_ids'] + "/{sid}/kraken/{sid}.log",
    params:
        kraken_db           = "/data/OpenOmics/references/Dmux/kraken2/k2_pluspfp_20230605"


We need a method to dynamically resolve the reference files.

I also addressed this in the next PR. I saved all of the server resolution methods until I moved on to bigsky.
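One way to resolve reference files per server is a small lookup keyed on the host name, sketched below. This is an assumption-laden illustration rather than the approach taken in the next PR: the `SERVER_CONFIGS` table, the host-name matching, and the bigsky path are all hypothetical, and mapping the existing kraken2 path to biowulf is itself an assumption.

```python
import socket

# Hypothetical per-server settings. The first kraken2 path is the one
# hard-coded in the diff above; the bigsky path is a placeholder.
SERVER_CONFIGS = {
    "biowulf": {
        "kraken2_db": "/data/OpenOmics/references/Dmux/kraken2/k2_pluspfp_20230605",
    },
    "bigsky": {
        "kraken2_db": "/placeholder/bigsky/references/kraken2/k2_pluspfp_20230605",
    },
}


def detect_server(hostname=None):
    """Guess the server from the host name (matching rule is an assumption)."""
    hostname = (hostname or socket.gethostname()).lower()
    for name in SERVER_CONFIGS:
        if name in hostname:
            return name
    raise ValueError(f"no server configuration matches host {hostname!r}")


def resolve_reference(key, hostname=None):
    """Return the path registered under `key` for the detected server."""
    server = detect_server(hostname)
    try:
        return SERVER_CONFIGS[server][key]
    except KeyError:
        raise KeyError(f"reference {key!r} is not configured for {server}") from None
```

With something like this in place, the rule's `params` could call `resolve_reference("kraken2_db")` instead of embedding an absolute path.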

skchronicles · 2023-10-30T18:15:56Z

src/Dmux/workflow/ngs_qaqc/qc.smk

+    log: config['out_to'] + "/.logs/" + config['projects'] + "/" + config['run_ids'] + "/multiqc/multiqc.log"
+    shell:
+        """
+        multiqc -q -ip \


At some point, we may want to point to a MultiQC config file to clean up the general statistics table, create two sections for FastQC, and set a preferred module order in the final report.

This is outlined in #15
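As a rough sketch of what such a config could contain, the snippet below builds the `module_order` part as a Python dictionary, running the `fastqc` module twice with different `path_filters` so the report gets separate untrimmed and trimmed sections. The glob patterns, section names, and chosen order are assumptions, not what #15 specifies, and the general statistics cleanup is left out because the column ids depend on the final report.

```python
def build_multiqc_config(
    raw_glob="*/fastqc_untrimmed/*_fastqc.zip",      # hypothetical pattern
    trimmed_glob="*/fastqc_trimmed/*_fastqc.zip",    # hypothetical pattern
):
    """Return a MultiQC config dict with two FastQC sections and a fixed order."""
    return {
        "module_order": [
            # Same module listed twice; path_filters decides which files
            # each copy picks up, and anchor keeps the section ids unique.
            {"fastqc": {
                "name": "FastQC (untrimmed)",
                "anchor": "fastqc_untrimmed",
                "path_filters": [raw_glob],
            }},
            "fastp",
            {"fastqc": {
                "name": "FastQC (trimmed)",
                "anchor": "fastqc_trimmed",
                "path_filters": [trimmed_glob],
            }},
            "fastq_screen",
            "kraken",
            "kaiju",
        ],
    }


if __name__ == "__main__":
    import json
    print(json.dumps(build_multiqc_config(), indent=2))
```

The dictionary would still need to be written out as YAML (for example with PyYAML's `yaml.safe_dump`) and handed to MultiQC through its `--config` option.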

skchronicles · 2023-10-30T18:25:35Z

src/Dmux/workflow/ngs_qaqc/fastq.smk

We need to change how adapter sequences are being removed. Currently, there is a bug where the barcode (i7/i5) sequences from Illumina's sample sheet are being passed to FastQC and fastp. These barcode sequences should already be removed after the bcl2fastq step, and they do not represent traditional library-prep-kit-specific adapter sequences that need to be removed. With that being said, let's make use of fastp's auto-detect-adapter-sequences feature to remove them. We can also make use of FastQC's internal contaminants/adapters list to identify sequencing adapters.

Here's the fastp rule in the new branch master_job_and_bigsky:

    shell:
        """
        fastp \
        --detect_adapter_for_pe \
        --in1 {input.in_read1} --in2 {input.in_read2} \
        --out1 {output.out_read1} \
        --out2 {output.out_read2} \
        --html {output.html} \
        --json {output.json} \
        """

FastQC:

    shell:
        """
        mkdir -p {params.output_dir}
        fastqc -o {params.output_dir} -t {threads} {input.samples}
        """

FastQC before trimming depends on the demultiplexed reads; FastQC after trimming depends on the trimmed reads files.
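To make the intended behaviour concrete, here is a minimal sketch of assembling the fastp call outside Snakemake, relying on adapter auto-detection and passing no sample-sheet barcodes. The helper name `fastp_command` and the example file names are hypothetical; the flags are the ones used in the rule above.

```python
import shlex


def fastp_command(in1, in2, out1, out2, html, json_report):
    """Build a paired-end fastp command that auto-detects adapters.

    No explicit adapter sequences are passed, so the i7/i5 barcodes from
    the sample sheet never reach fastp.
    """
    return [
        "fastp",
        # Paired-end adapter auto-detection is off by default in fastp.
        "--detect_adapter_for_pe",
        "--in1", in1, "--in2", in2,
        "--out1", out1, "--out2", out2,
        "--html", html,
        "--json", json_report,
    ]


if __name__ == "__main__":
    cmd = fastp_command(
        "sample_R1.fastq.gz", "sample_R2.fastq.gz",
        "sample_trimmed_R1.fastq.gz", "sample_trimmed_R2.fastq.gz",
        "sample.html", "sample.json",
    )
    # Print a shell-safe rendering; subprocess.run(cmd, check=True) would run it.
    print(shlex.join(cmd))
```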

skchronicles · 2023-10-31T20:04:38Z

Will address some of these comments/issues in the next PR.

rroutsong added 30 commits October 5, 2023 11:04

add ngs_qa_qc dockerfile and ep

bfcac3e

begin restructuring of workflows for extension of demux workflow

61c62cc

merge in dry_run action changes

80d48c0

start ngs qc/qa pipeline, fastqc - trimmed/untrimmed + fastp trimming

726c174

ngsqc fastqc, trimming, fastqc again after trim

88d43eb

fix line endings on ep.sh

b91ad97

add in fastq screen config to docker

b0f0407

final ngsqc dockerfile

5a4b4fe

feat: expand ngs qc workflow

3fa9e92

feat: python module support for ngs qc-qa

0f5d1a0

chore: ignore jsons, lower latentcy wait for biowulf

0101883

feat: finalize first half of ngsqc pipeline

b8a3b48

fix: snakemake pathing correction

64252ef

fix: dry run action needs -s kwarg

160e368

chore: fix args in github action

8034709

chore: refactor setuptools package, force include workflow and profiles, fix dry run command

b95ff7b

chore: dry run command not being executed

889554d

feat: kaiju and kraken annotation rules, beginning

a9d1fd0

chore: merge in conda2src changes, drop using conda

a6e5550

chore: fix more merge conflicts

52558fc

feat: working kraken & kaiju

797119a

chore: remap outputs to discussed structure

63fd0a9

fix: align io paths for ngsqc workflow

82b690e

chore: relocate slurm logs directory

bdb8c1d

fix: correct fastqc_trimmed output paths

2d74018

fix: dry run action, add path to env

bc51fb2

fix: broken path in dry run action

34a33cd

fix: cat from /Users/routsongrm/git/NGS/Dmux

c063c04

fix: cat from \$PWD/NGS/Dmux

ab25f77

Merge remote-tracking branch 'refs/remotes/origin/add_kaiju_kraken' into add_kaiju_kraken

25cbc30

feat: add multiqc report

c86aff4

rroutsong requested a review from skchronicles October 25, 2023 21:18

rroutsong assigned rroutsong and jlac Oct 25, 2023

rroutsong added 2 commits October 25, 2023 17:23

chore: merge in main, fix mutliqc output report name

2ced134

fix: new flag for dry run in CI

e787239

skchronicles requested changes Oct 30, 2023

skchronicles approved these changes Oct 31, 2023

skchronicles merged commit 2ff4353 into main Oct 31, 2023
1 check passed

rroutsong deleted the add_multiqc branch November 17, 2023 16:39
