From 3cc7d5ddbfc5892e88a06bb3b0da38b0ec26838b Mon Sep 17 00:00:00 2001
From: Jonathan Kitt <70012823+johnkitt85@users.noreply.github.com>
Date: Tue, 1 Sep 2020 10:51:06 +0200
Subject: [PATCH] Update 07-Read_Processing.Rmd

Fixed typos
---
 07-Read_Processing.Rmd | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/07-Read_Processing.Rmd b/07-Read_Processing.Rmd
index 2f635c4..0e415cc 100644
--- a/07-Read_Processing.Rmd
+++ b/07-Read_Processing.Rmd
@@ -11,7 +11,7 @@ knitr::opts_chunk$set(echo = TRUE,
 ```
 
-Advances in sequencing technology are helping researchers sequence the genome deeper than ever. These sequencing experiments typically yield millions of reads. These reads have to be further processed, quality checked and aligned before we can quantify the genomic signal of interest and apply statistics and/or machine learning methods. For example, you may want to count how many reads overlapping with your promoter set of interest or you may want to quantify RNA-seq reads overlapping with exons. Post-alignment operations are usually but not always similar to operations on genomic intervals. Dealing with mapped reads are described previously in chapter \@ref(genomicIntervals). In addition, we have introduced high-throughput sequencing and its applications in general in chapter \@ref(intro). In this chapter we will introduce the fundamentals of read processing and quality check, and we will show how to do those tasks in R. The read quality check and processing is a fundemental step in all high-throughput sequencing analyses. For example, RNA-seq, ChIP-seq and BS-seq analyses shown in Chapters \@ref(rnaseqanalysis), \@ref(chipseq) and \@ref(bsseq) require these quality check and processing steps prior to further analysis. For a long time, quality check and mapping tasks were outside the R domain. However, nowadays certain packages in R/Bioconductor can accomplish those tasks. 
+Advances in sequencing technology are helping researchers sequence the genome deeper than ever. These sequencing experiments typically yield millions of reads. These reads have to be further processed, quality checked and aligned before we can quantify the genomic signal of interest and apply statistics and/or machine learning methods. For example, you may want to count how many reads overlap with your promoter set of interest or you may want to quantify RNA-seq reads overlapping with exons. Post-alignment operations are usually but not always similar to operations on genomic intervals. Dealing with mapped reads is described in chapter \@ref(genomicIntervals). In addition, we have introduced high-throughput sequencing and its applications in general in chapter \@ref(intro). In this chapter we will introduce the fundamentals of read processing and quality check, and we will show how to do those tasks in R. The read quality check and processing is a fundemental step in all high-throughput sequencing analyses. For example, RNA-seq, ChIP-seq and BS-seq analyses shown in Chapters \@ref(rnaseqanalysis), \@ref(chipseq) and \@ref(bsseq) require these quality check and processing steps prior to further analysis. For a long time, quality check and mapping tasks were outside the R domain. However, nowadays certain packages in R/Bioconductor can accomplish those tasks. 
 
 ## FASTA and FASTQ formats
 High-throughput sequencing reads are usually output from sequencing facilities as text files in a format called "FASTQ" or "fastq". This format depends on an earlier format called FASTA. The FASTA format is developed as a text-based format to represent nucleotide or protein sequences (See Figure \@ref(fig:fasta) for an example).
@@ -279,4 +279,4 @@ rqcCycleQualityBoxPlot(qcRes)
 2. Now we will trim the reads based on the quality scores. Let's trim 2-4 bases on the 3' end depending on the quality scores. 
 You can use Trim the ends of the samples `QuasR::preprocessReads()` function for this purpose.[Difficulty: **Beginner/Intermediate**] 
 
-3. Align the trimmed and untrimmed reads using `QuasR` and plot alignment statistics, did the trimming improve alignments? [Difficulty: **Intermediate/Advanced**]
\ No newline at end of file
+3. Align the trimmed and untrimmed reads using `QuasR` and plot alignment statistics, did the trimming improve alignments? [Difficulty: **Intermediate/Advanced**]