Update README.md #2

Open
wants to merge 1 commit into
base: master
2 changes: 1 addition & 1 deletion README.md
@@ -85,7 +85,7 @@ he STAR index files
* `single`: set equal to `True` if the data is single-end, and `False` if it is paired-end. Note that currently if `single = True` it is assumed that the single read to be aligned is in the second fastq file (because of the tendency of SICILIAN for droplet (10x) single-cell protocols in which `R1` contains the cell barcode and UMI information and `R2` contains the actual cDNA information). This also causes the files to be demultiplexed to create a new fastq file before they're mapped.
* `tenX`: set equal to `True` if the input RNA-Seq data is 10x and `False` otherwise.
* `stranded_library`: set equal to `True` if input RNA-Seq data is based on a stranded library and `False` otherwise. (for stranded libraries such as 10x, `stranded_library` should be set to `True`). When `stranded_library` is set to `True`, strand orientations from the alignment bam file will be used as the strand orientation of the junction. For unstranded libraries, SICILIAN uses gene strand information from the GTF file as the read strand is ambiguous.
-* `bc_pattern`: this parameter is needed only for 10x data and determines the barcode/UMI pattern in R1. For V3 chemistry in which UMI has 12 bps, `bc_pattern` should be set to `"C"*16 + "N"*12` and for 10x data based on V2 chemistry it should be set to `"C"*16 + "N"*12`. `bc_pattern` is needed for `UMI_tools` steps before STAR alignment on input 10x data.
+* `bc_pattern`: this parameter is needed only for 10x data and determines the barcode/UMI pattern in R1. For V3 chemistry in which UMI has 12 bps, `bc_pattern` should be set to `"C"*16 + "N"*12` and for 10x data based on V2 chemistry it should be set to `"C"*16 + "N"*10`. `bc_pattern` is needed for `UMI_tools` steps before STAR alignment on input 10x data.

### Choosing STAR parameters
STAR alignment parameters can be adjusted in the `STAR_map` function in `SICILIAN.py`. By default, SICILIAN runs STAR with default parameters.
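
For the `bc_pattern` change in this diff, the two settings can be sketched in Python as below. This is only an illustration of how such a pattern string is read (`C` marks a cell-barcode base, `N` marks a UMI base, as in the `UMI_tools` string-pattern convention). The helper names `make_bc_pattern` and `split_r1` are hypothetical and are not part of SICILIAN or `UMI_tools`.

```python
# Hypothetical helpers for illustration only; SICILIAN itself just takes
# the bc_pattern string as an input parameter.

def make_bc_pattern(umi_len: int, barcode_len: int = 16) -> str:
    """Build a pattern with barcode_len 'C's followed by umi_len 'N's."""
    return "C" * barcode_len + "N" * umi_len


# 10x V2 chemistry: 16 bp cell barcode + 10 bp UMI
V2_PATTERN = make_bc_pattern(umi_len=10)
# 10x V3 chemistry: 16 bp cell barcode + 12 bp UMI
V3_PATTERN = make_bc_pattern(umi_len=12)


def split_r1(seq: str, pattern: str) -> tuple[str, str]:
    """Split the start of an R1 read into (cell barcode, UMI).

    Bases aligned with 'C' go to the barcode, bases aligned with 'N'
    go to the UMI. Raises ValueError if the read is shorter than the pattern.
    """
    if len(seq) < len(pattern):
        raise ValueError("R1 read is shorter than bc_pattern")
    barcode = "".join(base for base, p in zip(seq, pattern) if p == "C")
    umi = "".join(base for base, p in zip(seq, pattern) if p == "N")
    return barcode, umi
```

With the pre-change V2 value (`"N"*12`), a 26 bp V2 `R1` read would be shorter than the 28-character pattern, which is the mismatch this one-line change removes.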