diff --git a/README.md b/README.md
index 82991b4..111ef1d 100644
--- a/README.md
+++ b/README.md
@@ -13,19 +13,19 @@ To run `wg-blimp` you need a UNIX environment that contains a [Bioconda](http://
 It is advised to install `wg-blimp` through Bioconda. It is also recommended to install `wg-blimp` in a fresh environment, as it has many dependencies that may conflict with other packages, for this you can use:
 ```
-conda create -n wg-blimp wg-blimp python=3.6.7 r-base=3.6.2 snakemake-minimal=5.8.1 methyldackel==0.4.0
+conda create -n wg-blimp wg-blimp python=3.6.7 r-base=4.0.2 methyldackel==0.4.0
 ```
 ### Docker
 We bundled a full `wg-blimp` installation into a Docker container. You may pull our image using
 ```
-docker pull imimarw/wg-blimp:v0.9.5
+docker pull imimarw/wg-blimp:v0.9.6
 ```
 Once the image was downloaded and extracted, you can start the docker container with
 ```
-docker run -it -v : imimarw/wg-blimp:v0.9.5
+docker run -it -v : imimarw/wg-blimp:v0.9.6
 ```
 ### From source
@@ -101,18 +101,18 @@ The following entries are used for running the Snakemake pipeline and may be spe
 | Key | Value |
 | --- | ----- |
-| *annotation_allowed_biotypes* | Only genes with this biotype will be annotated in the DMR table |
+| *annotation_allowed_biotypes* | Only genes with this biotype will be annotated in the DMR table (see https://www.gencodegenes.org/pages/biotypes.html ). |
 | *annotation_min_mapq* | When annotating coverage, only use reads with a minimum mapping quality |
 | *bsseq_local_correct* | Use local correction for bsseq DMR calling. Usually, setting this to FALSE will increase the number of calls. |
-| *cgi_annotation_file* | Gzipped csv file used for cg island annotation. |
+| *cgi_annotation_file* | Gzipped csv file used for cg island annotation. Mandatory for MethylSeekR segmentation. Usually downloaded from UCSC Table Browser. |
 | *computing_threads* | Number of processors a single job is allowed to use. Remember to use `--cores` parameter for Snakemake. |
 | *dmr_tools* | Tools to use for DMR calling. Available: `bsseq`, `camel`, `metilene`
-| *gene_annotation_file* | File used for genetic annotation. |
 | *group1* | Samples in first group for DMR analysis |
 | *group2* | Samples in second group for DMR analysis |
+| *gtf_annotation_file* | GTF file used for annotation of genes and promoters. |
 | *io_threads* | IO intensive tools virtually reserve this many cores (while actually using only one) to reduce file system IO load. |
-| *methylation_rate_on_chromosomes* | Compute methylation rates for these chromosome during qc |
-| *methylseekr_cgi_genome* | Reference genome to use for MethylSeekR CGI queries. |
+| *java_memory_gb* | Gigabytes of RAM to allocate for Java-based tools. If samples are too large, this must be increased to prevent crashes. |
+| *methylation_rate_on_chromosomes* | Compute methylation rates for these chromosome during QC |
 | *methylseekr_fdr_cutoff* | FDR cutoff for MethylSeekR segmentation. |
 | *methylseekr_methylation_cutoff* | Methylation cutoff for MethylSeekR segmentation. |
 | *methylseekr_pmd_chromosome* | Chromosome to compute MethylSeekR alpha values for. |
@@ -121,16 +121,14 @@ The following entries are used for running the Snakemake pipeline and may be spe
 | *min_diff* | Minimum average difference between the two groups for DMR calling |
 | *output_dir* | Directory containing all files created by the pipeline |
 | *promoter_tss_distances* | Distance interval around TSS's to be recognized as promoters in DMR annotation. |
-| *qualimap_memory_gb* | Gigabytes of RAM to allocate for Qualimap. If samples are too large, this must be increased to prevent crashes. |
 | *rawdir* | Directory containing .fastq files |
 | *rawsuffixregex* | The regular expressions to match for paired reads. By default, Illumina naming conventions are accepted. |
-| *ref* | .fasta reference file |
-| *repeat_masker_annotation_file* | File containing repeat masker annotation |
-| *repeat_masker_links* | Repeat masker files are relatively big and are only downloaded on demand from the links specified here. |
+| *ref* | .fasta reference file. "Bisulfited" references and BWA indices will be created automatically by bwa-meth) |
+| *repeat_masker_annotation_file* | File containing repetitive regions. Usually generated by RepeatMasker and downloaded from UCSC Table Browser. |
 | *sample_fastq_csv* | Optional CSV file containing association between samples and read files. The CSV must contain a header with column names `sample`, `forward` and `reverse`. When this option is set, parameters *rawdir* and *rawsuffixregex* are ignored. |
 | *samples* | All samples (usually concatenation of group1 and group2) |
 | *target_files* | Files to be generated by the Snakemake workflow |
-| *transcript_start_site_file* | File to read transcription start sites from (for DMR annotation) |
+| *temp_dir* | Directory for temporary files. This option may be used for instances where computation node disk space is limited. |
 ## Reporting errors / Requesting features
 If anything goes wrong using `wg-blimp` or any features are missing, feel free to open an issue or to contact Marius Wöste ( mar.w@wwu.de )
diff --git a/setup.py b/setup.py
index a0bec4a..a973816 100644
--- a/setup.py
+++ b/setup.py
@@ -5,7 +5,7 @@
 setuptools.setup(
     name='wg-blimp',
-    version='0.9.5',
+    version='0.9.6',
     author='Marius Woeste',
     author_email='mar.w@wwu.de',
     description='WGBS methylation analysis pipeline',