Outlier_removal

Vivekanandan Ramalingam edited this page Mar 19, 2024 · 2 revisions

Outlier removal

Filter the peaks file for outliers. Peaks that meet either of two criteria are removed:

(1) The peak overlaps a region listed in blacklist.bed.
(2) The number of reads in the peak exceeds the cutoff derived from the --quantile quantile of per-peak read counts.
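The two criteria can be illustrated with a small, self-contained Python sketch. This is not the bpnet-outliers implementation: the function names are hypothetical, read counts are passed in directly rather than computed from the bigWig files, and the sketch assumes the cutoff is the --quantile value of the counts multiplied by --quantile-value-scale-factor, with peaks above that cutoff removed.

```python
from typing import List, Tuple

# (chrom, start, end) in BED-style half-open coordinates
Interval = Tuple[str, int, int]


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two half-open intervals on the same chromosome share a base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def quantile(values: List[float], q: float) -> float:
    """Linearly interpolated quantile of a non-empty list, with 0 <= q <= 1."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (pos - lo) * (ordered[hi] - ordered[lo])


def filter_peaks(
    peaks: List[Interval],
    counts: List[float],
    blacklist: List[Interval],
    q: float = 0.99,
    scale: float = 1.2,
) -> List[Interval]:
    """Keep peaks that pass both checks (hypothetical helper).

    ASSUMPTION: cutoff = quantile(counts, q) * scale, and peaks whose
    count is above the cutoff are treated as outliers.
    """
    cutoff = quantile(counts, q) * scale
    kept = []
    for peak, count in zip(peaks, counts):
        if any(overlaps(peak, region) for region in blacklist):
            continue  # criterion (1): blacklist overlap
        if count > cutoff:
            continue  # criterion (2): read count above the cutoff
        kept.append(peak)
    return kept


# Toy data, invented for illustration only
toy_peaks = [("chr1", 0, 10), ("chr1", 20, 30), ("chr1", 40, 50), ("chr2", 0, 10)]
toy_counts = [5, 6, 100, 5]
toy_blacklist = [("chr1", 25, 35)]
toy_inliers = filter_peaks(toy_peaks, toy_counts, toy_blacklist, q=0.5, scale=1.2)
```

In the toy data, the second peak is dropped for overlapping the blacklist region and the third for its unusually high read count.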

First, prepare a file named input_outliers.json as shown below:

{
    "0": {
        "signal": {
            "source": ["ENCSR000EGM/data/plus.bw",
                       "ENCSR000EGM/data/minus.bw"]
        },
        "loci": {
            "source": ["ENCSR000EGM/data/peaks.bed"]
        },
        "bias": {
            "source": ["ENCSR000EGM/data/control_plus.bw",
                       "ENCSR000EGM/data/control_minus.bw"],
            "smoothing": [null, null]
        }
    }
}
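When the same step is repeated for several experiments, the JSON above could be generated rather than written by hand. The following is a minimal sketch that reproduces the structure shown; build_outlier_input is a hypothetical helper, not part of bpnet, and it assumes that "smoothing" takes one null entry per bias source, as in the example.

```python
import json
from typing import Dict, List


def build_outlier_input(
    signal: List[str], loci: List[str], bias: List[str], task: str = "0"
) -> Dict:
    """Build the nested dictionary for one task (hypothetical helper)."""
    return {
        task: {
            "signal": {"source": list(signal)},
            "loci": {"source": list(loci)},
            "bias": {
                "source": list(bias),
                # ASSUMPTION: one null smoothing entry per bias track
                "smoothing": [None] * len(bias),
            },
        }
    }


config = build_outlier_input(
    signal=["ENCSR000EGM/data/plus.bw", "ENCSR000EGM/data/minus.bw"],
    loci=["ENCSR000EGM/data/peaks.bed"],
    bias=[
        "ENCSR000EGM/data/control_plus.bw",
        "ENCSR000EGM/data/control_minus.bw",
    ],
)

# Python's None is serialized as JSON null
text = json.dumps(config, indent=4)
print(text)
```

Writing text to input_outliers.json gives a file equivalent to the one shown above.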

Next, run the following command:

bpnet-outliers \
    --input-data input_outliers.json \
    --quantile 0.99 \
    --quantile-value-scale-factor 1.2 \
    --task 0 \
    --chrom-sizes ENCSR000EGM/reference/hg38.chrom.sizes \
    --chroms $(paste -s -d ' ' ENCSR000EGM/reference/chroms.txt) \
    --sequence-len 1000 \
    --blacklist ENCSR000EGM/reference/blacklist.bed \
    --global-sample-weight 1.0 \
    --output-bed ENCSR000EGM/data/peaks_inliers.bed
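
After the command finishes, a quick sanity check is to compare how many peaks went in and how many came out. The sketch below assumes that both BED files hold one peak per data line; the helper names are hypothetical and not part of bpnet.

```python
from typing import Iterable


def count_bed_records(lines: Iterable[str]) -> int:
    """Count data lines, skipping blanks, comments, and track/browser headers."""
    n = 0
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith(("#", "track", "browser")):
            continue
        n += 1
    return n


def summarize(before: Iterable[str], after: Iterable[str]) -> str:
    """Report how many peaks were removed between two BED line streams."""
    n_before = count_bed_records(before)
    n_after = count_bed_records(after)
    return f"{n_before - n_after} of {n_before} peaks removed"
```

Both functions accept any iterable of lines, so open file handles for ENCSR000EGM/data/peaks.bed and ENCSR000EGM/data/peaks_inliers.bed can be passed to summarize directly.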