Error when no peaks are found #128

Closed
ktrns opened this issue Dec 3, 2019 · 12 comments
Labels: feature-request (Request to add new functionality), improve-behaviour (Optimise code without adding functionality), WIP (Work in progress)

ktrns commented Dec 3, 2019

Hi there!

This is an exception that you might want to catch: the experiment of the client failed and we don't see much of an enrichment, and hence no peaks are found at all. Your pipeline crashes with "Error executing process > 'PeakQC'" (plot_macs_qc.r, plot_homer_annotatepeaks.r, cat peak_annotation_header.txt macs_annotatePeaks.summary.txt > macs_annotatePeaks.summary_mqc.tsv)

  Error in read.table(PeakFiles[idx], sep = "\t", header = FALSE) :
    no lines available in input
  Execution halted

Would you be able to fix this so that the pipeline can finish off even if no peaks are found?

Best wishes
Katrin

Member

drpatelh commented Dec 3, 2019

Hi @ktrns ! Yes, this is likely related to #35 😓 It will take quite a bit of refactoring to ignore this error and resume for the rest of the samples mainly because of the way the experimental design is incorporated into the pipeline for differential binding analysis etc.

For now, the simplest fix is to just remove that sample from your design file and re-run. Sorry.

Would you mind sending me the number of lines in each of the peak files? If you called broad peaks, it would be something like:
wc <OUTPUT_DIR>/results/bwa/mergedLibrary/macs/broadPeak/*.broadPeak

or for narrow peaks:
wc <OUTPUT_DIR>/results/bwa/mergedLibrary/macs/narrowPeak/*.narrowPeak
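
A minimal sketch of a helper that does the same check in one go, assuming a POSIX shell. The function name `count_peak_lines` is hypothetical and not part of the pipeline. It prints the line count for each peak file and marks the files that contain no peaks:

```shell
# Hypothetical helper, not part of the pipeline.
# Prints "<line count><TAB><file>" for each peak file given as an argument
# and appends "<TAB>EMPTY" when the file contains no lines (no peaks).
count_peak_lines() {
    for f in "$@"; do
        # Skip unmatched globs and anything that is not a regular file.
        [ -f "$f" ] || continue
        # Some wc implementations pad the count with spaces; strip them.
        n=$(wc -l < "$f" | tr -d ' ')
        if [ "$n" -eq 0 ]; then
            printf '%s\t%s\tEMPTY\n' "$n" "$f"
        else
            printf '%s\t%s\n' "$n" "$f"
        fi
    done
}
```

It could be called as `count_peak_lines <OUTPUT_DIR>/results/bwa/mergedLibrary/macs/narrowPeak/*.narrowPeak`, with `<OUTPUT_DIR>` replaced by the actual output directory.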

Author

ktrns commented Dec 4, 2019

Hi @drpatelh,

Sure: 0 for both non-control samples.
bwa/mergedLibrary/Macs/narrowPeak

So if I remove my zero-peak samples, there is nothing left. But never mind, for now I can provide the atacseq pipeline results! I was just curious to also run the chipseq pipeline.

Best wishes and many thanks
Katrin

Member

drpatelh commented Dec 4, 2019

Ah, so what did you provide as a control to the chipseq pipeline then? I'm assuming this is Cut N Run data?

@drpatelh drpatelh added feature-request Request to add new functionality improve-behaviour Optimise code without adding functionality labels Dec 4, 2019

ivokwee commented May 30, 2020

I have the same problem. Is it maybe possible to skip this 'peakQC' step using some run options variable?

@drpatelh
Member

Quite busy at the moment with the viralrecon pipeline, but I can hopefully come back to this next week to see if I can do something in the next release, which isn't far off now 👍

However, it would be great if you can share a minimal dataset with me to be able to reproduce this error @ivokwee @ktrns? Just an IP and control will be enough 🙂


ivokwee commented May 30, 2020 via email

@drpatelh
Member

Yes, I was planning to do that already but that isn't going to solve this problem because the pipeline will just fail at a later step as there are no peaks called. This is why I would need an appropriate dataset to test this on.

In the meantime, you can just remove the sample from your design file and re-run 👍


ivokwee commented May 30, 2020

Well, I cloned the git code, disabled the peakQC module, and re-ran.

Here are some example files to test:
https://www.dropbox.com/s/8z0q68yajhxoqa8/nftest.tgz?dl=0ll

Also, the error is in read.table (it doesn't like an empty file). I guess we need to make a dummy file with zero-length read segments or something, or wrap each read with try().
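
A third option is to keep empty files away from the R script altogether. The sketch below assumes a POSIX shell and uses a hypothetical function name, `nonempty_files`. It relies only on the standard `[ -s file ]` test, which is true when a file exists and has a size greater than zero. It illustrates the guard idea and is not the fix that went into the pipeline:

```shell
# Hypothetical guard, not taken from the pipeline.
# Prints only the arguments that are existing files with size > 0,
# one per line, and reports every skipped file on stderr.
nonempty_files() {
    for f in "$@"; do
        if [ -s "$f" ]; then
            printf '%s\n' "$f"
        else
            printf 'skipping empty or missing file: %s\n' "$f" >&2
        fi
    done
}
```

If the function prints nothing, every sample has zero peaks and the QC step has nothing to plot.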

Ivo

Member

drpatelh commented Jun 2, 2020

Great! Thanks for the really nice dataset to reproduce the error @ivokwee 😎 Saved me a bit of work. I have added the --skip_peak_qc and --skip_peak_annotation options.

But as I mentioned, this is a more deep-rooted issue in the way that the pipeline is currently written. Basically, variables like replicatesExist and multipleGroups may be affected by the fact that some samples drop out due to having 0 peaks. This would need to be taken into account immediately after the MACS2 processes for all samples have completed. Will need to revisit this at some point.
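
To illustrate why the drop-outs matter, here is a rough sketch, not the pipeline's code, of recomputing the two flags once the peak counts are known. It rests on several assumptions: sample names look like `<group>_R<replicate>`, replicatesExist means that at least one group still has more than one sample, and multipleGroups means that more than one group remains. The function name `recompute_flags` and the tab-separated input format are hypothetical.

```shell
# Hypothetical sketch; the input format and flag semantics are assumptions.
# Reads "<sample><TAB><peak count>" lines on stdin, where <sample> is assumed
# to look like <group>_R<replicate>, ignores samples with 0 peaks, and prints
# both flags recomputed from the samples that remain.
recompute_flags() {
    awk -F '\t' '
        $2 > 0 {
            group = $1
            sub(/_R[0-9]+$/, "", group)      # strip the replicate suffix
            if (!(group in reps)) ngroups++  # first surviving sample of this group
            reps[group]++
            if (reps[group] > 1) replicates = 1
        }
        END {
            printf "replicatesExist=%s\n", (replicates ? "true" : "false")
            printf "multipleGroups=%s\n", (ngroups > 1 ? "true" : "false")
        }
    '
}
```

With two replicates of one group where one has 0 peaks, the design file alone would suggest that replicates exist, while the flags recomputed from the surviving samples say they do not.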

@j-andrews7

Another bump for this thread to say this has been a frequent issue for our group, as we frequently FLAG-tag proteins and do FLAG ChIPs but include samples with no FLAG proteins as controls for the antibody.

@drpatelh drpatelh added this to the 2.0 milestone Mar 2, 2022
@JoseEspinosa JoseEspinosa added the WIP Work in progress label Mar 30, 2022
@JoseEspinosa JoseEspinosa self-assigned this Mar 30, 2022
@JoseEspinosa
Member

@drpatelh proposed the following on Slack for this issue: if the files have 0 bytes, we may be able to do something like this

@JoseEspinosa
Member

This should have been fixed in #268. I tested it and it works in my hands, but it would be nice if someone else could give it a try. I will close the issue for now.
