In most runs AA_CNV_SEEDS.bed files are empty #30

Open
Yumo-Xie opened this issue Jan 5, 2023 · 7 comments
@Yumo-Xie

Yumo-Xie commented Jan 5, 2023

Hi,
I ran the program on WGS data from several cell lines. I started from .fastq files and used PrepareAA.py to generate CNV calls. However, all of these runs produced empty AA_CNV_SEEDS.bed files. I then tried the recommended GBM39 test data [https://www.ncbi.nlm.nih.gov/sra/SRX5055022[accn]]. This time the program found 2 amplicons (one with EGFR and the other with MYC and PVT1): GBM39_amplicon1.pdf and GBM39_amplicon2.pdf.
Is this result correct? Does it mean the program is working fine and the empty AA_CNV_SEEDS.bed files are due to the data I used?

@jluebeck
Member

jluebeck commented Jan 6, 2023

Hi,

Your GBM39 test results appear correct. An empty seeds bed file implies that no candidate regions of focal amplification were detected in those samples. There is also a finish_flag file, which you can check to see whether AmpliconSuite-pipeline completed successfully.

Thanks,
Jens
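
For anyone triaging many runs, the two checks described in this reply can be scripted. The following is a minimal sketch and not part of AmpliconSuite-pipeline: the function name `triage_run` is hypothetical, both file paths are passed in explicitly because the output layout may differ between versions, and the success string `All stages completed` is taken from the finish_flag output quoted in this thread. What the file contains after a failed run is not shown here, so any other content is simply reported back.

```python
from pathlib import Path


def triage_run(finish_flag: Path, seeds_bed: Path) -> str:
    """Summarize a run from its finish flag and its AA_CNV_SEEDS.bed file."""
    if not finish_flag.is_file():
        return "no finish flag: the run may have crashed or still be running"

    status = finish_flag.read_text().strip()
    # Assumption: a successful run writes exactly this message.
    if status != "All stages completed":
        return f"pipeline reported: {status!r}"

    if not seeds_bed.is_file():
        return "finished, but the seeds file is missing"

    # Count non-blank lines so a whitespace-only file still counts as empty.
    n_seeds = sum(1 for line in seeds_bed.read_text().splitlines() if line.strip())
    if n_seeds == 0:
        return "finished: no candidate focal amplifications"
    return f"finished: {n_seeds} seed region(s)"
```

A call such as `triage_run(Path("6605D_finish_flag.txt"), Path("6605D_AA_CNV_SEEDS.bed"))` separates "completed and found nothing" from "did not complete", which is the distinction that matters when a seeds file is empty.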

@Yumo-Xie
Author

Yumo-Xie commented Jan 8, 2023

Thank you very much! The program also worked well for COLO320DM. It seems that the empty files are attributable to my data.

@jingydz

jingydz commented Feb 9, 2023

Hi, my output file is also empty.
-rw-r--r-- 1 xxx 0 Feb 9 16:50 6605D_AA_CNV_SEEDS.bed
And my finish_flag file indicates that the run completed successfully.

$ cat 6605D_finish_flag.txt
All stages completed
$ cat ./6605D_AA_results/6605D_summary.txt
#Amplicons = 0
-----------------------------------------------------------------------------------------

6605D_AA_OUT]$ cat ./6605D_classification/6605D_amplicon_classification_profiles.tsv
sample_name     amplicon_number amplicon_decomposition_class    ecDNA+  BFB+    ecDNA_amplicons

I tried several other WGS files with the same results: no cycle files, png or pdf files, etc.
My command is /Parastor300s_G30S/zhangjj/software/miniconda3/bin/python3 /parastor300/work01/zhangjj/software/AmpliconSuite-pipeline/PrepareAA.py -s 6605D -t 50 --cnvkit_dir /parastor300/work01/zhangjj/software/cnvkit/cnvkit.py --bam 6605D.bam --ref GRCh38 --downsample 10.0 -o 6605D_AA_OUT --run_AA --run_AC
My WGS data is 30X. Is this problem due to downsampling to 10X or something else?
P.S. My data are from healthy people, not cancer patients.
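
The amplicon count in the summary file shown above can also be read programmatically. This is a minimal sketch that assumes the count appears on a line of the form `#Amplicons = N`, exactly as in the output pasted in this comment; the helper name `amplicon_count` is hypothetical and other versions of the summary file may be formatted differently.

```python
import re
from typing import Optional


def amplicon_count(summary_text: str) -> Optional[int]:
    """Return N from a '#Amplicons = N' line, or None if no such line exists."""
    match = re.search(r"^#Amplicons\s*=\s*(\d+)", summary_text, flags=re.MULTILINE)
    return int(match.group(1)) if match else None
```

For example, passing the text of `6605D_AA_results/6605D_summary.txt` as pasted above would return `0`, and `None` signals that the expected line was not found at all.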

@jluebeck
Member

jluebeck commented Feb 9, 2023

Hi,

Your outputs appear to be correct. Keep in mind that focal amplifications almost exclusively occur in cancer and pre-cancer samples. If you are providing samples from healthy patients to AmpliconSuite, and it does not find any focal amplifications, then this is completely expected.

If you would like to try a cancer cell line, I suggest you try COLO320DM.

Thanks,
Jens

@jingydz

jingydz commented Feb 10, 2023

Thanks. So far I have run the WGS data of 39 healthy people, and 4 of the resulting AA_CNV_SEEDS.bed files have content.
image
I also tried the COLO320DM cancer cell line, and it did find many focal amplifications, so the empty files are indeed due to my data. Thank you.
time /Parastor300s_G30S/zhangjj/software/miniconda3/bin/python3 /parastor300/work01/zhangjj/software/AmpliconSuite-pipeline/PrepareAA.py -s COLO320DM -t 10 --cnvkit_dir /parastor300/work01/zhangjj/software/cnvkit/cnvkit.py --fastqs COLO320DM_r1.fastq.gz COLO320DM_r2.fastq.gz --ref hg38 -o COLO320DM_AA_OUT --run_AA --run_AC
image
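
When a whole cohort has been processed, as with the 39 samples mentioned in this comment, the seeds files can be tallied in one pass. This is a minimal sketch under the assumption that every run leaves a file whose name ends in `_AA_CNV_SEEDS.bed`, as in the file names in this thread; the function name `samples_with_seeds` and the idea of scanning a single root directory are illustrative only.

```python
from pathlib import Path
from typing import Dict

SUFFIX = "_AA_CNV_SEEDS.bed"


def samples_with_seeds(root: Path) -> Dict[str, int]:
    """Map each sample name found under root to its number of seed regions."""
    counts: Dict[str, int] = {}
    for bed in sorted(root.rglob(f"*{SUFFIX}")):
        sample = bed.name[: -len(SUFFIX)]
        with bed.open() as handle:
            # Blank lines are ignored, so an empty file yields a count of 0.
            counts[sample] = sum(1 for line in handle if line.strip())
    return counts
```

Filtering the result with `{s: n for s, n in samples_with_seeds(Path(".")).items() if n > 0}` lists only the samples that have at least one candidate region.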

@iamyingzhou

> Hi,
>
> Your outputs appear to be correct. Keep in mind that focal amplifications almost exclusively occur in cancer and pre-cancer samples. If you are providing samples from healthy patients to AmpliconSuite, and it does not find any focal amplifications, then this is completely expected.
>
> If you would like to try a cancer cell line, I suggest you try COLO320DM.
>
> Thanks, Jens

Dear Jens,
Is it possible to detect extrachromosomal circular DNA (eccDNA) in plasma samples from patients with specific chronic diseases?
Thanks!

@jluebeck
Member

jluebeck commented Jun 8, 2023

Hi Yingzhou,

AA is designed to detect large (>10 kbp), focally amplified ecDNA. If the eccDNA in question are smaller, or if they are not amplified, then AA will very likely not detect them.

Thanks,
Jens
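
The size criterion in this reply can be applied as a rough pre-check to candidate intervals obtained from some other tool. This is a minimal sketch covering only the length condition: it does not assess amplification and does not predict what AA would report. The 10 kbp threshold comes from the reply above, the function name `intervals_over` is hypothetical, and the input is assumed to be standard BED lines with 0-based, half-open coordinates.

```python
def intervals_over(bed_lines, min_bp=10_000):
    """Yield (chrom, start, end) for BED intervals longer than min_bp."""
    for line in bed_lines:
        fields = line.split()
        # Skip blank lines, short lines, and BED header/comment lines.
        if len(fields) < 3 or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        # BED is half-open, so the interval length is simply end - start.
        if end - start > min_bp:
            yield chrom, start, end
```

Intervals that fail this check fall below the size range the reply describes, so an empty result from AA on such data would not be surprising.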

jluebeck added the FAQ label Jul 25, 2023