Enquiry on consistent segments amplified/deleted across most samples #202

Open
michelle-jahja opened this issue Oct 1, 2024 · 3 comments


@michelle-jahja

Hi, I was wondering if this is an issue that has been seen before. I am new to facets and copy number calling so I would appreciate any help/advice.

I have been running FACETS on several samples (WES and targeted sequencing with matched normals) from a public dataset. I noticed that in these samples, certain chromosomes, or certain regions within chromosomes, were called as amplified or deleted even though the log-ratio does not clearly support an amplification. I have attached images as examples.

These regions (chromosomes 16, 17, 19) are called as amplifications in almost all samples (~170 samples). This 1) does not make biological sense, as these chromosomes are usually not amplified in these cases, and 2) contradicts the karyotype information for each sample, according to which these chromosomes should not be amplified.

I was wondering if this is a sample processing issue, or if there is a parameter in facets that can control for/account for these slight shifts in log ratio that shouldn't be called amplifications?

Could it be due to different read depths between matched normals vs tumor samples? I have also attached histograms of tumor and normal read depth of one sample - although this difference is not seen in all samples.

Here are the current parameters for running facets on WES and targeted sequencing.
# for WES
xx <- preProcSample(rcmat, gbuild = "hg38", ndepth = 35, cval = 150, snp.nbhd = 250)
oo <- procSample(xx, cval = 500)

# for targeted
xx <- preProcSample(rcmat, gbuild = "hg38", ndepth = 35, cval = 150, snp.nbhd = 150)
oo <- procSample(xx, cval = 500)

Thank you and very grateful for any thoughts/opinions!

[Attachments: three screenshots dated 2024-10-01]

@veseshan
Collaborator

veseshan commented Nov 4, 2024

This seems like a sequencing artefact. The log-ratio for total copy number is clearly shifted, but the shift is very small and shows no allelic imbalance. I wonder why the tumor alone has higher coverage than the normal on those chromosomes (16, 17 and 19).

@michelle-jahja
Author

Thanks for the response. Given that these are likely sequencing artefacts, do you have any suggestions on how I could account for them when visualising my data? Would it be right to label these segments as 'diploid' or 'artefact' on the condition that the log-odds ratio (mafR) is below a certain threshold? Happy to hear your thoughts - thanks!

@veseshan
Collaborator

veseshan commented Nov 6, 2024

You can look at the variation in total coverage (normalized to, say, 1 million reads) in the normals by chromosome. That can give some idea of the expected variation.
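
One way to read this suggestion is sketched below in base R. It is only an illustration under assumptions: the helper name `normal_cov_per_million` is hypothetical, the input is assumed to be a data frame with `Chromosome` and `NOR.DP` columns (as in the `rcmat` read-count matrix passed to `preProcSample`), depth summed over SNP loci is used as a proxy for total coverage, and the toy numbers are made up.

```r
# Hypothetical helper: per-chromosome share of normal depth, scaled so that
# each sample sums to 1 million. Assumes columns Chromosome and NOR.DP.
normal_cov_per_million <- function(rcmat) {
  by_chrom <- vapply(split(rcmat$NOR.DP, rcmat$Chromosome), sum, numeric(1))
  by_chrom / sum(by_chrom) * 1e6
}

# Toy stand-ins for the normals of two samples (made-up numbers, not real data).
rcmats <- list(
  s1 = data.frame(Chromosome = c("1", "1", "16", "17"),
                  NOR.DP = c(40, 60, 50, 50)),
  s2 = data.frame(Chromosome = c("1", "1", "16", "17"),
                  NOR.DP = c(30, 50, 60, 60))
)

# Rows = chromosomes, columns = samples.
# This assumes every sample covers the same set of chromosomes.
cov_mat <- sapply(rcmats, normal_cov_per_million)

# Spread of normalized coverage across the normals, per chromosome.
cov_summary <- data.frame(
  chrom = rownames(cov_mat),
  mean  = rowMeans(cov_mat),
  cv    = apply(cov_mat, 1, sd) / rowMeans(cov_mat)
)
print(cov_summary)
```

Running the same summary on the tumors and comparing it with the normals, chromosome by chromosome, would show whether the shift on chromosomes 16, 17 and 19 falls within the variation already present among the normals.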
