Data issue: incorrect stop positions #1897

Open
rmadupuri opened this issue Aug 23, 2023 · 0 comments

For SNPs and the other substitution types (DNP, TNP, MNP, ONP):

select cs.cancer_study_identifier, me.*
from mutation_event as me
join mutation as m on me.mutation_event_id = m.mutation_event_id
join genetic_profile as gp on m.genetic_profile_id = gp.genetic_profile_id
join cancer_study as cs on gp.cancer_study_id = cs.cancer_study_id
where variant_type in ('SNP', 'DNP', 'TNP', 'MNP', 'ONP')
and end_position != start_position + length(tumor_seq_allele)-1;

Most of these variants have a common prefix shared by the reference and tumor alleles. Once that prefix is removed they are really insertions (INS) or deletions (DEL), but they are annotated as SNPs.

metastatic_solid_tumors_mich_2017 36
prad_fhcrc 10
sclc_cancercell_gardner_2017 8
msk_impact_2017 5
acc_2019 4
histiocytosis_cobi_msk_2019 3
mrt_bcgsc_2016 2
msk_ch_2020 2
pan_origimed_2020 2
skcm_yale 2
tmb_mskcc_2018 2
acyc_mda_2015 1
luad_tsp 1
mds_iwg_2022 1
mds_mskcc_2020 1
msk_met_2021 1
pog570_bcgsc_2020 1
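
The common-prefix observation above can be sketched in Python. This is only an illustration, not the curation code used for these studies: `normalize` is a hypothetical helper, the `COMPLEX` label is made up for alleles of unequal length that are both non-empty after trimming, and the coordinates assume the convention implied by the queries in this issue (an INS spans the two flanking bases, a DEL spans the deleted bases). It trims only a shared prefix, not a shared suffix.

```python
def normalize(start, reference_allele, tumor_seq_allele):
    """Trim the shared prefix and return (variant_type, start, end, ref, alt).

    Assumes both alleles are non-empty strings of bases, as in the
    records flagged by the SNP query. Hypothetical helper.
    """
    ref, alt = reference_allele, tumor_seq_allele
    if ref in ("", "-") or alt in ("", "-"):
        raise ValueError("expects non-empty alleles")

    # Count the leading bases that both alleles share.
    n = 0
    while n < len(ref) and n < len(alt) and ref[n] == alt[n]:
        n += 1
    ref, alt = ref[n:], alt[n:]

    if not ref and not alt:
        raise ValueError("reference and tumor alleles are identical")

    if not ref:
        # Pure insertion: start is the last shared base, end is the next base.
        return ("INS", start + n - 1, start + n, "-", alt)

    new_start = start + n
    new_end = new_start + len(ref) - 1
    if not alt:
        # Pure deletion: the span covers exactly the deleted bases.
        return ("DEL", new_start, new_end, ref, "-")

    if len(ref) == len(alt):
        variant_type = {1: "SNP", 2: "DNP", 3: "TNP"}.get(len(ref), "ONP")
    else:
        variant_type = "COMPLEX"  # made-up label, not a value from the database
    return (variant_type, new_start, new_end, ref, alt)


# A record annotated as SNP with alleles A > AT is really an insertion of T.
print(normalize(100, "A", "AT"))
```

Records for which this returns `INS` or `DEL` while the stored `variant_type` is a substitution type are the kind of mismatch the query above surfaces.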

For INS type: 93 studies affected.

select cs.cancer_study_identifier, me.*
from mutation_event as me
join mutation as m on me.mutation_event_id = m.mutation_event_id
join genetic_profile as gp on m.genetic_profile_id = gp.genetic_profile_id
join cancer_study as cs on gp.cancer_study_id = cs.cancer_study_id
where variant_type = 'INS'
and end_position != start_position + 1;

For DELs:

select cs.cancer_study_identifier, me.*
from mutation_event as me
join mutation as m on me.mutation_event_id = m.mutation_event_id
join genetic_profile as gp on m.genetic_profile_id = gp.genetic_profile_id
join cancer_study as cs on gp.cancer_study_id = cs.cancer_study_id
where variant_type = 'DEL'
and end_position != start_position + length(reference_allele)-1;

coadread_tcga_pan_can_atlas_2018 6
cscc_ucsf_2021 3
stad_tcga_pan_can_atlas_2018 3
gbm_tcga_pan_can_atlas_2018 2
lusc_cptac_2021 2
all_stjude_2015 1
blca_tcga_pan_can_atlas_2018 1
cesc_tcga_pan_can_atlas_2018 1
crc_msk_2017 1
hnsc_tcga_pan_can_atlas_2018 1
mbl_dkfz_2017 1
ov_tcga_pan_can_atlas_2018 1
sarcoma_mskcc_2022 1
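
For reference, the three checks above reduce to one rule per variant type. The sketch below restates the `where` clauses in Python so the rule can be applied to a single record outside the database. The function names are hypothetical, and the parameter names simply mirror the `mutation_event` columns used in the queries.

```python
SUBSTITUTION_TYPES = {"SNP", "DNP", "TNP", "MNP", "ONP"}


def expected_end_position(variant_type, start_position,
                          reference_allele, tumor_seq_allele):
    """Return the end position the queries in this issue treat as correct."""
    if variant_type in SUBSTITUTION_TYPES:
        # Substitutions span as many bases as the tumor allele.
        return start_position + len(tumor_seq_allele) - 1
    if variant_type == "INS":
        # Insertions are anchored on the two flanking bases.
        return start_position + 1
    if variant_type == "DEL":
        # Deletions span exactly the deleted reference bases.
        return start_position + len(reference_allele) - 1
    raise ValueError(f"unsupported variant type: {variant_type}")


def has_incorrect_end(variant_type, start_position, end_position,
                      reference_allele, tumor_seq_allele):
    """True when a record would be returned by the matching query."""
    expected = expected_end_position(
        variant_type, start_position, reference_allele, tumor_seq_allele
    )
    return end_position != expected


# A record annotated as SNP with alleles A > AT and end == start is flagged.
print(has_incorrect_end("SNP", 100, 100, "A", "AT"))
```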
