Finish AutoBA1BS1BS2PS4PM2 #122

Closed
gromdimon opened this issue May 30, 2024 · 1 comment · Fixed by #131
@gromdimon
Collaborator

Is your feature request related to a problem? Please describe.
We've implemented #71. Now we need to finish it.

Describe the solution you'd like

  • Implement methods in the class
  • Add unit tests
  • Add integration tests
  • Add docstrings


Additional context
Here is some information on these criteria:

PS4 (prevalence)

No automation has been implemented.

Original Definition

The prevalence of the variant in affected individuals is significantly increased compared to the prevalence in controls

Note 1: Relative risk (RR) or odds ratio (OR), as obtained from case-control studies, is >5.0 and the confidence interval around the estimate of RR or OR does not include 1.0. See manuscript for detailed guidance.

Note 2: In instances of very rare variants where case-control studies may not reach statistical significance, the prior observation of the variant in multiple unrelated patients with the same phenotype, and its absence in controls, may be used as moderate level of evidence.

-- Richards et al. (2015); Table 4
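Since PS4 has no automation yet, here is a minimal sketch of how Note 1 could be checked from a 2x2 case-control table. It is an illustration only, not project code: the function name `ps4_odds_ratio`, its parameters, and the choice of a Wald (log-normal) 95% confidence interval are all assumptions.

```python
import math


def ps4_odds_ratio(case_carriers, case_noncarriers,
                   control_carriers, control_noncarriers, z=1.96):
    """Return (odds_ratio, ci_low, ci_high, note1_met) for a 2x2 table.

    Hypothetical helper: uses a Wald interval on the log odds ratio.
    Zero cells would need a continuity correction, which is not handled.
    """
    a, b = case_carriers, case_noncarriers
    c, d = control_carriers, control_noncarriers
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) for a 2x2 table.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    ci_low = math.exp(math.log(odds_ratio) - z * se)
    ci_high = math.exp(math.log(odds_ratio) + z * se)
    # Note 1: OR > 5.0 and the confidence interval does not include 1.0.
    note1_met = odds_ratio > 5.0 and ci_low > 1.0
    return odds_ratio, ci_low, ci_high, note1_met
```

For example, 20 carriers among 100 cases versus 2 carriers among 100 controls gives an odds ratio of 12.25 with a lower bound above 1.0, while 6 versus 1 carriers gives an odds ratio above 5.0 whose interval still includes 1.0.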

PM2

PM2_Supporting (absent from controls)

Original Definition

Absent from controls (or at extremely low frequency if recessive) in Exome Sequencing Project, 1000 Genomes or ExAC.

-- Richards et al. (2015); Table 4

Preconditions / Precomputations

  • Determine :ref:acmg_seqvars_criteria-inheritance for the gene.
  • Determine :ref:acmg_seqvars_criteria-frequency.
  • If the allele frequency is invalid then this criterion is skipped.

Implemented Criterion

  • If the variant is on a nuclear chromosome:
    • If the gene is marked as recessive or X-linked:
      • If the variant allele count is <=4 then this criterion is triggered.
    • If the gene is marked as dominant:
      • If the homozygous allele count is <=1 then this criterion is triggered.
      • If the allele frequency is less than 0.0001 then this criterion is triggered.
  • If the variant is on chrMT:
    • If the variant frequency is below 0.00002 (0.002%, i.e. 1/50,000) then this criterion is triggered.
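The PM2 rules above could be sketched roughly as follows. This is not the project's implementation: the function name, the argument names, the set-based encoding of inheritance modes, and the reading that either dominant bullet is sufficient on its own are assumptions made for illustration.

```python
# Thresholds taken from the criterion description above.
NUCLEAR_AC_MAX = 4        # recessive / X-linked: allele count <= 4
DOMINANT_HOM_MAX = 1      # dominant: homozygous allele count <= 1
DOMINANT_AF_MAX = 0.0001  # dominant: allele frequency < 0.0001
CHRMT_AF_MAX = 0.00002    # chrMT: frequency < 1/50,000


def pm2_supporting(chrom, modes, allele_count, hom_count, allele_freq):
    """Return True/False, or None if the criterion is skipped.

    `modes` is a set such as {"recessive"}, {"dominant"} or both
    (hypothetical encoding of the gene's mode of inheritance).
    """
    # Precondition: an invalid allele frequency skips the criterion.
    if allele_freq is None or not 0.0 <= allele_freq <= 1.0:
        return None
    if chrom == "chrMT":
        return allele_freq < CHRMT_AF_MAX
    triggered = False
    if modes & {"recessive", "x-linked"}:
        triggered = triggered or allele_count <= NUCLEAR_AC_MAX
    if "dominant" in modes:
        # Assumption: either bullet alone triggers the criterion.
        triggered = (triggered
                     or hom_count <= DOMINANT_HOM_MAX
                     or allele_freq < DOMINANT_AF_MAX)
    return triggered
```

Returning `None` for the skipped case keeps "not evaluated" distinct from "evaluated and failed", which matters for the user report.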

User Report

  • The values and thresholds used by the criterion even if failed.

Literature

  • Richards et al. (2015) describes the original criterion.
  • ClinGen Sequence Variant Interpretation Work Group (2020): SVI Recommendation for Absence/Rarity (PM2) - Version 1.0 describes the downgrade to supporting.
  • McCormick et al. (2020) describe the ACMG criteria for chrMT variants.

Caveats

  • We currently use the threshold from PMID:30376034 (https://pubmed.ncbi.nlm.nih.gov/30376034/) and are lacking our own calibration.
  • This criterion has been downgraded by default from moderate to supporting in accordance with ClinGen Sequence Variant Interpretation Work Group (2020): SVI Recommendation for Absence/Rarity (PM2) - Version 1.0.

BA1

BA1 (5% frequency)

Original Definition

Allele frequency is >5% in Exome Sequencing Project, 1000 Genomes Project, or Exome Aggregation Consortium

-- Richards et al. (2015); Table 4

Preconditions / Precomputations

  • The variant is absent from the exception list from Ghosh et al. (2018).
    If the variant is present on this list, then this criterion is skipped.

Implemented Criterion

  • If the variant is nuclear (not on chrMT)
    • If the allele frequency is above 0.05 in gnomAD global population then this criterion is triggered.
  • else (the variant is on chrMT)
    • If the allele frequency is above 0.01 in gnomAD-mtDNA global population then this criterion is triggered.
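A minimal sketch of the BA1 decision, assuming the exception-list lookup has already been done elsewhere and is passed in as a boolean. The function and parameter names are hypothetical.

```python
BA1_NUCLEAR_AF = 0.05  # gnomAD global population
BA1_CHRMT_AF = 0.01    # gnomAD-mtDNA global population


def ba1(chrom, allele_freq, on_exception_list=False):
    """Return True/False, or None if the criterion is skipped."""
    # Precondition: variants on the Ghosh et al. (2018) exception
    # list are skipped.
    if on_exception_list:
        return None
    threshold = BA1_CHRMT_AF if chrom == "chrMT" else BA1_NUCLEAR_AF
    # "Above" is read as strictly greater than the threshold.
    return allele_freq > threshold
```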

User Report

  • The variant frequency.

Literature

  • Richards et al. (2015) describes the 5% allele frequency threshold.
  • Ghosh et al. (2018) introduce the exception list and ClinGen maintains it.
  • McCormick et al. (2020) describe the 1% allele frequency threshold as appropriate for chrMT variants.

Caveats

  • The exception "However, there must be no additional conflicting evidence to support pathogenicity, such as a novel occurrence in a certain haplogroup" from McCormick et al. (2020) is not implemented yet.

BS1

BS1 (expected frequency)

Original Definition

Allele frequency greater than expected for disorder.

-- Richards et al. (2015); Table 4

Preconditions / Precomputations

  • Determine :ref:acmg_seqvars_criteria-frequency.
  • If the allele frequency is invalid then this criterion is skipped.

Implemented Criterion

  • If the variant is on a nuclear chromosome and the user provided a maximal credible population frequency:
    • If the FAF from gnomAD is above the maximal credible population frequency then this criterion is triggered.
  • If the variant is on chrMT:
    • If the population frequency is above 0.5% then this criterion is triggered in accordance with McCormick et al. (2020).
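The BS1 rules could look roughly like this. The names are hypothetical, and the sketch assumes that a nuclear variant without a user-provided maximal credible population frequency simply does not trigger the criterion.

```python
BS1_CHRMT_AF = 0.005  # 0.5%, McCormick et al. (2020)


def bs1(chrom, frequency, max_credible_freq=None):
    """Return True/False, or None if the criterion is skipped.

    `frequency` is the gnomAD FAF for nuclear variants and the
    population frequency for chrMT variants.
    """
    # Precondition: an invalid frequency skips the criterion.
    if frequency is None or not 0.0 <= frequency <= 1.0:
        return None
    if chrom == "chrMT":
        return frequency > BS1_CHRMT_AF
    # Assumption: without a user-provided threshold, nuclear
    # variants do not trigger BS1.
    if max_credible_freq is None:
        return False
    return frequency > max_credible_freq
```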

User Report

  • The variant frequency and the user-specified maximal credible population frequency for nuclear variants.
  • The variant frequency and the 0.5% threshold for chrMT variants.

Literature

  • Richards et al. (2015) describes the original criterion without thresholds.
  • Gudmundsson et al. (2022) describe the FAF threshold provided by gnomAD.
  • McCormick et al. (2020) describe the ACMG criteria for chrMT variants.

BS2

BS2 (healthy adult)

Original Definition

Observed in a healthy adult individual for a recessive (homozygous), dominant (heterozygous), or X-linked (hemizygous) disorder, with full penetrance expected at an early age.

-- Richards et al. (2015); Table 4

Preconditions / Precomputations

  • If the criterion BA1 was triggered then this criterion is skipped.
  • Determine :ref:acmg_seqvars_criteria-inheritance for the gene.
  • Determine :ref:acmg_seqvars_criteria-frequency.
  • If the allele frequency is invalid then this criterion is skipped.

Implemented Criterion

  • If the gene is marked as recessive or X-linked:
    • If the variant allele count is above 2 then this criterion is triggered.
  • If the gene is marked as dominant:
    • If the variant allele count is above 5 then this criterion is triggered.
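A sketch of the BS2 check under the preconditions listed above. The function name and the set-based encoding of inheritance modes are assumptions. Because a gene can be marked as both recessive and dominant, the sketch treats either branch as sufficient.

```python
BS2_RECESSIVE_AC = 2  # recessive / X-linked: allele count above 2
BS2_DOMINANT_AC = 5   # dominant: allele count above 5


def bs2(modes, allele_count, allele_freq, ba1_triggered=False):
    """Return True/False, or None if the criterion is skipped."""
    # Preconditions: BA1 triggered or invalid frequency skip BS2.
    if ba1_triggered:
        return None
    if allele_freq is None or not 0.0 <= allele_freq <= 1.0:
        return None
    triggered = False
    if modes & {"recessive", "x-linked"}:
        triggered = triggered or allele_count > BS2_RECESSIVE_AC
    if "dominant" in modes:
        triggered = triggered or allele_count > BS2_DOMINANT_AC
    return triggered
```

The caveat about "full penetrance" and early onset is not captured here and still has to be reviewed by the user.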

User Report

  • The variant frequency and the threshold used.

Literature

  • Chen et al. (2022), Karczewski et al. (2020), etc. describe gnomAD.
  • The modes of inheritance for the genes are taken from different sources as described in :ref:acmg_seqvars_criteria-inheritance.

Caveats

  • The conditions of "full penetrance" and "expected at an early age" need to be checked by the user.

Notes

  • Genes can be marked as both recessive and dominant.
  • We use the thresholds from PMID:30376034 (https://pubmed.ncbi.nlm.nih.gov/30376034/).

InterVar

BA1, BS1, BS2, PS4, and PM2 by Automated Scoring
The alternative allele frequencies (AAFs) in control populations are useful for scoring the pathogenicity of variants, given that frequently occurring variants in the population are unlikely to cause rare diseases. We retrieved information on disease prevalence from OrphaNet and translated OrphaNet identifiers into OMIM identifiers. Here, we used three datasets to assess the variant frequency: the NHLBI Exome Sequencing Project (ESP6500), 1000 Genomes Project, and ExAC Browser. If any of the AAFs in any database is >5%, BA1 will be assigned as 1. If the AAF in the ExAC Browser is greater than expected for the disorder caused by mutations in the corresponding gene, BS1 will be assigned as 1 (here, we set a default cutoff as 1% for rare disease, but users can specify their own cutoff in the configuration file of InterVar). If a variant is observed in a healthy adult in the 1000 Genomes Project as a homozygote (for diseases defined as recessive in OMIM) or as a heterozygote otherwise, then BS2 will be applied. We manually removed known major adult-onset disorders from consideration. We did not use the ExAC Browser or ESP6500 here because these datasets can contain variants from individuals with various diseases.
Variants that are absent or are present at extremely low frequencies in a large control cohort could represent moderate evidence for pathogenicity. If a variant that is responsible for dominant diseases is absent in all control subjects from ESP6500, 1000 Genomes Project, and the ExAC Browser, PM2 will be applied. If the variant causes recessive diseases and has a very low frequency with AAF < 0.5%, then PM2 can also be applied. Information on the gene-disease relationship, such as dominance or recessiveness, is obtained from OMIM.
In some cases, pathogenic variants have a significantly higher frequency in affected subjects than in control subjects. To handle these variants, we also cataloged all variants with an odds ratio (OR) > 5.0 from GWASdb [34] version 2. For these variants, PS4 will be applied. For some rare variants where case-control studies might not reach statistical significance, PS4 also can be downgraded to a moderate level during the manual adjustment step.
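For comparison with our own thresholds, the InterVar frequency rules described above can be paraphrased as a short sketch. This is not InterVar's code: the function name, the dictionary layout, and the reading that the recessive PM2 cutoff applies to every dataset in which the variant is observed are assumptions.

```python
def intervar_frequency_flags(aafs, recessive, bs1_cutoff=0.01):
    """Return 0/1 flags for BA1, BS1 and PM2.

    `aafs` maps a dataset name ("ESP6500", "1000G", "ExAC") to the
    AAF, with None meaning the variant is absent from that dataset.
    """
    observed = [f for f in aafs.values() if f is not None and f > 0]
    # BA1: any AAF in any database above 5%.
    ba1 = any(f > 0.05 for f in observed)
    # BS1: ExAC AAF above the cutoff (default 1% for rare disease).
    exac = aafs.get("ExAC")
    bs1 = exac is not None and exac > bs1_cutoff
    if recessive:
        # PM2 (recessive): very low frequency, AAF < 0.5%.
        pm2 = all(f < 0.005 for f in observed)
    else:
        # PM2 (dominant): absent from all control datasets.
        pm2 = not observed
    return {"BA1": int(ba1), "BS1": int(bs1), "PM2": int(pm2)}
```

Note that the InterVar PM2 rule for recessive genes is frequency-based (AAF < 0.5%), whereas the PM2 rule described earlier in this issue is based on allele counts.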

@gromdimon gromdimon added the enhancement New feature or request label May 30, 2024
@gromdimon gromdimon self-assigned this May 30, 2024
@holtgrewe
Member
