Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CalculateGenotypePostiors minor updates to javadoc and logger type #5601

Merged
merged 2 commits into from
Jan 30, 2019

Conversation

sooheelee
Copy link
Contributor

  • Clarify tool documentation:
    • Update joint posteriors (JL) to joint posteriors (JP)
    • Remove statistical notes and provide link to GATK Article#11074 for background and math
    • Consolidate Notes and Caveats sections
    • Clarify at top the three different sources of priors and tool behavior regarding these
    • Clarify for family priors the tool only considers trio groups
    • Add Laura's comment that recent updates allow the tool to appropriately apply priors to indels
  • Change logger.info to logger.warn for situation where trio pedigree file is incomplete
    • Note that in this situation, in the absence of other refinement, the results are identical to the input

- Clarify tool documentation:
    - Remove statistical notes and provide link to GATK Article#11074 for background and math
    - Consolidate Notes and Caveats sections
    - Clarify at top the three different sources of priors and tool behavior regarding these
    - Clarify for family priors the tool only considers trio groups
    - Add Laura's comment that recent updates allow the tool to appropriately apply priors to indels
- Change logger.info to logger.warn for situation where trio pedigree file is incomplete
    - Note that in this situation, in the absence of other refinement, the results are identical to the input
@sooheelee
Copy link
Contributor Author

Thank you at @ldgauthier for volunteering to review and for all of the feedback. The Travis checks are still in progress but the only code change is with the logger type.

@codecov-io
Copy link

codecov-io commented Jan 23, 2019

Codecov Report

Merging #5601 into master will decrease coverage by 0.026%.
The diff coverage is 100%.

@@               Coverage Diff               @@
##              master     #5601       +/-   ##
===============================================
- Coverage     87.047%   87.021%   -0.026%     
- Complexity     31525     31624       +99     
===============================================
  Files           1928      1934        +6     
  Lines         145343    145816      +473     
  Branches       16090     16107       +17     
===============================================
+ Hits          126517    126891      +374     
- Misses         12966     13048       +82     
- Partials        5860      5877       +17
Impacted Files Coverage Δ Complexity Δ
...kers/variantutils/CalculateGenotypePosteriors.java 92.857% <100%> (ø) 17 <0> (ø) ⬇️
...walkers/genotyper/afcalc/AFCalculatorProvider.java 22.222% <0%> (-44.444%) 2% <0%> (-2%)
...notyper/afcalc/ConcurrentAFCalculatorProvider.java 50% <0%> (-33.333%) 1% <0%> (-1%)
...nder/utils/downsampling/PositionalDownsampler.java 88.462% <0%> (-11.538%) 22% <0%> (+1%)
...er/engine/spark/datasources/VariantsSparkSink.java 78.125% <0%> (-11.53%) 8% <0%> (-1%)
...broadinstitute/hellbender/engine/FeatureInput.java 88.235% <0%> (-4.82%) 19% <0%> (+1%)
...r/tools/walkers/mutect/Mutect2FilteringEngine.java 80.743% <0%> (-4.581%) 89% <0%> (ø)
...rs/genotyper/afcalc/FixedAFCalculatorProvider.java 88.889% <0%> (-3.704%) 12% <0%> (ø)
...ute/hellbender/utils/test/FuncotatorTestUtils.java 95.161% <0%> (-3.084%) 7% <0%> (+1%)
...GATKPlugin/GATKReadFilterPluginDescriptorTest.java 88.62% <0%> (-1.76%) 48% <0%> (+1%)
... and 64 more

@sooheelee
Copy link
Contributor Author

Testing with the branch gives the expected WARN.

10:36:48.710 WARN  CalculateGenotypePosteriors - No PED file passed or no *non-skipped* trios found in PED file. Skipping family priors.

The WARN is in the middle of the INFO fields and not separate at the bottom of the stdout:
screenshot 2019-01-24 10 37 19

Copy link
Contributor

@ldgauthier ldgauthier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some clarifications.

* independently. This assumption of independent draws follows from the assumption of Hardy-Weinberg equilibrium (HWE).
* Thus, HWE is imposed on the likelihoods as a result of CalculateGenotypePosteriors.</p>
* <p>
* The tool uses priors from three different data sources: (i) one or more supporting germline population callsets
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we say something more like "The tool can use priors from three different data sources"? Most cases won't use all three.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are the annotations, specifically?

The tool will use MLEAC if available or AC if MLEAC is not provided. AN is also required unless genotypes are provided for all samples.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ldgauthier, the tool document already covers the MLEAC/AC annotations in the bullet point on L51.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changing 'The tool uses priors from' to 'The tool can use priors from'. Will add in the specific annotations exactly as you word it in a section below as it is much clearer than the current bullet point.

*
* <h3>Inputs</h3>
* <p>
* <ul>
* <li>A VCF with genotype likelihoods, and optionally genotypes, AC/AN fields, or MLEAC/AN fields.</li>
* <li>(Optional) A PED pedigree file containing the description of the relationships between individuals.</li>
* <li>(Optional) A PED pedigree file containing the description of the relationships between individuals. Only
* trio groups are considered in the calculations.</li>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we be absolutely specific that we mean mother-father-offspring trios?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is meant by trios is I think clear. However, I see now we do not mention this anywhere and so I will add this.

@@ -80,6 +86,17 @@
* For any non-SNP sites in the input callset, flat priors are applied.
* </p>
*
* <p>
* For the latest versions of the tool in which this message appears, the tool appropriately applies priors to indels.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as of 4.0.5.0

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added. Thank you.

@ldgauthier
Copy link
Contributor

Back to you @sooheelee

- Add tool order of ingestion of annotations MLEAC vs AC and detail on AN requirement
- Specifically define that trios are mother-father-offspring
- Add version that appropriately applies priors to indels (4.0.5.0)
- Smooth writing
@sooheelee
Copy link
Contributor Author

I've incorporated your feedback @ldgauthier.

Copy link
Contributor

@ldgauthier ldgauthier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Sorry I missed this in the flurry of github emails yesterday.

@sooheelee sooheelee merged commit 78df6b2 into master Jan 30, 2019
@sooheelee sooheelee deleted the shlee_cgp branch January 30, 2019 22:44
@sooheelee
Copy link
Contributor Author

No worries Laura. Thanks again for the feedback.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants