Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

baf fixes #7861

Merged
merged 1 commit into from
Jun 21, 2022
Merged

baf fixes #7861

merged 1 commit into from
Jun 21, 2022

Conversation

tedsharpe
Copy link
Contributor

  1. Allows negative BAFs (so that we can process legacy files).
  2. Skips same-site vcf records so that we can glide over the redundancies in the dbsnp sites vcf.

@tedsharpe tedsharpe requested a review from mwalker174 May 19, 2022 15:14
@codecov
Copy link

codecov bot commented May 19, 2022

Codecov Report

Merging #7861 (2e27b12) into master (add34e1) will increase coverage by 0.062%.
The diff coverage is 79.736%.

❗ Current head 2e27b12 differs from pull request most recent head 6e94ccd. Consider uploading reports for the commit 6e94ccd to get more accurate results

@@               Coverage Diff               @@
##              master     #7861       +/-   ##
===============================================
+ Coverage     86.930%   86.992%   +0.062%     
- Complexity     36954     36960        +6     
===============================================
  Files           2221      2221               
  Lines         173829    173904       +75     
  Branches       18779     18784        +5     
===============================================
+ Hits          151110    151283      +173     
+ Misses         16087     15982      -105     
- Partials        6632      6639        +7     
Impacted Files Coverage Δ
...roadinstitute/hellbender/tools/sv/BafEvidence.java 70.000% <ø> (+2.258%) ⬆️
...institute/hellbender/tools/sv/PrintSVEvidence.java 77.500% <ø> (ø)
...ellbender/tools/DumpTabixIndexIntegrationTest.java 100.000% <ø> (ø)
...ute/hellbender/utils/codecs/SiteDepthBCICodec.java 10.714% <10.714%> (ø)
.../broadinstitute/hellbender/tools/sv/SiteDepth.java 62.857% <33.333%> (ø)
...titute/hellbender/utils/codecs/SiteDepthCodec.java 74.194% <75.000%> (ø)
...dinstitute/hellbender/tools/sv/SiteDepthtoBAF.java 82.418% <80.952%> (ø)
...hellbender/tools/walkers/sv/CollectSVEvidence.java 79.878% <86.207%> (+29.058%) ⬆️
...er/tools/walkers/sv/CollectSVEvidenceUnitTest.java 97.521% <97.500%> (-0.010%) ⬇️
...itute/hellbender/tools/sv/SiteDepthSortMerger.java 100.000% <100.000%> (ø)
... and 15 more

Copy link
Contributor

@mwalker174 mwalker174 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, looks good! It just occurred to me that we're missing some testing for BAF collection. Otherwise, minor comments.

Arrays.sort(vals);
final int midIdx = nBafs / 2;
final double median =
(nBafs & 1) != 0 ? vals[midIdx] : (vals[midIdx] + vals[midIdx - 1])/2.;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just out of curiosity, what's the & 1 for?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(i & 1) is the way that 133t h4x0r d00dz such as myself test for odd integers. will change to (i % 2).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see it now

Comment on lines -23 to -24
Utils.validateArg(value >= 0. || value == MISSING_VALUE,
"value must be non-negative or missing");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add tests for values 0 and <0?

@@ -591,7 +591,7 @@ public AlleleCounter( final SAMSequenceDictionary dict,
final FeatureDataSource<VariantContext> snpSource =
new FeatureDataSource<>(inputPath.toPath().toString());
dict.assertSameDictionary(snpSource.getSequenceDictionary());
this.snpSourceItr = snpSource.iterator();
this.snpSourceItr = new BAFSiteIterator(snpSource.iterator());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like we're missing tests for BAF, can you add unit/integration tests?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The BAFEvidence unit test also tests the codecs by making round trips. I could add an integration test for each codec. Would that suffice?

@mwalker174
Copy link
Contributor

Actually, a second thing just popped into my head. The name LocusDepth makes perfect sense but the abbreviation LD is overloaded with Linkage Disequilibrium. Not a huge problem in context, but it's likely the abbreviation will make its way into VCF annotations in the future, which could create confusion.

Is there another name we could use? I think SiteDepth or BaseDepth are also clear and unambiguous. @cwhelan any thoughts?

@cwhelan
Copy link
Member

cwhelan commented Jun 3, 2022

SiteDepth seems like it would work well in this context.

@tedsharpe
Copy link
Contributor Author

Regarding finding a new name for LocusDepth: I'm fine with SiteDepth, but it seems to me that what really makes this class different from DepthEvidence, for example, is that it measures the read depth for each possible call. Should it be AlleleDepth?

@tedsharpe
Copy link
Contributor Author

I believe the most recent commit addresses all review comments except for "missing tests for BAF".

@gatk-bot
Copy link

gatk-bot commented Jun 6, 2022

Github actions tests reported job failures from actions build 2448670766
Failures in the following jobs:

Test Type JDK Job ID Logs
integration 11 2448670766.12 logs
integration 8 2448670766.0 logs

@mwalker174
Copy link
Contributor

Regarding finding a new name for LocusDepth: I'm fine with SiteDepth, but it seems to me that what really makes this class different from DepthEvidence, for example, is that it measures the read depth for each possible call. Should it be AlleleDepth?

I think that makes sense too but could be confusing mainly because AlleleDepth is already a haplotypecaller VCF field AD, which refers to called alleles rather than all possible alleles and also can refer to non-SNP alleles.

@tedsharpe
Copy link
Contributor Author

I think this is ready to go. Differences of opinion welcomed.

@tedsharpe
Copy link
Contributor Author

The coverage issues were, I believe, due to failing tests.

@tedsharpe
Copy link
Contributor Author

I believe this is ready now. Added a couple of unit tests on the AlleleCounter. @mwalker174: Can I get another quick look from you?

Copy link
Contributor

@mwalker174 mwalker174 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! A couple of tiny comments and then good to merge

summary = "Convert LocusDepth to BafEvidence",
oneLineSummary = "Convert LocusDepth to BafEvidence",
summary = "Convert SiteDepth to BafEvidence",
oneLineSummary = "Convert SiteDepth to BafEvidence",
programGroup = StructuralVariantDiscoveryProgramGroup.class
)
@ExperimentalFeature
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
@ExperimentalFeature
@ExperimentalFeature
@DocumentedFeature

/** Codec to handle SiteDepths in BlockCompressedInterval files */
public class SiteDepthBCICodec extends AbstractBCICodec<SiteDepth> {
private boolean versionChecked = false;
private static final String LD_BCI_FILE_EXTENSION = ".sd.bci";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
private static final String LD_BCI_FILE_EXTENSION = ".sd.bci";
private static final String SD_BCI_FILE_EXTENSION = ".sd.bci";

public class LocusDepthtoBAFIntegrationTest extends CommandLineProgramTest {
public static final String ld2bafTestDir = toolsTestDir + "walkers/sv/LocusDepthtoBAF/";
public class SiteDepthtoBAFIntegrationTest extends CommandLineProgramTest {
public static final String ld2bafTestDir = toolsTestDir + "walkers/sv/SiteDepthToBAF/";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
public static final String ld2bafTestDir = toolsTestDir + "walkers/sv/SiteDepthToBAF/";
public static final String sd2bafTestDir = toolsTestDir + "walkers/sv/SiteDepthToBAF/";

ld files become sd files
@tedsharpe tedsharpe merged commit 6170e83 into master Jun 21, 2022
@tedsharpe tedsharpe deleted the tws_bafFixes branch June 21, 2022 18:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants