-
Notifications
You must be signed in to change notification settings - Fork 592
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixed a bug in AlleleFiltering that ignored more than a single sample #8841
Fixed a bug in AlleleFiltering that ignored more than a single sample #8841
Conversation
@@ -57,14 +60,14 @@ int getAlleleLikelihoodVsInverse(final AlleleLikelihoods<GATKRead, Allele> allel | |||
|
|||
final GenotypingLikelihoods<Allele> genotypingLikelihoods = genotypesModel.calculateLikelihoods(alleleList, | |||
genotypingData, null, 0, null); | |||
AFCalculationResult af = afCalc.fastCalculateDiploidBasedOnGLs(genotypingLikelihoods, genotypingEngine.getPloidyModel().totalPloidy()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed also some lines that were never used
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not super familiar with this part of the code, but it looks like this change makes sense in that it's conservatively providing the lowest PL across all samples. Because of this lack of confidence I wanted to double check that I understand how the test here works.
for (int i = 0; i < genotypingLikelihoods.numberOfSamples(); i++) { | ||
final int[] pls = genotypingLikelihoods.sampleLikelihoods(i).getAsPLs(); | ||
perSamplePLs.add(Math.min(pls[1] - pls[0], pls[2] - pls[0])); | ||
logger.debug(() -> String.format("GAL:: %s: %d %d %d", allele.toString(), pls[0], pls[1], pls[2])); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that you have multiple samples here you might want to add the sample name to the debug logger
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed, good idea
|
||
AlleleFiltering alleleFiltering = new AlleleFilteringHC(hcArgs, null, genotypingEngine); | ||
AlleleLikelihoods<GATKRead, Haplotype> filtered_lks = alleleFiltering.filterAlleles(lks, 0, new HashSet<>()); | ||
Assert.assertEquals(filtered_lks.alleles(), lks.alleles()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like this checks that no alleles were filtered, but I'm confused how this tests the change that you added. Since it's taking the minimum PL across samples now (rather than the minimum PL from the first sample) then isn't it only possible to filter more alleles with the new version compared to the old version? Did this test fail with the original code before your change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See the comment below. Yes, this test failed in the old version.
I am actually more conservative filtering the alleles now I think.(allele needs to be weak in both samples to be removed)
@meganshand - I apologize - I may have confused the semantics. |
Ah of course. Sorry about that, I flipped PLs in my head. This looks good, I don't think you need to clarify further! |
@meganshand - addressed your comment, please take a look, thanks! |
This PR fixes an Issue raised by one of our customers that allele filtering did not work correctly when variant calling was done on multiple samples.