-
Notifications
You must be signed in to change notification settings - Fork 592
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Updates to VcfComparator #8973
Updates to VcfComparator #8973
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me. Some minor suggestions.
private boolean ignoreGq0 = false; | ||
|
||
@Argument(fullName = "ignore-multi-allelics", optional = true, doc="Ignore sites where the AC length in the actual matches the actual number of alleles, but doesn't match the expected VC.") | ||
private boolean ignoreSomeMultiAllelics = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit. Should this be ignoreMultiAllelics
to be consistent with the fullName (or the other way around?).
@@ -530,7 +590,7 @@ private void checkAttributes(final Map<String, Object> actual, final Map<String, | |||
throw makeVariantExceptionFromDifference(key, Double.toString(actualPerAlleleValue), Double.toString(expectedPerAlleleValue)); | |||
} | |||
} | |||
if (key.contains("AS_") && key.contains("RankSum")) { | |||
if (key.contains("AS_") && key.contains("RankSum") && !key.contains("RAW")) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think
if (key.contains("AS_") && key.contains("RankSum") && !key.contains("RAW")) { | |
if (key.startsWith("AS_") && key.contains("RankSum") && !key.contains("RAW")) { |
is a little more correct?
@@ -546,11 +606,25 @@ private void checkAttributes(final Map<String, Object> actual, final Map<String, | |||
logger.warn("GATK version-specific NaN versus empty AS_RAW annotation discrepancy"); | |||
} | |||
} | |||
} else if (key.contains("AS_") && key.contains("RAW") && key.contains("RankSum")) { | |||
if (((String) actualList.get(i)).isEmpty() && ((String) expectedList.get(i)).isEmpty()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment about using startsWith
here.
@@ -515,7 +568,14 @@ private void checkAttributes(final Map<String, Object> actual, final Map<String, | |||
} | |||
} | |||
//we've already gotten rid of stars... I think -- do we have to sort the alts? Maybe... I should be using my own AlleleSpecificAnnotationData!!! | |||
final int iterationEnd = ignoreNonRefData && actualAlts.contains(Allele.NON_REF_ALLELE) ? expectedAlts.size()-1 : expectedAlts.size(); //exclusive | |||
int iterationEnd = ignoreNonRefData && actualAlts.contains(Allele.NON_REF_ALLELE) ? expectedAlts.size()-1 : expectedAlts.size(); //exclusive |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could keep the final
here by doing something like:
final iterationEnd = if (key.equals(...) {
3
} else if (key.equals(...) {
5
} else {
...
}
if (key.equals(GATKVCFConstants.RAW_GENOTYPE_COUNT_KEY)) { | ||
iterationEnd = 3; | ||
} | ||
if (key.equals(GATKVCFConstants.RAW_MAPPING_QUALITY_WITH_DEPTH_KEY)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
else if ?
These changes add arguments specific to the update from GATK 4.5.0.0 -> 4.6.0.0 that special case sites that were flagged previously in the WARP tests. Most of the sites that can now be skipped are based on the no call changes that were expected with this update to JointCalling and ReblockGVCFs. There are also some small changes to HaplotypeCaller at low quality sites that are then dropped by ReblockGVCFs.
Additionally there were some expected changes to the Ultima pipelines in HaplotypeCaller and JointCalling which can now be skipped by the VcfComparator tool.
Finally if AD is 0 for non-ref reads (which can happen with DRAGEN input), then AS_QD has jitter added which is now accounted for.