I ran the full test suite using a branch that throws if a tool ever tries to query the FeatureCache using a query interval that is earlier than, but on the same contig as, the one currently cached. Several tests failed, including a few of the Mutect2/HC ones:
Mutect2IntegrationTest.testContaminationFilter
Mutect2IntegrationTest.testDreamTumorNormal
Mutect2IntegrationTest.testGivenAllelesMode
Mutect2IntegrationTest.testPon
Mutect2IntegrationTest.testTumorOnly
HaplotypeCallerIntegrationTest.testGenotypeGivenAllelesMode
The FeatureCache assumes that queries are always increasing along a contig; the failures in this branch indicate that the caller is attempting to back up and re-query territory that has already been cached and then trimmed.
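To make the failure mode concrete, here is a minimal sketch of a forward-only cache. It is not GATK's `FeatureCache`: the class name `ForwardOnlyCacheSketch`, the `lookahead` and `strict` parameters, and the single-contig `{start, end}` feature representation are all assumptions made for illustration. It only shows how trimming on a forward query turns a later backward query into a miss (or, in strict mode, an exception like the one the diagnostic branch throws).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a forward-only feature cache on one contig.
// This is an illustration only and does not reflect GATK's actual FeatureCache code.
class ForwardOnlyCacheSketch {
    private final int[][] source;  // all features as {start, end}, inclusive coordinates
    private final int lookahead;   // how far past the query end to prefetch on a miss
    private final boolean strict;  // if true, throw on a backward query instead of re-querying
    private List<int[]> cached = new ArrayList<>();
    private int cacheStart = -1;
    private int cacheEnd = -1;
    int misses = 0;                // number of times the backing source had to be re-read

    ForwardOnlyCacheSketch(int[][] source, int lookahead, boolean strict) {
        this.source = source;
        this.lookahead = lookahead;
        this.strict = strict;
    }

    List<int[]> query(int start, int end) {
        boolean populated = cacheEnd >= 0;
        if (populated && start < cacheStart && strict) {
            throw new IllegalStateException(
                "Backward query: start " + start + " precedes cached start " + cacheStart);
        }
        if (!populated || start < cacheStart || end > cacheEnd) {
            // Miss: the interval is not fully covered, so discard and refill from the source.
            misses++;
            refill(start, end + lookahead);
        } else {
            // Hit: drop everything that ends before the new query start.
            trimTo(start);
        }
        List<int[]> hits = new ArrayList<>();
        for (int[] f : cached) {
            if (f[0] <= end && f[1] >= start) {
                hits.add(f);
            }
        }
        return hits;
    }

    private void refill(int start, int end) {
        cached = new ArrayList<>();
        for (int[] f : source) {
            if (f[0] <= end && f[1] >= start) {
                cached.add(f);
            }
        }
        cacheStart = start;
        cacheEnd = end;
    }

    private void trimTo(int start) {
        cached.removeIf(f -> f[1] < start);
        cacheStart = start;
    }
}
```

In this sketch, querying 100-200, then 300-400, then 100-200 again costs two reads of the backing source rather than one, because the second query trims away the territory that the third one needs.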
I didn't track down all of these cases, but the general pattern appears to be that active region determination results in initial caching and trimming, and then the same or similar territory is traversed again during calling, resulting in cache misses. It happens pretty frequently when running M2 tests, at least for pon and germline resource inputs; we should investigate how much a better caching strategy would help performance. If it would, we'd need #4902 at a minimum in order to use an alternate cache strategy.