Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Engine read caching. #4902

Open
wants to merge 3 commits into
base: master
Choose a base branch
from
Open

Engine read caching. #4902

wants to merge 3 commits into from

Conversation

cmnbroad
Copy link
Collaborator

Allow tools to opt-in for read caching (enabled in this PR for CNNScoreVariants and FilterAlignmentArtifacts) when issuing forward-only queries such as those used to query a ReadsContext for all reads backing each variant.

The cache matches the (sometimes surprising - see #4901) results that are returned when caching is not used, specifically that an unmapped but placed read that is mated with a mapped read is only returned if the start position overlaps the interval query interval, whereas the mapped read is returned if any part of the read overlaps the query interval.

@codecov-io
Copy link

codecov-io commented Jun 15, 2018

Codecov Report

Merging #4902 (bedd108) into master (c6eb337) will increase coverage by 68.201%.
The diff coverage is 91.690%.

❗ Current head bedd108 differs from pull request most recent head 3db7949. Consider uploading reports for the commit 3db7949 to get more accurate results

@@               Coverage Diff                @@
##              master     #4902        +/-   ##
================================================
+ Coverage     18.644%   86.845%   +68.201%     
- Complexity      4635     32366     +27731     
================================================
  Files           1261      1994       +733     
  Lines          73745    149492     +75747     
  Branches       11768     16506      +4738     
================================================
+ Hits           13749    129826    +116077     
+ Misses         57944     13642     -44302     
- Partials        2052      6024      +3972     
Impacted Files Coverage Δ
...e/hellbender/engine/FeatureDataSourceUnitTest.java 88.318% <ø> (ø)
...institute/hellbender/engine/FeatureDataSource.java 74.603% <57.143%> (+24.603%) ⬆️
...ender/engine/cache/SideReadInputCacheStrategy.java 81.481% <81.481%> (ø)
...adinstitute/hellbender/engine/ReadsDataSource.java 91.667% <84.615%> (+33.333%) ⬆️
...ellbender/engine/VariantWalkerIntegrationTest.java 87.288% <86.667%> (ø)
...engine/cache/DrivingFeatureInputCacheStrategy.java 88.000% <88.000%> (ø)
...ellbender/engine/cache/LocatableCacheUnitTest.java 96.471% <96.471%> (ø)
...gumentcollections/ReadInputArgumentCollection.java 87.500% <100.000%> (+20.833%) ⬆️
...org/broadinstitute/hellbender/engine/GATKTool.java 87.719% <100.000%> (+19.135%) ⬆️
...titute/hellbender/engine/cache/LocatableCache.java 100.000% <100.000%> (ø)
... and 1863 more

@droazen droazen self-assigned this Jun 15, 2018
@droazen droazen self-requested a review June 15, 2018 21:21
Copy link
Contributor

@magicDGS magicDGS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments about exposing an argument to the user for the cache...

@Advanced
@Argument(fullName = "reads-look-ahead-window-size", shortName="reads-look-ahead-window-size",
doc = "Window size for reads look-ahead query caching")
protected int readsLookaheadWindowSize = 100 * 1024;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that this should be exposed to the user, but being similar to the feature-cache: it should be choosen by the walker-class what is the best value.

*
* @return true if read caching should be used, otherwise false
*/
public boolean useReadCaching() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be a getReadsLookaheadWindowSize instead, returning a value with the meaning of disable (0? -1?) and let each walker-class to choose how the cache is handled. A tool can always implement an argument to modify this value.

@cmnbroad
Copy link
Collaborator Author

Rebased this to do some testing with current master.

@droazen droazen modified the milestones: Engine-Q42018, Engine-Q12019 Feb 4, 2019
@cmnbroad cmnbroad force-pushed the cn_cache_read_inputs branch 2 times, most recently from bb9e943 to bedd108 Compare April 19, 2019 00:39
@cmnbroad cmnbroad force-pushed the cn_cache_read_inputs branch 4 times, most recently from 7bbb5d4 to 0cc1c39 Compare February 9, 2022 14:12
***************************************************************************************************/

/********************************************************
* LocatableCache tests using SideReadInputCacheStrategy
Copy link
Collaborator Author

@cmnbroad cmnbroad Feb 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test is really testing SideReadInputCacheStrategy behavior, so maybe it should be moved to a SideReadInputCacheStrategyUnitTest class.

@cmnbroad cmnbroad force-pushed the cn_cache_read_inputs branch 2 times, most recently from 42f8ac9 to 3db7949 Compare February 14, 2022 17:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants