Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add option to invert logic in ExtractOriginalAlignmentRecordsByNameSpark #4944

Merged
merged 1 commit into from
Jun 29, 2018

Conversation

SHuang-Broad
Copy link
Contributor

by turning on flag -v

@SHuang-Broad SHuang-Broad requested a review from vruano June 26, 2018 18:06
@SHuang-Broad SHuang-Broad force-pushed the sh-invert-filter-extract-sam-records branch from 431eeac to 5e9f222 Compare June 26, 2018 20:07
@codecov-io
Copy link

codecov-io commented Jun 26, 2018

Codecov Report

Merging #4944 into master will decrease coverage by <.001%.
The diff coverage is 85.714%.

@@               Coverage Diff               @@
##              master     #4944       +/-   ##
===============================================
- Coverage     80.784%   80.784%   -<.001%     
- Complexity     17957     17960        +3     
===============================================
  Files           1095      1095               
  Lines          64587     64592        +5     
  Branches       10392     10392               
===============================================
+ Hits           52176     52180        +4     
  Misses          8388      8388               
- Partials        4023      4024        +1
Impacted Files Coverage Δ Complexity Δ
...ls/ExtractOriginalAlignmentRecordsByNameSpark.java 86.364% <85.714%> (-1.872%) 9 <4> (+3)

@SHuang-Broad SHuang-Broad force-pushed the sh-invert-filter-extract-sam-records branch 2 times, most recently from 00af35b to 470bdb0 Compare June 29, 2018 17:12
@SHuang-Broad SHuang-Broad requested review from TedBrookings and removed request for vruano June 29, 2018 17:13
private String readNameFile;

@Argument(doc = "revert the list, i.e. filter out reads whose name appear in the give file",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

presumably "revert" should be "invert"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Corrected. Thanks!

@@ -82,14 +87,21 @@ protected void runTool( final JavaSparkContext ctx ) {

final Broadcast<Set<String>> namesToLookForBroadcast = ctx.broadcast(parseReadNames());

final JavaRDD<GATKRead> reads =
getUnfilteredReads().filter(read -> namesToLookForBroadcast.getValue().contains(read.getName())).cache();
final Function<GATKRead, Boolean> predicate = getGatkReadBooleanFunction(namesToLookForBroadcast);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Super-picky, ignore if you like: "predicate" is a confusing name to me in this context, presumably meaning "to assert" in this context. Since it's an object, I might suggest "nameFilter" which indicates what it is. Again, this is just a suggestion, feel free to ignore.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a predicate (noun form) meaning a boolean valued function of one argument. https://docs.oracle.com/javase/8/docs/api/java/util/function/Predicate.html

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I did some googling but missed that definition. I now see it's super-standard, so I retract this comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for clarifying, Chris!
One note to add: returning a Function makes it serializable, which we can use in Spark.
I'm being lazy 🙈 by NOT implementing a class that implements both Predicate and Serializable


final File tempWorkingDir = BaseTest.createTempDir("extractOriginalAlignmentRecordsByNameSparkIntegrationTest");

FileUtils.writeLines(new File(tempWorkingDir, "names.txt"), Collections.singleton("asm013903:tig00002"));

final SAMFileHeader expectedHeader;
final List<SAMRecord> expectedRecords;
List<SAMRecord> expectedRecords;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another nitpick: consider renaming this to final List expectedRecordsNormalMatch (or similar)
and later creating List expectedRecordsInvertedMatch (or similar)
The reason being it took a bit of parsing for me to figure out why there were two blocks here that looked (at first glance) identical.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Contributor

@TedBrookings TedBrookings left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks fine except for (I suspect) one typo, and a couple places where I had some suggested but optional renaming.

@SHuang-Broad SHuang-Broad force-pushed the sh-invert-filter-extract-sam-records branch from 470bdb0 to 3d1b2e7 Compare June 29, 2018 20:53
@SHuang-Broad SHuang-Broad force-pushed the sh-invert-filter-extract-sam-records branch from 3d1b2e7 to cf7faea Compare June 29, 2018 20:53
@SHuang-Broad SHuang-Broad merged commit 6129359 into master Jun 29, 2018
@SHuang-Broad SHuang-Broad deleted the sh-invert-filter-extract-sam-records branch June 29, 2018 22:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants