-
Notifications
You must be signed in to change notification settings - Fork 592
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add option to invert logic in ExtractOriginalAlignmentRecordsByNameSpark #4944
Conversation
431eeac
to
5e9f222
Compare
Codecov Report
@@ Coverage Diff @@
## master #4944 +/- ##
===============================================
- Coverage 80.784% 80.784% -<.001%
- Complexity 17957 17960 +3
===============================================
Files 1095 1095
Lines 64587 64592 +5
Branches 10392 10392
===============================================
+ Hits 52176 52180 +4
Misses 8388 8388
- Partials 4023 4024 +1
|
00af35b
to
470bdb0
Compare
private String readNameFile; | ||
|
||
@Argument(doc = "revert the list, i.e. filter out reads whose name appear in the give file", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
presumably "revert" should be "invert"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Corrected. Thanks!
@@ -82,14 +87,21 @@ protected void runTool( final JavaSparkContext ctx ) { | |||
|
|||
final Broadcast<Set<String>> namesToLookForBroadcast = ctx.broadcast(parseReadNames()); | |||
|
|||
final JavaRDD<GATKRead> reads = | |||
getUnfilteredReads().filter(read -> namesToLookForBroadcast.getValue().contains(read.getName())).cache(); | |||
final Function<GATKRead, Boolean> predicate = getGatkReadBooleanFunction(namesToLookForBroadcast); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Super-picky, ignore if you like: "predicate" is a confusing name to me in this context, presumably meaning "to assert" in this context. Since it's an object, I might suggest "nameFilter" which indicates what it is. Again, this is just a suggestion, feel free to ignore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a predicate (noun form) meaning a boolean valued function of one argument. https://docs.oracle.com/javase/8/docs/api/java/util/function/Predicate.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I did some googling but missed that definition. I now see it's super-standard, so I retract this comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for clarifying, Chris!
One note to add: returning a Function
makes it serializable, which we can use in Spark.
I'm being lazy 🙈 by NOT implementing a class that implements both Predicate
and Serializable
|
||
final File tempWorkingDir = BaseTest.createTempDir("extractOriginalAlignmentRecordsByNameSparkIntegrationTest"); | ||
|
||
FileUtils.writeLines(new File(tempWorkingDir, "names.txt"), Collections.singleton("asm013903:tig00002")); | ||
|
||
final SAMFileHeader expectedHeader; | ||
final List<SAMRecord> expectedRecords; | ||
List<SAMRecord> expectedRecords; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another nitpick: consider renaming this to final List expectedRecordsNormalMatch (or similar)
and later creating List expectedRecordsInvertedMatch (or similar)
The reason being it took a bit of parsing for me to figure out why there were two blocks here that looked (at first glance) identical.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks fine except for (I suspect) one typo, and a couple places where I had some suggested but optional renaming.
470bdb0
to
3d1b2e7
Compare
3d1b2e7
to
cf7faea
Compare
by turning on flag
-v