Added CLIs and WDL for python gCNV pipeline. #3925

samuelklee · 2017-12-06T14:51:37Z

No description provided.

codecov-io · 2017-12-06T16:02:11Z

Codecov Report

Merging #3925 into master will increase coverage by 0.016%.
The diff coverage is 88.519%.

@@               Coverage Diff               @@
##              master     #3925       +/-   ##
===============================================
+ Coverage     78.769%   78.786%   +0.016%     
+ Complexity     16501     16490       -11     
===============================================
  Files           1065      1075       +10     
  Lines          58788     59026      +238     
  Branches        9578      9597       +19     
===============================================
+ Hits           46307     46504      +197     
- Misses          8754      8788       +34     
- Partials        3727      3734        +7

Impacted Files	Coverage Δ	Complexity Δ
.../formats/collections/ModeledSegmentCollection.java	`69.767% <ø> (ø)`	`3 <0> (ø)`	⬇️
...ls/copynumber/utils/CombineSegmentBreakpoints.java	`89.796% <ø> (ø)`	`15 <0> (ø)`	⬇️
...formats/collections/HDF5SimpleCountCollection.java	`90% <ø> (ø)`	`14 <0> (ø)`	⬇️
...er/formats/collections/AllelicCountCollection.java	`100% <ø> (ø)`	`5 <0> (ø)`	⬇️
...collections/MultidimensionalSegmentCollection.java	`42.857% <ø> (ø)`	`2 <0> (ø)`	⬇️
...ber/formats/collections/SimpleCountCollection.java	`100% <ø> (ø)`	`11 <0> (ø)`	⬇️
...ormats/collections/CopyRatioSegmentCollection.java	`80% <ø> (ø)`	`4 <0> (ø)`	⬇️
...s/collections/AlleleFractionSegmentCollection.java	`45.455% <ø> (ø)`	`2 <0> (ø)`	⬇️
...der/tools/copynumber/models/CopyRatioModeller.java	`88.525% <ø> (ø)`	`10 <0> (ø)`	⬇️
...ools/copynumber/models/AlleleFractionModeller.java	`92.754% <ø> (ø)`	`17 <0> (ø)`	⬇️
... and 121 more

mbabadi · 2017-12-07T18:07:52Z

@samuelklee awh hehe... will get rid of that column

mbabadi · 2017-12-08T02:55:21Z

@samuelklee gCNV integration tests and WDL tests pass locally. There are a number of issues:

Somatic WDL tests fail on travis, and haven't tried them locally. Those tests were failing before I made changes -- it's unlikely to be related to my changes. Could you please take a look?
Germline WDL tests are extremely slow on travis, perhaps there is no g++ and it is using python fallback? I don't know how to see the cromwell log files on travis. @lbergelson any pro tips?
Python-related integration tests all fail on travis (though not locally w/ a properly setup python environment).
I noticed that the python env in Dockerfile is based on Miniconda2. We strictly require Python 3 for gCNV.

samuelklee · 2017-12-08T03:11:28Z

I’ll take a look at the somatic tests. They should be OK, probably just something related to kebab case changes. EDIT: Or hmm, maybe they weren't passing before. Something to do with annotated-interval validation, I think.

I think the WDL tests should be using the Docker, which has g++. Travis machines might be slower?

Integration tests will need to be in the python test group. Take a look at the python tests.

samuelklee · 2017-12-08T05:51:40Z

OK, I think I figured out what was going on. In 9b194a6 I changed PreprocessIntervals to drop intervals with all Ns (if you remember, this was giving me NaNs in AnnotateIntervals, which gCNV didn't like). But I must not have rebuilt the somatic WGS PoNs and updated the copies in the large test resources. I could've sworn that I tested the somatic pipeline locally, but perhaps I forgot to update the jar at some point.

Ideally, we should figure out some way to use the PoNs built by the panel WDL tests in the subsequent tests for the case/pair workflows.

samuelklee · 2017-12-08T05:59:46Z

@mbabadi I pushed new PoNs, I think that should get somatic tests to pass. Sorry about that, hope you didn't go too far down the rabbit hole trying to debug!

If you need to make changes to the Docker or Travis environments, perhaps coordinate with @cmnbroad to make them directly in #3912, which is still open.

mbabadi · 2017-12-08T06:55:50Z

@samuelklee haha I just figured it out and was about to write to you. Hopefully the tests will pass now. The changes I made to travis yml were experimental -- I am reverting them.

mbabadi · 2017-12-08T06:56:32Z

(except for Miniconda2 -> Miniconda3)

mbabadi · 2017-12-08T07:36:38Z

@samuelklee darn; the wgs pon related tests are still failing and the germline WDL exceeds the time limit (it completes in ~ 10 mins on gsa5, which is already quite slow, and ~ 3 mins on my laptop!). Something strange is going on. I wish there was an easy way to inspect the cromwell logs; perhaps there's a way to set cromwell logging to stdout.

mbabadi · 2017-12-08T07:40:42Z

@samuelklee OK it is not a cromwell issue -- after your commit, the somatic denoising integration tests are failing locally too:
java.lang.IllegalArgumentException: Sample intervals must be identical to the original intervals used to build the panel of normals.

I wouldn't worry about it now. It is most likely related to test resource files.

samuelklee · 2017-12-08T12:07:06Z

@mbabadi I think the somatic tests are fixed: https://travis-ci.org/broadinstitute/gatk/builds/313333659?utm_source=email&utm_medium=notification Did you pull my commit?

Feel free to cut out some of the test cases if the WDL tests are too slow. Especially now that there is no real distinction between WGS/WES besides -L (except for in PreprocessIntervals, which integration tests should cover).

samuelklee · 2017-12-13T18:27:24Z

scripts/cnv_wdl/germline/README.md

+- ``CNVGermlineCohortWorkflow.cohort_entity_id`` -- Name of the cohort.  Will be used as a prefix for output filenames.
+- ``CNVGermlineCohortWorkflow.contig_ploidy_priors`` -- TSV file containing prior probabilities for the ploidy of each contig, with column headers: CONTIG_NAME, PLOIDY_PRIOR_0, PLOIDY_PRIOR_1, ...
+- ``CNVGermlineCohortWorkflow.gatk_docker`` -- GATK Docker image (e.g., ``broadinstitute/gatk:x.beta.x``).


Update this.

samuelklee · 2017-12-15T20:04:00Z

Rebased now that sl_wgs_acnv, sl_wgs_acnv_headers, and sl_delete_legacy_cnv code are all in. Phew!

samuelklee · 2018-01-02T16:23:19Z

Rebased and squashed on top of sl_wgs_acnv_headers_docs. Here is the log of squashed commits, for reference:

commit 3eda4b18888f38249be39f99901d8453a4de50d6
Author: Samuel Lee <[email protected]>
Date:   Thu Dec 28 14:56:27 2017 -0500

    updated command lines for WDL tests for C29

commit 7ce1369943cce4ae9cfb5e96455d18d3960e9b77
Author: Samuel Lee <[email protected]>
Date:   Thu Dec 28 13:30:21 2017 -0500

    use C29 and decrease gcnv_max_training_epochs

commit 68772cba486b44ebc8cf8bfc2b600c1e8a406c61
Author: Mehrtash Babadi <[email protected]>
Date:   Sun Dec 17 19:20:05 2017 +0330

    documentation update of GermlineCNVCaller and DetermineGermlineContigPloidy

commit c032281f8c43a80e4ec8cb96eb66397ad2acf9b7
Author: Samuel Lee <[email protected]>
Date:   Fri Dec 15 18:14:35 2017 -0500

    Fixed imr kebab case in WDL, moved argument classes, removed GenomeLocParser, fixed up gCNV WDL readme.

commit be84a804f6ab6fbb815995db9c116d1db950ab8b
Author: Samuel Lee <[email protected]>
Date:   Tue Dec 12 13:55:08 2017 -0500

    removed extra comma in gCNV Case WDL test JSON

commit cb379b866d425f12f5525ecb28ad0b636a528d44
Author: Samuel Lee <[email protected]>
Date:   Tue Dec 12 12:58:59 2017 -0500

    added missing cpu parameters to gCNV Case WDL tasks

commit eed85f6c70f4a7f15e0765b5f15a1bf8541c151e
Author: Samuel Lee <[email protected]>
Date:   Tue Dec 12 11:26:31 2017 -0500

    disabled some gCNV WDL tests

commit 6d8ca07fef41518b5b157fb9a214d4536c617156
Author: Samuel Lee <[email protected]>
Date:   Tue Dec 12 10:54:54 2017 -0500

    fixed DenoiseReadCountsIntegrationTest files

commit adfbef12f2ab90f93b49a4f786979549648e1f22
Author: Mehrtash Babadi <[email protected]>
Date:   Mon Dec 11 02:22:56 2017 -0500

    removed CNV evaluation code from this branch

commit 18c8d31f39a1964474c5d7b12ee8cbfafc4ac9e2
Author: Mehrtash Babadi <[email protected]>
Date:   Sun Dec 10 00:19:58 2017 -0500

    GS VCF parser outputs dict for samples instead of list

commit b138be39cd8428342668ee6678079021006f983b
Author: Mehrtash Babadi <[email protected]>
Date:   Sun Dec 10 00:15:19 2017 -0500

    renaming

commit eab5c90b74b4eb6bd11acb0fd1e0fa58a3b5b0c7
Author: Mehrtash Babadi <[email protected]>
Date:   Fri Dec 8 16:23:40 2017 -0500

    exposed a global preemptible_attempts to gCNV workflows
    set OMP_NUM_THREADS and MKL_NUM_THREADS to the number of requested CPUs

commit ad6fe348d6a7896c169b2b0499e2a4bca34021ad
Author: Mehrtash Babadi <[email protected]>
Date:   Fri Dec 8 10:21:25 2017 -0500

    reverted log level in germline CNV tests

commit d9eb4e504baab834a9efc07cc3479176db2946ce
Author: Mehrtash Babadi <[email protected]>
Date:   Fri Dec 8 10:20:30 2017 -0500

    the proper python environment yml for mkl and open -- leads to orders of magnitude speedup!

commit fea6bf874e0b62262a3b1d239ce4d76792d5c416
Author: Mehrtash Babadi <[email protected]>
Date:   Fri Dec 8 09:31:43 2017 -0500

    revert

commit 456c53f88d01b603f4175d8896a0dac036af03f8
Author: Mehrtash Babadi <[email protected]>
Date:   Fri Dec 8 08:17:22 2017 -0500

    enabled openmp g++ linking in theano

commit e2afef14ddb957f2dbdea76fd783d3bfb8d7a64e
Author: Mehrtash Babadi <[email protected]>
Date:   Fri Dec 8 08:04:19 2017 -0500

    mkl

commit 43e2a65201286161fcd5bfe7dbb21ae888e19dac
Author: Mehrtash Babadi <[email protected]>
Date:   Fri Dec 8 06:56:20 2017 -0500

    added cpu argument for germline tasks

commit 4433a62c2173c7f29d0f264c084bbaf2f6738782
Author: Mehrtash Babadi <[email protected]>
Date:   Fri Dec 8 02:45:38 2017 -0500

    revert travis yml forks
    verbose logging germline wdl

commit ae05801e33c37b3bf2685fba202032a270804873
Author: Samuel Lee <[email protected]>
Date:   Fri Dec 8 00:55:14 2017 -0500

    updated somatic PoNs for PreprocessIntervals drop Ns

commit cff64984d9fb42364001bda4c73d54cf68d85a5c
Author: Mehrtash Babadi <[email protected]>
Date:   Fri Dec 8 00:37:24 2017 -0500

    sudo travis yml

commit 89025941febd2089d426cfa1e0f0aa6a6712e2a9
Author: Mehrtash Babadi <[email protected]>
Date:   Fri Dec 8 00:23:22 2017 -0500

    travis/Docker config update (g++-6, Miniconda3)
    python test group assignment

commit 31f96398106c2b8577b8c25d110abea3ebe7f836
Author: Mehrtash Babadi <[email protected]>
Date:   Thu Dec 7 20:44:53 2017 -0500

    WDL test bugfix

commit 9b2fb820536ec355bea0256471bd093d547f5c99
Author: Mehrtash Babadi <[email protected]>
Date:   Thu Dec 7 20:20:36 2017 -0500

    update WDL test JSON files

commit e3d97644d1a2c40a5c364a96f8b67246154179c9
Author: Mehrtash Babadi <[email protected]>
Date:   Thu Dec 7 20:18:14 2017 -0500

    assertions in inference task base
    removed a ASCII > 128 character in log messages

commit 526cf92e623a3bbd5f9d375132b6ca046fc47620
Author: Mehrtash Babadi <[email protected]>
Date:   Thu Dec 7 20:03:04 2017 -0500

    redirect tqdm progress bar to python logger

commit 2e45bd30968b921fae225de3901fb97ece690b0c
Author: Mehrtash Babadi <[email protected]>
Date:   Thu Dec 7 19:45:49 2017 -0500

    more arg related fixes

commit bb89a3bb338d88199881e8aca65f656f2acd7c0a
Author: Mehrtash Babadi <[email protected]>
Date:   Thu Dec 7 19:41:20 2017 -0500

    arg related bugfixes in WDL, python, and java CLIs

commit 23569787ee2c8cc6c9227a44170cbbd02fe4427f
Author: Mehrtash Babadi <[email protected]>
Date:   Thu Dec 7 17:21:05 2017 -0500

    fixed issue with python boolean argparse (they use weird semantics)

commit ae841c9ed4cd9b2ca1ac0e9082d175ff8ea98298
Author: Mehrtash Babadi <[email protected]>
Date:   Thu Dec 7 16:44:02 2017 -0500

    shorter gCNV WDL tests

commit 5466b806e36df16cad2d045be074e7f9afec0957
Author: Mehrtash Babadi <[email protected]>
Date:   Thu Dec 7 16:38:15 2017 -0500

    fixed arg issues in somatic WDL
    exposed all missing args to java side
    major update to germline WDLs
    all optional python args exposed to WDLs as optional args

commit 50cb6fd08de15469a9080cbb27ff30c8b7ee7e21
Author: Mehrtash Babadi <[email protected]>
Date:   Thu Dec 7 13:50:45 2017 -0500

    missing serialVersionUID

commit 5f0f31eab63b0e6f6105708ded7f86c96c830781
Author: Mehrtash Babadi <[email protected]>
Date:   Thu Dec 7 13:35:33 2017 -0500

    annotated intervals kebab case
    updated germline WDL workflows

commit 29cc6234dbfb8db12559217a650c6ceb170c5797
Author: Mehrtash Babadi <[email protected]>
Date:   Thu Dec 7 13:15:28 2017 -0500

    cleanup test files

commit 08a35bb4e65eceb735adcd41a91132e9a34d2b66
Author: Mehrtash Babadi <[email protected]>
Date:   Thu Dec 7 02:50:19 2017 -0500

    update WDL scripts

commit 12bcfa192ee6fa6da21239ebf5b513633efe974f
Author: Mehrtash Babadi <[email protected]>
Date:   Thu Dec 7 02:47:33 2017 -0500

    significant updates to GermlineCNVCaller
    integration tests for GermlineCNVCaller w/ sim data in both run modes

commit 151416a4af735ca721bd75e4b54a780c17ac9397
Author: Mehrtash Babadi <[email protected]>
Date:   Thu Dec 7 01:42:05 2017 -0500

    hybrid ADVI abstract argument collection w/ flexible default values
    hybrid ADVI argument collection for contig ploidy model
    hybrid ADVI argument collection for germline denoising and calling model

commit 56e21bf955d3dc0c52aceb384f28cf6173959de0
Author: Mehrtash Babadi <[email protected]>
Date:   Wed Dec 6 23:18:39 2017 -0500

    rewritten python-side coverage metadata table reader using pandas to fix the issues with comment line
    change criterion for cohort/case based on whether a contig-ploidy model is provided or not
    simulated test files for ploidy determination tool
    proper integration test for ploidy determination tool and all edge cases
    updated docs for ploidy determination tool

commit 7fa104b2e9170770cfc5b338835e41215d7fd39c
Author: Mehrtash Babadi <[email protected]>
Date:   Wed Dec 6 18:43:17 2017 -0500

    kabab case for gCNV-related tools
    removed short args (this also partially affected PlotDenoisedCopyRatios and PlotModeledSegments and their integration tests)

commit f02cb024331a986213cfd9fae2da706bbc5ddbd9
Author: Mehrtash Babadi <[email protected]>
Date:   Wed Dec 6 18:02:40 2017 -0500

    synced with mb_gcnv_python_kernel

commit 2963bbf8c90418d9b88545c93771ae51cf542db9
Author: Samuel Lee <[email protected]>
Date:   Wed Dec 6 11:38:05 2017 -0500

    Fixing typo in travis.yml

commit 6cf589999c716ec66404eb0a2ae4310dd130a772
Author: Samuel Lee <[email protected]>
Date:   Wed Dec 6 11:13:58 2017 -0500

    editable, full path

commit d998f2d5c2b33dd41e291be9bfeaea72fe479b8a
Author: Samuel Lee <[email protected]>
Date:   Wed Dec 6 10:56:24 2017 -0500

    revert Dockerfile, change yml

commit 930d7486b7d2cf918fcb16dd03394bb9c9f0611b
Author: Samuel Lee <[email protected]>
Date:   Wed Dec 6 10:34:46 2017 -0500

    more Dockerfile

commit 94112131526b514ef254bcc2c50a239dbae35aa1
Author: Samuel Lee <[email protected]>
Date:   Wed Dec 6 10:25:13 2017 -0500

    more Dockerfile

commit 7d2646240a65f5c0f09f5f25f3e19e9d9bf004d9
Author: Samuel Lee <[email protected]>
Date:   Wed Dec 6 10:06:11 2017 -0500

    more Dockerfile

commit f1235c25aeba85570b5ce389a34975f1b7b5ec3c
Author: Samuel Lee <[email protected]>
Date:   Wed Dec 6 09:39:46 2017 -0500

    Dockerfile edit

commit 3df84dd4693f28e4e8b34fd33f877e99749dffce
Author: Samuel Lee <[email protected]>
Date:   Tue Dec 5 16:08:06 2017 -0500

    Update test PoNs

commit 2c3b20e62a1cba7af24c0b0846eb1629422f51e6
Author: Samuel Lee <[email protected]>
Date:   Tue Dec 5 15:49:38 2017 -0500

    Update test files

commit c65c6e9144ef396792364ab2e06b7b436bb97684
Author: Samuel Lee <[email protected]>
Date:   Tue Dec 5 15:30:59 2017 -0500

    Adding no-GC/do-GC WDL tests.

commit 56451843066a456d9cf8e6eac55ae4df2c518ec3
Author: Samuel Lee <[email protected]>
Date:   Tue Dec 5 12:51:17 2017 -0500

    Updates to handle SAM header changes from sl_wgs_acnv_headers and updates to mb_gcnv_python_kernel.

commit d02d04df684a2820308a1d1c2bfda4b7d1c5f05e
Author: Samuel Lee <[email protected]>
Date:   Mon Nov 13 12:52:33 2017 -0500

    Added CLIs and WDL for python gCNV pipeline.

commit 66ed74b68375d43514ef84658e7a6c771ed9053c
Author: Mehrtash Babadi <[email protected]>
Date:   Wed Nov 15 01:50:03 2017 -0500

    Polished code, ready for review
    
    gCNV computational kernel (initial release)
    
    renaming gammas_s to psi_s to uniformity (sample-specific unexplained variance)
    
    renamed determine_ploidy_and_depth.py to cohort_determine_ploidy_and_depth.py
    finite-temperature forward-backward algorithm
    in the ploidy model, replaced alpha_j (NB over-dispersion) with psi_j (unexplained variance) for uniformity. Also, added the possibility of sample-specific unexplained variance in the germline contig ploidy model
    
    updated I/O routines and CLIs according to team discussion
    
    updated I/O routines and CLIs according to team discussion
    
    changed the output layout of the ploidy determination tool
    refactored parts of io.py
    upped the version to 0.3 as it is not backwards compatible anymore
    
    case ploidy determination tool from a given ploidy model
    major code cleanup and refactoring of I/O module
    refactoring of common CLI script snippets
    
    removed all "targets"
    some code cleanup
    
    pad flat class bitmask w/ a given padding value in the hybrid q_c_expectation_mode
    option to disable annealing and keep the temperature fixed
    
    bugfix in finite-temperature forward-backward
    further refactoring of model I/O
    
    the option to take a previously trained model as starting point in cohort CLI
    the option to take previous calls as a starting point in cohort CLI
    
    option to save and load adamax moments
    
    import/export adamax bias correction tensor
    
    refactoring related to fancy opt I/O
    added average ploidy column to read depth
    updated docs of hybrid inference
    
    modeling intervals can span multiple contigs now; ploidy can change
    across contigs with no issue
    
    save/load adamax state to .npy instead of .tsv for speed
    
    part 1 of doc updates
    
    part 2 of doc updates
    
    part 3 of doc updates
    
    part 4 of doc updates
    
    bumped version to 0.5
    readme
    
    update readme
    
    last minute stylistic doc updates.

samuelklee · 2018-01-02T16:48:09Z

Split into commits to make things easier to review. The first commit, which is from sl_wgs_acnv_headers_docs, can be ignored and should be removed after that branch is merged. This should be ready for review, but the tests may fail if they run too long; we can trim them more after review.

samuelklee · 2018-01-03T14:34:50Z

@sooheelee There is a bit of Javadoc for you to review in the "Added Java CLIs for python gCNV" commit.

sooheelee · 2018-01-03T16:08:52Z

Ok @samuelklee. I will take a look.

samuelklee

About halfway through the python, the following remain:

   244 src/main/python/org/broadinstitute/hellbender/tools/gcnv/gcnvkernel/models/commons.py
    63 src/main/python/org/broadinstitute/hellbender/tools/gcnv/gcnvkernel/models/dists.py
    73 src/main/python/org/broadinstitute/hellbender/tools/gcnv/gcnvkernel/models/fancy_model.py
  1168 src/main/python/org/broadinstitute/hellbender/tools/gcnv/gcnvkernel/models/model_denoising_calling.py
   364 src/main/python/org/broadinstitute/hellbender/tools/gcnv/gcnvkernel/models/model_ploidy.py
   211 src/main/python/org/broadinstitute/hellbender/tools/gcnv/gcnvkernel/models/theano_hmm.py
   757 src/main/python/org/broadinstitute/hellbender/tools/gcnv/gcnvkernel/tasks/inference_task_base.py

samuelklee · 2018-01-03T15:31:55Z

src/main/resources/org/broadinstitute/hellbender/tools/copynumber/case_denoising_calling.py

@@ -0,0 +1,210 @@
+import os


We need a strategy for syncing these scripts from src/main/python/org/broadinstitute/hellbender/tools/gcnv/bin to src/main/resources/org/broadinstitute/hellbender/tools/copynumber. Perhaps just move everything to the latter.

The easiest solution is to let the scripts be treated as Java resources and not distributing them with gcnvkernel. I will come back to this in a bit.

samuelklee · 2018-01-03T15:41:38Z

scripts/gatkcondaenv.yml

  - werkzeug==0.12.2
-
+  - "--editable=/gatk/src/main/python/org/broadinstitute/hellbender/tools/gcnv/"


@cmnbroad Could you take a look at these updates to the environment and let us know if you have any issues? Also, if you have any thoughts about more general python issues (e.g., is src/main/python a good place for the package, or should we put everything in src/main/resources?), please feel free to comment.

@samuelklee The env changes mostly seem fine (@lucidtronix - are you using agparse ?), except for the last line. Rather than the conda .yml be dependent on the repo structure, we should remove that last line, and instead rely on the changes implemented in #3964. It basically bundles up everything under src/main/python/org/broadinstitute/hellbender/ into a zip file which is then pip installed as part of the conda env, along with all of the other dependencies.

The only reason that #3964 currently says "DO NOT MERGE" is because I included a commit with a dummy python package, and a corresponding test, so I could validate that it works on travis. Let me know if you want to cherry-pick the core change, or we could merge that PR as is if you want to rebase on top of it.

So, I think most python code should live under src. You could also have tool-specific code that is kept as a tool resource to make it easy to call from Java, but that would be optional.

OK, that all makes sense. I’m fine with merging that PR and rebasing this one on top of it.

Ok, its merged.

I pushed a copy of this branch that is rebased on master to https://github.com/broadinstitute/gatk/tree/sl_gcnv_ploidy_cli_rebased to make sure we can use the changes by @cmnbroad mentioned above. @mbabadi see the last commit in that branch for an example of how we might reorganize things.

I removed the duplicate copies of the gcnv/bin scripts (since these are already in src/main/resources), moved the gcnvkernel package to the same level as the gatkpython dummy package, and used find_packages() to install both as a single gatkpythonpackages package. I retained a (slightly modified) copy of @mbabadi's original setup.py for gcnvkernel, which one can use to install that package separately should one choose to.

Not sure if this is how we were planning to organize things (and I must admit that I'm not familiar with the subtleties and conventions of python packaging), but it seems to work---we'll see if tests pass in that copy of the branch.

Looks like tests run successfully in that branch, so what I did indeed works. (Unfortunately, the tests don't pass, but that's because they are still running long...)

I'll come back to this in a bit.

quick note @cmnbroad: argparse is a core python3 package now and does not need to be installed separately.

Oh ok, got it.

samuelklee · 2018-01-03T15:54:36Z

src/main/python/org/broadinstitute/hellbender/tools/gcnv/gcnvkernel/structs/interval.py

+        return padded_interval
+
+    def overlaps_with(self, other):
+        assert isinstance(other, Interval), "{0} is not of Interval type".format(other)


Any reason why you perform this check here and not in other methods? If it's not absolutely necessary, I'd be fine with dispensing of checks of this kind.

Removed. That's a main caveat of dynamic typing. No checks and you may get weird errors; and no one forces you to check so the get codes inconsistent...

samuelklee · 2018-01-03T15:55:29Z

src/main/python/org/broadinstitute/hellbender/tools/gcnv/gcnvkernel/structs/interval.py

+        return "GC_CONTENT"
+
+
+class NameAnnotation(IntervalAnnotation):


Do we still need this?

No harm, but I will drop it to make you happy :D

samuelklee · 2018-01-03T16:07:05Z

src/main/python/org/broadinstitute/hellbender/tools/gcnv/gcnvkernel/structs/interval.py

+        else:
+            return np.abs(self.get_midpoint() - other.get_midpoint())
+
+    def __eq__(self, other):


Are these comparison operators used? Does this imply that intervals should be lexicographically ordered by default? If so, let's remove if possible.

I do not assume well-ordered intervals in gcnvkernel (no sort op, etc.). In the evaluation suite, I rely on IntervalTree's own ordering. So I think I can safely drop the comparison ops.

samuelklee · 2018-01-03T18:22:17Z

src/main/python/org/broadinstitute/hellbender/tools/gcnv/setup.py

+
+assert sys.version_info >= (3, 4), "gcnvkernel requires Python 3.4.x or later"
+
+VERSIONFILE="gcnvkernel/_version.py"


Whitespace around =.

Pycharm "Inspect code" also reveals some other PEP8 violations and unused local variables, which should be easy to clean up.

True, I just took a template and edited. Fixed PEP8 violations.

samuelklee · 2018-01-03T18:29:37Z

...n/python/org/broadinstitute/hellbender/tools/gcnv/gcnvkernel/inference/covergence_tracker.py

@@ -0,0 +1,71 @@
+import numpy as np


Typo in the name of this file.

samuelklee · 2018-01-03T18:50:22Z

...esources/org/broadinstitute/hellbender/tools/copynumber/cohort_determine_ploidy_and_depth.py

+                   type=str,
+                   required=True,
+                   default=argparse.SUPPRESS,
+                   help="Output path to write the ploidy model for future single-sample ploidy determination use")


Do you need to specify "single-sample" here, since case mode works on multiple cases?

Remember to keep the copy in src/main/resources synced, if necessary.

samuelklee · 2018-01-03T18:55:44Z

...hon/org/broadinstitute/hellbender/tools/gcnv/gcnvkernel/tasks/task_case_denoising_calling.py

+        return self.copy_number_update_reduced
+
+
+class CopyNumberEmissionSampler(Sampler):


Is this duplicated in the cohort task?

Yep. Removed.

samuelklee · 2018-01-03T19:17:16Z

src/main/python/org/broadinstitute/hellbender/tools/gcnv/bin/case_denoising_calling.py

+gcnvkernel.HybridInferenceParameters.expose_args(parser)
+
+
+def update_args_dict_from_exported_model(input_model_path: str,


A general comment: Use of type hinting is great but somewhat inconsistent. Can you file an issue to go back and clean up?

Will do :) it is hard to be too consistent since using most external modules (even numpy) breaks type inference. One conservative (though, verbose) strategy is to hint everything, as if we are coding in a statically typed language.

sooheelee

Reviewed the subset of changes on the "Added Java CLIs for python gCNV" commit. The two tools in the commit, DetermineGermlineContigPloidy and GermlineCNVCaller, are missing the BetaFeature tag and could provide usage examples that are more informative. For example, what is the rule of thumb for increasing Xmx memory per sample added to the cohort?

sooheelee · 2018-01-03T22:58:58Z

src/main/java/org/broadinstitute/hellbender/tools/copynumber/DetermineGermlineContigPloidy.java

+ *
+ * <p>COHORT run-mode:</p>
+ * <pre>
+ * gatk-launch --javaOptions "-Xmx4g" DetermineGermlineContigPloidy \


--javaOptions "-Xmx4g" --> --javaOptions "-Xmx100g"?

Is this one of those cases where we should specify the Xmx parameter? Does the tool require a lot of compute? Meaning could the tool err without it being specified? If so, it would be great for the Xmx parameter to reflect a larger value. For example, I had to use -Xmx100g for 40 exome samples with GermlineCNVCaller. Can we provide a rule of thumb on what is sufficient memory?

If the tool no longer requires a lot of compute, then we omit Xmx.

sooheelee · 2018-01-03T23:02:28Z

src/main/java/org/broadinstitute/hellbender/tools/copynumber/DetermineGermlineContigPloidy.java

+import htsjdk.samtools.SAMSequenceDictionary;
+import org.broadinstitute.barclay.argparser.Advanced;
+import org.broadinstitute.barclay.argparser.Argument;
+import org.broadinstitute.barclay.argparser.ArgumentCollection;


This appears to be a new tool. Does it need the BetaFeature tag?

import org.broadinstitute.barclay.argparser.BetaFeature;

sooheelee · 2018-01-03T23:06:20Z

src/main/java/org/broadinstitute/hellbender/tools/copynumber/DetermineGermlineContigPloidy.java

+ *      path and must be not provided again.</dd>
+ * </dl>
+ *
+ * <h3>Examples</h3>


Examples --> Usage examples

sooheelee · 2018-01-03T23:07:04Z

src/main/java/org/broadinstitute/hellbender/tools/copynumber/DetermineGermlineContigPloidy.java

+ *
+ * <p>CASE run-mode:</p>
+ * <pre>
+ * gatk-launch --javaOptions "-Xmx4g" DetermineGermlineContigPloidy \


See my comment above on Xmx

sooheelee · 2018-01-03T23:08:58Z

src/main/java/org/broadinstitute/hellbender/tools/copynumber/GermlineCNVCaller.java

+ * than 10000 consecutive intervals spanning at least 10 - 50 mb.</p></dd>
+ * </dl>
+ *
+ * <h3>Examples</h3>


--> Usage examples

sooheelee · 2018-01-03T23:09:47Z

src/main/java/org/broadinstitute/hellbender/tools/copynumber/GermlineCNVCaller.java

+ *
+ * <p>COHORT run-mode:</p>
+ * <pre>
+ * gatk-launch --javaOptions "-Xmx4g" GermlineCNVCaller \


See my comment on Xmx above. It would be great to have some rule of thumb on how much memory is needed for X number of WES/WGS samples.

Tool doc says --run-mode is also a required argument so we should include these in the example commands.

sooheelee · 2018-01-03T23:13:24Z

src/main/java/org/broadinstitute/hellbender/tools/copynumber/GermlineCNVCaller.java

+ * </dl>
+ *
+ * <h3>Important Remarks</h3>
+ * <dl>


It would be great if the usage example commands showcased parameters we expect users to commonly tweak, given certain types of data. Can we that the default parameters are meant for whole-genome sequences? What would a WES command look like?

We're still running internal evaluations to determine the best values of parameters, many of which determine priors for the gCNV model. We'll likely set tweak some of the default values to work well for data generated at the Broad, but as @mbabadi warns elsewhere in the documentation, there is no getting around the user having to do some experimentation to set these for their own data.

More generally, as our tools and models become more complex, we cannot expect that they'll run out of the box with one-size-fits-all parameters on all data. Hopefully we can avoid giving this impression in documentation as well.

Empirical example commands of what we use in production, plus a general description of the data type, would be great to showcase in the usage examples. We can update these down the road when you are all settled with evals.

@sooheelee @samuelklee While I agree with the complexity of the models and the inevitable non-universality of parameters, I am still hopeful that we can find decent defaults for WGS, WGS, and gene panel data. It requires extensive evaluation on Broad and non-Broad data and is beyond the scope of the first release IMO. At least, we will try to ship the first version with non-garbage default parameters :D

sooheelee · 2018-01-03T23:14:52Z

src/main/java/org/broadinstitute/hellbender/tools/copynumber/GermlineCNVCaller.java

+ *   -L intervals.interval_list \
+ *   --model previous_model_path \
+ *   --input normal_1.counts.hdf5 \
+ *   ... \


And --run-mode.

sooheelee · 2018-01-03T23:16:30Z

src/main/java/org/broadinstitute/hellbender/tools/copynumber/GermlineCNVCaller.java

+import org.broadinstitute.barclay.argparser.Advanced;
+import org.broadinstitute.barclay.argparser.Argument;
+import org.broadinstitute.barclay.argparser.ArgumentCollection;
+import org.broadinstitute.barclay.argparser.CommandLineProgramProperties;


Is this tool out of Beta status now?

This is a complete python-based rewrite of the old Spark-based GermlineCNVCaller, so it should still be marked Beta.

sooheelee · 2018-01-03T23:23:35Z

I've put in my two cents for the two tool docs @samuelklee.

samuelklee · 2018-01-03T23:30:14Z

Great, thanks! Also, I think I commented above that both tools will indeed be beta; @mbabadi will add these tags.

samuelklee

OK, done with my review. Great job, @mbabadi! The python code looks very clean, and the WDL looks to be in great shape as well.

I am not sure how far along @MartonKN and @asmirnov239 are with their reviews, but let's try to get this merged ASAP. We need to get sl_preprocess in with some time to spare on Friday (since we will have to make a few last-minute WDL changes on top of that before cutting the prerelease), and this branch needs to go in before that one. Most of my comments here are minor and can be addressed after release if necessary, just be sure to file issues for anything you punt.

samuelklee · 2018-01-04T17:35:32Z

src/main/python/org/broadinstitute/hellbender/tools/gcnv/gcnvkernel/models/dists.py

+import theano.tensor as tt
+
+
+class PositiveFlatTop(PositiveContinuous):


Is this used anywhere?

Not anymore, but it is a useful distribution and I think I will end up using it at some point. I'll keep it here for now.

samuelklee · 2018-01-04T17:36:28Z

src/main/python/org/broadinstitute/hellbender/tools/gcnv/gcnvkernel/models/commons.py

+                              mu > 0, value >= 0, alpha > 0)
+
+
+def negative_binomial_gaussian_approx_logp(mu, alpha, value):


Is this used anywhere? Same for the next two methods.

More generally, I used vulture to find the following list of possibly unused methods. Please use your discretion to clean up:

src/main/python/org/broadinstitute/hellbender/tools/gcnv/gcnvkernel/__init__.py:21: unused import 'IntervalListMask' (90% confidence) src/main/python/org/broadinstitute/hellbender/tools/gcnv/gcnvkernel/config.py:5: unused variable 'log_eps' (60% confidence) src/main/python/org/broadinstitute/hellbender/tools/gcnv/gcnvkernel/inference/deterministic_annealing.py:24: unused function 'apply' (60% confidence) src/main/python/org/broadinstitute/hellbender/tools/gcnv/gcnvkernel/inference/fancy_optimizers.py:78: unused attribute 'epsilon' (60% confidence) src/main/python/org/broadinstitute/hellbender/tools/gcnv/gcnvkernel/inference/param_tracker.py:68: unused function 'clear' (60% confidence) src/main/python/org/broadinstitute/hellbender/tools/gcnv/gcnvkernel/io/io_denoising_calling.py:42: unused function '_export_dict_to_json_file' (60% confidence) src/main/python/org/broadinstitute/hellbender/tools/gcnv/gcnvkernel/io/io_intervals_and_counts.py:156: unused function 'write_interval_list_to_tsv_file' (60% confidence) src/main/python/org/broadinstitute/hellbender/tools/gcnv/gcnvkernel/io/io_metadata.py:17: unused function 'write_sample_coverage_metadata' (60% confidence) src/main/python/org/broadinstitute/hellbender/tools/gcnv/gcnvkernel/models/commons.py:38: unused function 'poisson_logp' (60% confidence) src/main/python/org/broadinstitute/hellbender/tools/gcnv/gcnvkernel/models/commons.py:76: unused function 'negative_binomial_gaussian_approx_logp' (60% confidence) src/main/python/org/broadinstitute/hellbender/tools/gcnv/gcnvkernel/models/commons.py:96: unused function 'negative_binomial_smart_approx_logp' (60% confidence) src/main/python/org/broadinstitute/hellbender/tools/gcnv/gcnvkernel/models/commons.py:111: unused function 'centered_heavy_tail_logp' (60% confidence) src/main/python/org/broadinstitute/hellbender/tools/gcnv/gcnvkernel/models/dists.py:31: unused variable 'point' (100% confidence) src/main/python/org/broadinstitute/hellbender/tools/gcnv/gcnvkernel/models/dists.py:31: unused variable 'repeat' (100% confidence) src/main/python/org/broadinstitute/hellbender/tools/gcnv/gcnvkernel/models/dists.py:40: unused function '_repr_latex_' (60% confidence) src/main/python/org/broadinstitute/hellbender/tools/gcnv/gcnvkernel/models/dists.py:53: unused attribute '_default' (60% confidence) src/main/python/org/broadinstitute/hellbender/tools/gcnv/gcnvkernel/models/dists.py:56: unused variable 'point' (100% confidence) src/main/python/org/broadinstitute/hellbender/tools/gcnv/gcnvkernel/models/dists.py:56: unused variable 'repeat' (100% confidence) src/main/python/org/broadinstitute/hellbender/tools/gcnv/gcnvkernel/models/dists.py:62: unused function '_repr_latex_' (60% confidence) src/main/python/org/broadinstitute/hellbender/tools/gcnv/gcnvkernel/models/model_denoising_calling.py:617: unused function 'get_init_psi_t' (60% confidence) src/main/python/org/broadinstitute/hellbender/tools/gcnv/gcnvkernel/models/model_denoising_calling.py:641: unused function 'get_init_psi_t' (60% confidence) src/main/python/org/broadinstitute/hellbender/tools/gcnv/gcnvkernel/models/model_denoising_calling.py:822: unused attribute 'model_config' (60% confidence) src/main/python/org/broadinstitute/hellbender/tools/gcnv/gcnvkernel/models/model_ploidy.py:52: unused attribute 'unordered_contig_list' (60% confidence) src/main/python/org/broadinstitute/hellbender/tools/gcnv/gcnvkernel/models/theano_hmm.py:185: unused variable 'alpha_updates' (60% confidence) src/main/python/org/broadinstitute/hellbender/tools/gcnv/gcnvkernel/models/theano_hmm.py:197: unused variable 'beta_updates' (60% confidence) src/main/python/org/broadinstitute/hellbender/tools/gcnv/gcnvkernel/preprocess/interval_list_mask.py:13: unused class 'IntervalListMask' (60% confidence) src/main/python/org/broadinstitute/hellbender/tools/gcnv/gcnvkernel/preprocess/interval_list_mask.py:32: unused function 'get_masked_view' (60% confidence) src/main/python/org/broadinstitute/hellbender/tools/gcnv/gcnvkernel/preprocess/interval_list_mask.py:52: unused function 'keep_only_given_contigs' (60% confidence) src/main/python/org/broadinstitute/hellbender/tools/gcnv/gcnvkernel/preprocess/interval_list_mask.py:66: unused function 'drop_blacklisted_intervals' (60% confidence) src/main/python/org/broadinstitute/hellbender/tools/gcnv/gcnvkernel/preprocess/interval_list_mask.py:80: unused function 'drop_cohort_wide_uncovered_intervals' (60% confidence) src/main/python/org/broadinstitute/hellbender/tools/gcnv/gcnvkernel/preprocess/interval_list_mask.py:96: unused function 'drop_intervals_with_anomalous_coverage' (60% confidence) src/main/python/org/broadinstitute/hellbender/tools/gcnv/gcnvkernel/structs/metadata.py:74: unused function 'get_contig_total_count' (60% confidence) src/main/python/org/broadinstitute/hellbender/tools/gcnv/gcnvkernel/structs/metadata.py:78: unused function 'get_total_count' (60% confidence) src/main/python/org/broadinstitute/hellbender/tools/gcnv/gcnvkernel/structs/metadata.py:81: unused function 'generate_sample_coverage_metadata' (60% confidence) src/main/python/org/broadinstitute/hellbender/tools/gcnv/gcnvkernel/structs/metadata.py:128: unused function 'get_contig_ploidy_genotyping_quality' (60% confidence) src/main/python/org/broadinstitute/hellbender/tools/gcnv/gcnvkernel/structs/metadata.py:208: unused function 'get_average_ploidy' (60% confidence) src/main/python/org/broadinstitute/hellbender/tools/gcnv/gcnvkernel/structs/metadata.py:283: unused function 'get_sample_read_depth_array' (60% confidence) src/main/python/org/broadinstitute/hellbender/tools/gcnv/gcnvkernel/structs/metadata.py:287: unused function 'get_sample_contig_ploidy_array' (60% confidence) src/main/python/org/broadinstitute/hellbender/tools/gcnv/gcnvkernel/tasks/inference_task_base.py:105: unused function 'flush' (60% confidence) src/main/python/org/broadinstitute/hellbender/tools/gcnv/gcnvkernel/tasks/inference_task_base.py:315: unused attribute 'latest_caller_update_summary' (60% confidence) src/main/python/org/broadinstitute/hellbender/tools/gcnv/gcnvkernel/tasks/inference_task_base.py:480: unused attribute 'latest_caller_update_summary' (60% confidence) src/main/python/org/broadinstitute/hellbender/tools/gcnv/gcnvkernel/tasks/task_case_denoising_calling.py:52: unused attribute 'copy_number_update_s' (60% confidence) src/main/python/org/broadinstitute/hellbender/tools/gcnv/gcnvkernel/tasks/task_case_denoising_calling.py:53: unused attribute 'copy_number_log_likelihoods_s' (60% confidence) src/main/python/org/broadinstitute/hellbender/tools/gcnv/gcnvkernel/tasks/task_cohort_denoising_calling.py:52: unused attribute 'copy_number_update_s' (60% confidence) src/main/python/org/broadinstitute/hellbender/tools/gcnv/gcnvkernel/tasks/task_cohort_denoising_calling.py:53: unused attribute 'copy_number_log_likelihoods_s' (60% confidence) src/main/python/org/broadinstitute/hellbender/tools/gcnv/gcnvkernel/tasks/task_cohort_denoising_calling.py:55: unused attribute 'class_log_likelihood' (60% confidence) src/main/python/org/broadinstitute/hellbender/tools/gcnv/gcnvkernel/types.py:26: unused variable 'TheanoScalar' (60% confidence) src/main/python/org/broadinstitute/hellbender/tools/gcnv/gcnvkernel/utils/cli_commons.py:24: unused function '_get_default_metavar_for_optional' (60% confidence) src/main/python/org/broadinstitute/hellbender/tools/gcnv/gcnvkernel/utils/cli_commons.py:27: unused function '_get_default_metavar_for_positional' (60% confidence) src/main/python/org/broadinstitute/hellbender/tools/gcnv/gcnvkernel/utils/rls.py:18: unused attribute '_lambda' (60% confidence)

Huh, this was pretty useful, thanks! the confidence levels are interesting -- it is impossible to know for sure whether an apparently unused attribute is truly unused. In fact, it has made a quite a few errors.

I removed poisson_logp but kept negative_binomial_gaussian_approx_logp (will make an issue about it -- there's a potential for 2~3x speedup if we can pull it off with the right hack).

samuelklee · 2018-01-04T18:16:24Z

...main/python/org/broadinstitute/hellbender/tools/gcnv/gcnvkernel/tasks/inference_task_base.py

@@ -0,0 +1,757 @@
+import numpy as np


Filed #4043.

samuelklee · 2018-01-04T18:31:35Z

...main/python/org/broadinstitute/hellbender/tools/gcnv/gcnvkernel/tasks/inference_task_base.py

+        _logger.debug('The {0} for epoch {1} successfully finished in {2:.2f}s'.format(
+            task_name, i_epoch, self._t1 - self._t0))
+
+    def _log_interrupt(self, task_name: str, i_epoch: int):


Just curious, why do you take special precautions for KeyboardInterrupt? Is this a TQDM thing?

It is very useful for interactive mode. I don't think it would have any effect if the scripts are run via PythonScriptExecutor since stdin is not piped (is it @cmnbroad?)

Right. Technically I think you could write to stdin and the underlying process controller would honor it, but the PythonScriptExecutor doesn't rely on it (since you always run a script, module, or command line command). Thats what StreamingPythonScriptExecutor is for.

@cmnbroad Perfect, thank you!

samuelklee · 2018-01-04T18:34:21Z

src/main/python/org/broadinstitute/hellbender/tools/gcnv/gcnvkernel/models/theano_hmm.py

@@ -0,0 +1,211 @@
+import numpy as np


See #4043. Let's also add some tests for the HMM.

Edited your issue.

samuelklee · 2018-01-04T18:36:27Z

src/main/python/org/broadinstitute/hellbender/tools/gcnv/gcnvkernel/models/model_ploidy.py

+        return validated_contig_ploidy_prior_map, num_ploidy_states
+
+    @staticmethod
+    def get_contig_ploidy_prior_map_from_tsv_file(input_path: str,


This seems like it should be in io_ploidy.py.

samuelklee · 2018-01-04T18:40:00Z

src/main/python/org/broadinstitute/hellbender/tools/gcnv/gcnvkernel/models/model_ploidy.py

+        return PloidyModelConfig(**relevant_kwargs)
+
+
+class PloidyWorkspace:


Just a thought: if we split off the inference package, it would be useful to have a simple model as a pedagogical example for developers. Such a model could highlight tricks like the sharing of tensors here.

I completely agree, and I think it is a good idea to do so. Will make an issue.

Oh nevermind, you already made it.

samuelklee · 2018-01-04T18:50:00Z

...python/org/broadinstitute/hellbender/tools/gcnv/gcnvkernel/models/model_denoising_calling.py

+        return fact * np.ones((self.denoising_model_config.max_bias_factors,), dtype=types.floatX)
+
+
+class DenoisingModel(GeneralizedContinuousModel):


Just wanted to say, this looks great!

Thank you!! ;-)

mbabadi · 2018-01-05T16:01:12Z

@samuelklee Thanks for the PR review, I have implemented most changes and will make issues for the remaining momentarily. Waiting for the travis tests to pass.

mbabadi · 2018-01-05T16:02:22Z

@sooheelee Thank you for the docs review! I applied all of the changes. Will coordinate with you regarding the tutorial and memory allocation tips after the release.

samuelklee · 2018-01-05T16:22:33Z

OK, just a few more minor doc comments from me. Also the issue about preemptible_attempts above. Otherwise good to rebase and merge! I will take care of trimming tests and reorganizing the packaging in another PR as we discussed.

samuelklee · 2018-01-05T16:26:11Z

@asmirnov239 @MartonKN Comment now (or hold your peace until after release!) We will merge soon otherwise.

droazen · 2018-01-05T16:56:50Z

@samuelklee Note that --javaOptions is now --java-options

samuelklee · 2018-01-05T17:50:24Z

Thanks for catching that @droazen, but I think we removed those for now.

samuelklee requested review from asmirnov239 and MartonKN December 11, 2017 15:59

samuelklee commented Dec 13, 2017

View reviewed changes

samuelklee force-pushed the sl_gcnv_ploidy_cli branch from 92ae1bc to 3c1ca91 Compare December 15, 2017 20:03

samuelklee force-pushed the sl_gcnv_ploidy_cli branch 7 times, most recently from 63f9f8a to 84d937a Compare January 2, 2018 16:22

samuelklee force-pushed the sl_gcnv_ploidy_cli branch from 84d937a to 6511ca3 Compare January 2, 2018 16:45

samuelklee assigned asmirnov239, MartonKN and samuelklee Jan 2, 2018

samuelklee mentioned this pull request Jan 2, 2018

Changed PreprocessIntervals behavior to not merge intervals that overlap after padding and updated docs. #4027

Merged

samuelklee requested a review from sooheelee January 3, 2018 14:34

samuelklee assigned sooheelee Jan 3, 2018

samuelklee mentioned this pull request Jan 3, 2018

Added documentation and kebab-case updates for ModelSegments pipeline (excluding PreprocessIntervals), along with some WDL updates. #4010

Merged

samuelklee commented Jan 3, 2018

View reviewed changes

samuelklee mentioned this pull request Jan 3, 2018

CNV WDLs don't use the gatk launcher script #3968

Closed

sooheelee suggested changes Jan 3, 2018

View reviewed changes

sooheelee assigned sooheelee and unassigned sooheelee Jan 3, 2018

samuelklee commented Jan 4, 2018

View reviewed changes

mbabadi force-pushed the sl_gcnv_ploidy_cli branch 2 times, most recently from 6c4594a to 3ee825d Compare January 5, 2018 17:47

The new gCNV (gcnvkernel, CLI tools, WDL)

99e5997

mbabadi force-pushed the sl_gcnv_ploidy_cli branch from 3ee825d to 99e5997 Compare January 5, 2018 18:10

samuelklee merged commit ecb6232 into master Jan 5, 2018

samuelklee deleted the sl_gcnv_ploidy_cli branch January 5, 2018 19:21

cmnbroad mentioned this pull request Jan 5, 2018

new Germline CNV wdl test are too slow, causing travis failures #4064

Closed

samuelklee mentioned this pull request Jan 6, 2018

gCNV python computational kernel #3838

Closed

3 tasks

cmnbroad mentioned this pull request Jan 7, 2018

gCNV python env fails on mac #4074

Closed

This was referenced Jan 10, 2018

Write integration test for GermlineCNVCaller for single sample calling #3002

Closed

More flexible integer copy number transition priors in gCNV #2998

Closed

		- werkzeug==0.12.2

		- "--editable=/gatk/src/main/python/org/broadinstitute/hellbender/tools/gcnv/"

		return "GC_CONTENT"


		class NameAnnotation(IntervalAnnotation):


		assert sys.version_info >= (3, 4), "gcnvkernel requires Python 3.4.x or later"

		VERSIONFILE="gcnvkernel/_version.py"

		return self.copy_number_update_reduced


		class CopyNumberEmissionSampler(Sampler):

		gcnvkernel.HybridInferenceParameters.expose_args(parser)


		def update_args_dict_from_exported_model(input_model_path: str,

		import theano.tensor as tt


		class PositiveFlatTop(PositiveContinuous):

		mu > 0, value >= 0, alpha > 0)


		def negative_binomial_gaussian_approx_logp(mu, alpha, value):

		return PloidyModelConfig(**relevant_kwargs)


		class PloidyWorkspace:

		return fact * np.ones((self.denoising_model_config.max_bias_factors,), dtype=types.floatX)


		class DenoisingModel(GeneralizedContinuousModel):

Added CLIs and WDL for python gCNV pipeline. #3925

Added CLIs and WDL for python gCNV pipeline. #3925

Conversation

samuelklee commented Dec 6, 2017

codecov-io commented Dec 6, 2017 • edited Loading

Codecov Report

mbabadi commented Dec 7, 2017

mbabadi commented Dec 8, 2017

samuelklee commented Dec 8, 2017 • edited Loading

samuelklee commented Dec 8, 2017 • edited Loading

samuelklee commented Dec 8, 2017

mbabadi commented Dec 8, 2017

mbabadi commented Dec 8, 2017

mbabadi commented Dec 8, 2017

mbabadi commented Dec 8, 2017

samuelklee commented Dec 8, 2017

Choose a reason for hiding this comment

samuelklee commented Dec 15, 2017

samuelklee commented Jan 2, 2018

samuelklee commented Jan 2, 2018

samuelklee commented Jan 3, 2018

sooheelee commented Jan 3, 2018

samuelklee left a comment • edited Loading

Choose a reason for hiding this comment

samuelklee Jan 3, 2018 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

sooheelee left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

sooheelee Jan 4, 2018 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

sooheelee commented Jan 3, 2018

samuelklee commented Jan 3, 2018

samuelklee left a comment • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

codecov-io commented Dec 6, 2017 •

edited

Loading

samuelklee commented Dec 8, 2017 •

edited

Loading

samuelklee commented Dec 8, 2017 •

edited

Loading

samuelklee left a comment •

edited

Loading

samuelklee Jan 3, 2018 •

edited

Loading

sooheelee Jan 4, 2018 •

edited

Loading

samuelklee left a comment •

edited

Loading

cmnbroad Jan 5, 2018 •

edited

Loading